Marko Obert
03/07/2024, 12:08 PM
Scout
03/07/2024, 12:08 PM
```python
from dagster_duckdb import DuckDBResource

# Configure the DuckDB resource
duckdb_resource = DuckDBResource(
    database="path/to/my_duckdb_database.duckdb",  # required
)
```
Then, you would define an op that uses this resource:
```python
from dagster import op


@op(required_resource_keys={"duckdb"})
def my_duckdb_op(context):
    # Get a connection from the DuckDB resource
    with context.resources.duckdb.get_connection() as conn:
        # conn is a duckdb.DuckDBPyConnection
        # You can now use this connection to execute SQL queries
        conn.execute("YOUR SQL QUERY HERE")
```
Finally, you would include the resource in your repository definition:
```python
from dagster import job, repository


@job(resource_defs={"duckdb": duckdb_resource})
def my_job():
    my_duckdb_op()


@repository
def my_repository():
    return [my_job]
```
In this setup, the `my_duckdb_op` op requires the `duckdb` resource, which is provided to the job through the `resource_defs` argument. When the op is executed, it uses the DuckDB connection to perform database operations.

Please note that while this example shows how to use a DuckDB resource within an op, we encourage the use of assets when working with data in Dagster, as they provide a more declarative and modular approach to data orchestration.