# integration-dbt
t
Hi! I am doing a POC for a data warehouse consisting of Python scripts for data ingestion, dbt for transformations and Dagster for orchestration. I have imported my dbt models into Dagster using the template code:
# Import all dbt models
@dbt_assets(manifest=dbt_manifest_path)
def All_dbt_assets(context: AssetExecutionContext, dbt: DbtCliResource):
    yield from dbt.cli(["build"], context=context).stream()
I would like to create a job with "define_asset_job" that takes my Python scripts (decorated with @asset) as well as a small selection of my dbt models. How do I achieve this? Basically I want my dbt model, and its source dependencies, to run downstream from my Python scripts. Is my setup wrong or am I missing the obvious? :)
s
Hi Thomas, you've got a couple of options:
• you can explicitly specify which assets you want to materialise within the job
• you can specify a particular asset, then indicate whether you want its upstream/downstream dependencies to run as well (see here - https://docs.dagster.io/concepts/assets/asset-selection-syntax#selecting-downstream-dependencies)
In order to do that, you need to know which asset keys you're interested in. Presumably your code is running and you can see the assets in the UI. In the example in my screenshot, the asset key is behind the orange boxes; let's say it's:
• keypart1 / keypart2
You can specify this in an asset job by doing
from dagster import define_asset_job

job_def = define_asset_job("myjobname",
                           selection=["keypart1/keypart2"])
That'll create a job with that asset selected. You can then tweak that selection as appropriate.
t
Thanks Steven, this is basically what I needed. I'm still having some issues with my asset jobs, though. I created an asset job for a set of dbt models, but I would like to specify an upstream dependency from my dbt source to my Python script (which loads data and therefore should be materialized before my dbt models).
s
Try this: https://docs.dagster.io/integrations/dbt/reference#upstream-dependencies
Now that you know your asset keys, you can specify the asset key in the dbt sources.yml.