Martin O'Leary
08/10/2022, 8:26 AM
`dagster-dbt` use case.
Few more questions:
1. I've split up my job by making multiple calls to `load_assets_from_dbt_project` and using the `select` parameter to select models upstream and downstream of a Python asset. Is there a way to give the resulting ops more meaningful names than `run_dbt_<project_name>_<hash>`?
2. I have variables that I use in my dbt project. Is there a pattern I can use to require the user to enter these as config when running a job via dagit? I know I can pass them to `dbt_cli_resource.configured`, but I want the user to be required to enter these, or to see defaults, like with an op. I would also like to not have to specify the project and profile parameters this way. I want those to either be defaults that nobody needs to touch or be added via non-runtime config.
owen
08/10/2022, 5:01 PM
owen
08/10/2022, 5:07 PM
`.configured`, you can pass in a config function, which basically defines a new config schema (in your case, it seems like you'd want that schema to be `{"vars": Permissive()}` or something like that). Then, you can hard-code stuff that you wouldn't want to change (such as project dir / profiles dir)
owen
08/10/2022, 5:07 PM
Martin O'Leary
08/16/2022, 10:10 AM
`load_assets_from_dbt_project` and used the `select` arg to specify which models should be included in that group of assets. As you suggested, I let dagster/dbt do the thinking for me and replaced it with one call to `load_assets_from_dbt_project` and `select="*"`.
The issue I now have is that one operation is disconnected from the rest and seems to run in parallel, even though other models have its assets as dependencies. Here I've shown the operation graph, the flat chart of the execution, and the asset graph that highlights the issue. The asset graph highlights the 3 seeds and 2 models which have downstream dependencies and are part of the single (disconnected) operation in the other 2 graphs. 🤷‍♂️
Martin O'Leary
08/16/2022, 10:35 AM
Martin O'Leary
08/16/2022, 10:35 AM
Martin O'Leary
08/17/2022, 10:19 AM
`load_assets_from_dbt_project` the dependency seems to hold (operations execute in the right order)
owen
08/17/2022, 5:55 PM
owen
08/17/2022, 5:56 PM
owen
08/17/2022, 5:57 PM
owen
08/17/2022, 5:58 PM
owen
08/17/2022, 5:59 PM
Martin O'Leary
08/17/2022, 8:26 PM
`...dbt_2` finishes before the `...dbt` (which depends on it).
As they execute "live", though, it always shows `...dbt_2` finishing last 🤷‍♂️
It could be a dagit-only thing, which is fine, because the asset graph shows the correct connections and the dependencies are all connected.
owen
08/17/2022, 8:28 PM
`_2` step?
owen
08/17/2022, 8:28 PM
Martin O'Leary
08/17/2022, 8:32 PM
`...2` step are a bunch of seed files and single downstream models which operate on them before they are fed into the models in the `...dbt` job
Martin O'Leary
08/17/2022, 8:35 PM
`load_assets_from_dbt_project`, the 3 separate dbt operations show the same inputs (9 sources) and the same outputs (31 models) in the information on the right of the screen when I click on them
owen
08/17/2022, 8:35 PM
`run_dbt_mca_backtesting_dbt` step (the `-o-` icon in the middle of the asset), rather than the `..._2` step. This could be a dagit issue, but I want to double-check.
owen
08/17/2022, 8:36 PM
owen
08/17/2022, 8:38 PM
Martin O'Leary
08/17/2022, 8:41 PM
`...2` operation are these (that's changed slightly because I was missing the Python-computed asset in the previous screenshot, as I'm changing my setup on the fly here 🙂) and they are all materialized as part of the `...dbt` operation
Martin O'Leary
08/17/2022, 8:43 PM
owen
08/17/2022, 8:44 PM
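
For reference, the `.configured` pattern that owen outlines earlier in the thread (pass in a config function, expose only `vars`, hard-code the project and profiles directories) could be sketched roughly as follows. This is a minimal sketch under assumptions, not a confirmed setup: the directory paths and the names `dbt_resource_config` and `configured_dbt` are hypothetical, and it assumes a `dagster-dbt` release that still ships `dbt_cli_resource` with `project_dir`, `profiles_dir` and `vars` config keys. The Dagster imports are guarded so the mapping function can be checked on its own.

```python
from typing import Any, Dict

# Hypothetical locations; replace with the real project layout.
PROJECT_DIR = "path/to/dbt_project"
PROFILES_DIR = "path/to/dbt_profiles"


def dbt_resource_config(user_config: Dict[str, Any]) -> Dict[str, Any]:
    """Map the small user-facing config onto the full resource config.

    Only `vars` comes from the person launching the run; the project and
    profiles directories are fixed here so nobody has to enter them.
    """
    return {
        "project_dir": PROJECT_DIR,
        "profiles_dir": PROFILES_DIR,
        "vars": dict(user_config["vars"]),
    }


try:
    from dagster import Permissive
    from dagster_dbt import dbt_cli_resource
except ImportError:
    # Dagster (or this resource) is not available in this environment.
    configured_dbt = None
else:
    # The second argument is the new, narrower schema shown to the user.
    configured_dbt = dbt_cli_resource.configured(
        dbt_resource_config,
        config_schema={"vars": Permissive()},
    )
```

`Permissive()` accepts any keys, so on its own it probably does not force the user to enter anything. Listing the variables explicitly in the schema instead (for example `{"vars": {"run_date": str}}`, where `run_date` is a made-up variable name) should make them required fields in the launchpad, which seems closer to the behaviour asked about in question 2.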