Lorenzo01/17/2023, 10:37 AM
and it does this for each job; it starts over again every time. It seems to be checking which models of my project have already been materialized and which ones have never been materialized. How can I avoid this time-consuming behaviour? Thanks in advance! 👀
"dbt ls --output json ... --select model: xyz "
Jonathan Neo01/17/2023, 11:48 AM
dbt_assets = load_assets_from_dbt_project(
    project_dir=DBT_PROJECT_PATH,
    profiles_dir=DBT_PROFILES,
    key_prefix=["jaffle_shop"],
)
Lorenzo01/17/2023, 12:05 PM
TMP_UM_CS_S_CHANNEL_P_AU_NEW_RECORDS_asset = load_assets_from_dbt_project(project_dir=DBT_PROJECT_DIR, select="TMP_UM_CS_S_CHANNEL_P_AU_NEW_RECORDS")
TMP_UM_PO_B_PO_HEADER_NEW_RECORDS_asset = load_assets_from_dbt_project(project_dir=DBT_PROJECT_DIR, select="TMP_UM_PO_B_PO_HEADER_NEW_RECORDS")
TMP_UM_PO_B_PURCHASE_ORDER_DETAIL_NEW_RECORDS_asset = load_assets_from_dbt_project(project_dir=DBT_PROJECT_DIR, select="TMP_UM_PO_B_PURCHASE_ORDER_DETAIL_NEW_RECORDS")
TMP_UM_QU_B_QUESTIONNAIRE_DETAIL_P_AU_NEW_RECORDS_asset = load_assets_from_dbt_project(project_dir=DBT_PROJECT_DIR, select="TMP_UM_QU_B_QUESTIONNAIRE_DETAIL_P_AU_NEW_RECORDS")
TMP_UM_QU_B_QUESTIONNAIRE_HEADER_P_AU_NEW_RECORDS_asset = load_assets_from_dbt_project(project_dir=DBT_PROJECT_DIR, select="TMP_UM_QU_B_QUESTIONNAIRE_HEADER_P_AU_NEW_RECORDS")
TMP_UM_QU_S_QUESTIONNAIRE_ANSWER_P_AU_NEW_RECORDS_asset = load_assets_from_dbt_project(project_dir=DBT_PROJECT_DIR, select="TMP_UM_QU_S_QUESTIONNAIRE_ANSWER_P_AU_NEW_RECORDS")
TMP_UM_SH_B_SHOP_GOLIVES_P_AU_NEW_RECORDS_asset = load_assets_from_dbt_project(project_dir=DBT_PROJECT_DIR, select="TMP_UM_SH_B_SHOP_GOLIVES_P_AU_NEW_RECORDS")
Jonathan Neo01/17/2023, 12:13 PM
I use `load_assets_from_dbt_project` (as in my snippet above) to load all my dbt assets. If you want to specify certain dbt models only, you could do:
load_assets_from_dbt_project(project_dir=DBT_PROJECT_DIR, select="model_1 model_2 model_3 model_4")
Loading each model with a separate call, as you are doing, would trigger multiple `dbt ls` commands, and therefore take a long time to execute.
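Jonathan's suggestion can be sketched as follows (a minimal sketch, not the exact code from this thread; it assumes `dagster-dbt` is installed, and the project path and model names are placeholders):

```python
def load_selected_dbt_assets(project_dir: str, model_names: list[str]):
    """Load several dbt models as Dagster assets with ONE loader call,
    so `dbt ls` runs once instead of once per model."""
    # Imported inside the function so this sketch stays importable
    # even without dagster-dbt installed.
    from dagster_dbt import load_assets_from_dbt_project

    # dbt selection syntax: space-separated selectors are unioned.
    select = " ".join(model_names)
    return load_assets_from_dbt_project(project_dir=project_dir, select=select)
```

A single call like `load_selected_dbt_assets(DBT_PROJECT_DIR, ["model_1", "model_2"])` would replace the seven per-model calls shown earlier, while still letting Dagster materialize any subset of the loaded models individually.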
Lorenzo01/17/2023, 12:18 PM
Jonathan Neo01/17/2023, 12:55 PM
What the reconciliation sensor does is materialize only model_4 once model_3 is fixed. I have an example here in a toy project: https://github.com/jonathanneo/my-dbt-dagster/blob/578ff10b9c1a4478f5e5462e7aa5d3ff2a4e07e7/stargazer/assets_modern_data_stack/my_asset.py#L97-L99
Lorenzo01/17/2023, 1:37 PM
Adam Bloom01/17/2023, 3:16 PM
You could use the `load_assets_from_dbt_manifest` loader instead of the one you're currently using: https://docs.dagster.io/_apidocs/libraries/dagster-dbt#dagster_dbt.load_assets_from_dbt_manifest This requires you to run dbt yourself (i.e. during your user code deployment container build) and then reuses the output for every dbt asset.
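Adam's manifest-based approach might look like this (a sketch under the assumption that `dbt compile` has already been run at build time, so `target/manifest.json` exists; the function name and path are placeholders):

```python
import json
from pathlib import Path


def load_dbt_assets_from_manifest(dbt_project_dir: str):
    """Load dbt assets from a pre-compiled manifest so Dagster never
    shells out to `dbt ls` at code-load time."""
    # Imported inside the function so this sketch stays importable
    # even without dagster-dbt installed.
    from dagster_dbt import load_assets_from_dbt_manifest

    # `dbt compile` writes the manifest to <project>/target/ by default.
    manifest_path = Path(dbt_project_dir) / "target" / "manifest.json"
    manifest = json.loads(manifest_path.read_text())
    return load_assets_from_dbt_manifest(manifest)
```

The key design point is that the expensive dbt invocation happens once, during the container build, and every code load afterwards just reads a JSON file.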
Lorenzo01/18/2023, 1:17 PM
It runs the command below for each and every asset during my run. Keep in mind that I imported every model as a separate asset to be able to restart the DAGs with maximum granularity. It looks a bit strange, because it runs this command for each asset during the import of the code, and then repeats the same thing for each asset (again) when I run a DAG. Thank you! yay
/usr/bin/python3 /home/lorenzo/.local/bin/dbt --no-use-color --log-format json ls --project-dir /home/lorenzo/Documents/GitHub/dagster-dbt-test/dbt_python_assets/dbt_python_assets/../UM_FOX_AU-dbt/dbt --profiles-dir /home/lorenzo/Documents/GitHub/dagster-dbt-test/dbt_python_assets/dbt_python_assets/../UM_FOX_AU-dbt/dbt/config --select TMP_UM_SH_B_SHOP_HIERARCHY_P_AU_UPDATE --output json
Qwame01/18/2023, 5:52 PM
I see the command run for any asset that I materialize, even if it's not a dbt asset.
Adam Bloom01/18/2023, 5:57 PM
- see my comment above
Qwame01/18/2023, 5:59 PM
on each asset materialization.
Adam Bloom01/18/2023, 6:00 PM
It happens when the loader is invoked. You won't see it happening on each startup with `load_assets_from_dbt_manifest`.
Qwame01/18/2023, 6:02 PM
owen01/18/2023, 6:05 PM
A single `load_assets_from_dbt_project` call will allow you to execute any subset of dbt models, so loading each model as a separate call is not recommended and doesn't have a real benefit. Because each loader call runs when your code is loaded, that means that in order to load your repository code, dagster will need to run `dbt ls` once per call (there's no way to load just the subset of the repository that is unrelated to dbt). I'd definitely endorse @Adam Bloom's suggestion of using `load_assets_from_dbt_manifest` for this case.
Lorenzo01/19/2023, 8:50 AM