All dbt assets are running in a single step
# integration-dbt
t
Hi, perhaps this isn’t how the dbt assets are intended to work, or perhaps I am just missing something. Either way, I suspect there is a simple answer. I have loaded all of my dbt assets into Dagster (code snippet below), and I am creating asset groups (and subsequent jobs) based on the dbt `tags`. The attached screenshot is an example of one of the asset groups / jobs that will run. However, when we run the job (either via a scheduled run or a manual run), I see that Dagster is running all of the assets in a single run, which is not what I expected. I was expecting each asset to be materialized in its own run, allowing us to re-run only individual steps in the event of failures. I presume this is a user error. Any guidance would be much appreciated. Running on Dagster 1.4.4.
assets = with_resources(
    load_assets_from_dbt_project(
        project_dir=DBT_PROJECT_PATH,
        profiles_dir=DBT_PROFILES,
        # Use the first dbt tag as the asset group
        node_info_to_group_fn=lambda node: node["config"]["tags"][0]
        if node["config"]["tags"] != []
        else None,
        display_raw_sql=True,
        exclude="tag: integration_tests unit-tests",
    ),
    {
        "dbt": dbt_cli_resource.configured(
            {
                "project_dir": DBT_PROJECT_PATH,
                "profiles_dir": DBT_PROFILES,
            },
        )
    },
)
We then create the job like so:
define_asset_job(name="dbt_tag_here", selection=AssetSelection.groups("dbt_tag_here"))
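For readers following the snippet above, the grouping lambda can also be written as a named function, which is easier to test on its own. This is only a sketch: the name `group_from_first_tag` is hypothetical, and it assumes each dbt node is a dict with an optional `config.tags` entry, as in the lambda above.

```python
from typing import Any, Dict, Optional


def group_from_first_tag(node: Dict[str, Any]) -> Optional[str]:
    """Return the first dbt tag of a node, or None if the node has no tags.

    Assumption: `node` looks like {"config": {"tags": [...]}}. A single
    string tag is also tolerated so that it is not indexed per character.
    """
    tags = node.get("config", {}).get("tags") or []
    if isinstance(tags, str):
        tags = [tags]
    return tags[0] if tags else None


if __name__ == "__main__":
    print(group_from_first_tag({"config": {"tags": ["finance", "daily"]}}))
    print(group_from_first_tag({"config": {"tags": []}}))
```

A function like this could be passed as `node_info_to_group_fn` in place of the lambda.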
r
This is intended. We don’t separate the execution of dbt models into separate steps (e.g. `dbt run --select asset1`, `dbt run --select asset2`, …). As you pointed out, we will run `dbt run --select asset1 asset2`.
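To make the difference between the two execution models concrete, here is a minimal sketch that builds the command lines for each. It is an illustration only, not Dagster's actual implementation; the function names `per_model_commands` and `coalesced_command` are hypothetical.

```python
from typing import List


def per_model_commands(models: List[str]) -> List[List[str]]:
    """One dbt invocation per model (what the question expected)."""
    return [["dbt", "run", "--select", model] for model in models]


def coalesced_command(models: List[str]) -> List[str]:
    """A single dbt invocation covering every selected model."""
    return ["dbt", "run", "--select", *models]


if __name__ == "__main__":
    selected = ["asset1", "asset2"]
    for command in per_model_commands(selected):
        print(" ".join(command))
    print(" ".join(coalesced_command(selected)))
```

With the coalesced form, dbt is started once and resolves the ordering of the selected models itself, which is why the job shows a single step.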
t
Oh ok. Thanks, @rex. That’s a bummer, as I thought that this wasn’t the case (especially based off of the UI graph). Are there any plans to make these individual executable steps when the entire job is run?
IMO, this would be a big leap forward
I guess the idea, though, is that if e.g. 3 of 20 models fail, this shows up in the UI and we can manually select just those assets for re-materialization?
r
I don’t believe we’re planning to make these individual executable steps. Our framework coalesces the work that needs to be done, so that it can all be executed in one step. This is actually more efficient since it saves on process initialization cost, step startup cost, etc. You should be able to just rematerialize failed models from the UI — you don’t need to manually select them. cc @owen if this functionality is hidden somewhere?
t
> This is actually more efficient since it saves on process initialization cost, step startup cost, etc.
That makes sense from that POV.
> You should be able to just rematerialize failed models from the UI; you don’t need to manually select them.
Ah ok. I naively presumed that doing this would re-run the same command (i.e. selecting all 20 assets), not just the 3 failed ones. Great to know. Thanks for taking the time to clarify this for me.
r
Just to clarify: you have to click the dropdown menu next to `Materialize all` in your job view to see what is shown in the screenshot that I posted. In the runs view (your screenshot) I believe you are correct: we re-run the same command from failure. I think your suggestion makes sense here. We could probably add the same button for re-executing failed assets to the runs page, similar to the jobs page. Will double check with the team.
t
Awesome. Thanks again, Rex 🙂
q
I added this feature myself in the event of a failure. The command to execute failed and skipped models is logged and I can copy and execute this myself. However, I agree that it'd be a great addition if we could just retry from failure like with all other asset runs.
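A workaround along these lines might look like the sketch below. It is not the poster's actual code: the helper names are hypothetical, and it assumes the parsed contents of dbt's `run_results.json`, where each entry in `results` carries a `unique_id` such as `model.<project>.<name>` and a `status`. Versioned models, whose unique IDs have an extra suffix, are not handled.

```python
from typing import Any, Dict, List

# Statuses treated as "needs a retry" (an assumption for this sketch).
RETRY_STATUSES = {"error", "fail", "skipped"}


def models_to_retry(run_results: Dict[str, Any]) -> List[str]:
    """Collect names of models that failed or were skipped."""
    names: List[str] = []
    for result in run_results.get("results", []):
        parts = str(result.get("unique_id", "")).split(".", 2)
        # Only consider well-formed model nodes: model.<project>.<name>
        if len(parts) != 3 or parts[0] != "model":
            continue
        if result.get("status") in RETRY_STATUSES:
            names.append(parts[2])
    return names


def retry_command(run_results: Dict[str, Any]) -> List[str]:
    """Build a dbt command for the models to retry; empty if none."""
    names = models_to_retry(run_results)
    return ["dbt", "run", "--select", *names] if names else []


if __name__ == "__main__":
    sample = {
        "results": [
            {"unique_id": "model.proj.a", "status": "success"},
            {"unique_id": "model.proj.b", "status": "error"},
            {"unique_id": "model.proj.c", "status": "skipped"},
        ]
    }
    print(" ".join(retry_command(sample)))
```

In practice the dict would come from `json.load` on the `run_results.json` file in the dbt target directory, and the resulting command could be logged or executed.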
r
Mind filing a feature request for this?
Looks like there’s an existing issue: https://github.com/dagster-io/dagster/issues/12423
t
Cheers. Gave it a thumbs up