# integration-dbt

Robele Baker

02/20/2024, 2:21 PM
Hey - I need help. I have a fully working set of `@dbt_assets` assets in Dagster. In another package in the same location, I use one dbt model as an upstream dependency of another asset via the `get_asset_key_for_model` function. When I define all my assets, I receive duplicate key violations.
all_assets = [
    *dbt_assets,
    *upstream_dbt_assets,
    *downstream_dbt_assets,
]
When I debug this in Python, it looks like `downstream_dbt_assets` includes all the assets in `dbt_assets`. Does anyone know if this is an issue with my asset selection in `all_assets`, or if there is something wrong with my implementation of `downstream_dbt_assets`? I can provide more colour as needed.
I have a solution that feels... wrong. Just remove `*dbt_assets` from the `all_assets` list. However, it just feels like I'm solving a symptom.

rex

02/20/2024, 2:33 PM
Are you using `load_assets_from_package_module`? There might be an interaction between:
• `get_asset_key_for_model` (which requires a `dbt_assets` definition)
• And `load_assets_from_package_module`, which loads all assets from a package module
If you imported `dbt_assets` in the package module that defines your downstream dbt assets, `load_assets_from_package_module` will pick up both of them.
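A minimal sketch of the interaction described above, in plain Python. It assumes the loader works by scanning a module's namespace for asset objects; `FakeAssetsDefinition` and `scan_module_for_assets` are hypothetical stand-ins, not Dagster APIs, and the module names are made up for illustration.

```python
import types


class FakeAssetsDefinition:
    """Hypothetical stand-in for an assets definition; not a Dagster class."""

    def __init__(self, key):
        self.key = key


def scan_module_for_assets(module):
    """Collect every asset object in the module's namespace,
    whether it was defined there or merely imported."""
    return [v for v in vars(module).values() if isinstance(v, FakeAssetsDefinition)]


# Module that defines the dbt assets.
dbt_module = types.ModuleType("dbt_module")
dbt_module.dbt_project_assets = FakeAssetsDefinition("my_dbt_model")

# Downstream module that imports the dbt assets (to look up a key)
# and defines one asset of its own.
downstream_module = types.ModuleType("downstream_module")
downstream_module.dbt_project_assets = dbt_module.dbt_project_assets  # acts like an import
downstream_module.my_downstream_asset = FakeAssetsDefinition("my_downstream_asset")

# Each module is scanned separately, then the results are concatenated.
dbt_assets = scan_module_for_assets(dbt_module)
downstream_dbt_assets = scan_module_for_assets(downstream_module)
all_assets = [*dbt_assets, *downstream_dbt_assets]

keys = [a.key for a in all_assets]
duplicate_keys = sorted({k for k in keys if keys.count(k) > 1})
print(duplicate_keys)  # ['my_dbt_model']
```

In this simplified model the imported dbt definition is collected twice, once per scanned module, which would match the duplicate key violation in the question.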

Robele Baker

02/20/2024, 2:42 PM
Hey @rex - yes, I am, and that would make sense. What would be the best practices approach? I don't like my workaround at all.

rex

02/20/2024, 2:50 PM
You could define all the keys for your downstream PRs in a separate module that's not a submodule of your `downstream_dbt_assets` module, e.g.:
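# asset_keys.py
from dagster_dbt import get_asset_key_for_model
# dbt_project_assets: your @dbt_assets definition, imported from wherever it is defined
my_downstream_asset_key = get_asset_key_for_model([dbt_project_assets], "my_downstream_asset_key")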
# downstream_dbt_assets.py
from dagster import asset

from ..asset_keys import my_downstream_asset_key

@asset(deps=[my_downstream_asset_key])
def my_downstream_asset(...):
     ...
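As a rough illustration of why this layout could help, here is a plain-Python sketch under the same assumption that the loader scans a module's namespace for asset objects. `FakeAssetsDefinition` and `scan_module_for_assets` are hypothetical stand-ins, not Dagster APIs, and the key is modelled as a plain string.

```python
import types


class FakeAssetsDefinition:
    """Hypothetical stand-in for an assets definition; not a Dagster class."""

    def __init__(self, key, deps=()):
        self.key = key
        self.deps = tuple(deps)


def scan_module_for_assets(module):
    """Collect every asset object found in the module's namespace."""
    return [v for v in vars(module).values() if isinstance(v, FakeAssetsDefinition)]


# asset_keys module: it imports the dbt assets to compute the key,
# but it is never passed to the scanner.
dbt_project_assets = FakeAssetsDefinition("my_dbt_model")
asset_keys = types.ModuleType("asset_keys")
asset_keys.dbt_project_assets = dbt_project_assets
asset_keys.my_downstream_asset_key = dbt_project_assets.key  # a plain key, not an asset

# The downstream module imports only the key, never the dbt assets themselves.
downstream = types.ModuleType("downstream_dbt_assets")
downstream.my_downstream_asset_key = asset_keys.my_downstream_asset_key
downstream.my_downstream_asset = FakeAssetsDefinition(
    "my_downstream_asset", deps=[downstream.my_downstream_asset_key]
)

found_keys = [a.key for a in scan_module_for_assets(downstream)]
print(found_keys)  # ['my_downstream_asset']
```

The scan of the downstream module now returns only its own asset, because the only thing it imports is a key. This holds only if the keys module itself stays outside whatever package is being scanned.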

Robele Baker

02/20/2024, 3:43 PM
Hmm, no luck - my upstream asset is being used as a parameter in the next job. I ran into the same problem with this implementation:
from ..asset_keys import my_downstream_asset_key

@asset(
    ins={
        "useful_dbt_output": AssetIn(
            input_manager_key="bigquery_pandas_io_manager",
            key=my_downstream_asset_key,
        )
    },