I'm using Dagster+DBT, where my DBT models belong ...
# ask-community
r
I'm using Dagster+DBT, where my DBT models belong to different subdirectories. in Dagster land, this translates to each asset having a compound `AssetKey` of `(subdir, model_name)`.
unfortunately I can't seem to reference these DBT models in my Python assets. for example, the following code breaks:
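import pandas as pd
from dagster import asset
from dagster_dbt import load_assets_from_dbt_project

# DBT_PROJECT_DIR is defined elsewhere (path to the dbt project)
dbt_assets = load_assets_from_dbt_project(
    project_dir=DBT_PROJECT_DIR, io_manager_key="bq_io_manager"
)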

# my_report is a DBT model in the `gold` models subdir
# the generated AssetKey is AssetKey(["gold", "my_report"])

@asset(compute_kind="python")
def publish_my_report_gcs(gold__my_report: pd.DataFrame) -> None:
    return
I'm referencing the DBT asset in Python following this method. error below:
ERROR tests/test_defs.py - dagster._core.errors.DagsterInvalidDefinitionError: Input asset '["gold__my_report"]' for asset '["publish_my_report_gcs"]' is not produced by any of the provided asset ops and is not one of the provid...
this error is getting thrown here, but in that method I don't see where `upstream_key` is getting checked against the Python identifier for each upstream asset. I only see the standard `__hash__` and equality checks defined here. am I doing something wrong? thank you!
note: I can fix this with
@asset(
    compute_kind="python",
    ins={"my_report": AssetIn(key=AssetKey(["gold", "my_report"]))},
)
def publish_my_report_gcs(my_report: pd.DataFrame) -> None:
    return
but ideally the implicit behavior would work without `ins`, like it does for assets without compound keys
j
hey @Rob Sicurelli dagster will figure out the corresponding asset if you provide the final piece of the full asset key. for example this works:
from dagster import asset, Definitions

@asset(
    key_prefix=["gold"]
)
def my_report():
    # mocking out the dbt asset
    return 1

@asset
def publish_report_to_gcs(my_report: int):
    return my_report + 1

defs = Definitions(
    assets=[my_report, publish_report_to_gcs]
)
however if you have another asset called `my_report` then dagster won’t be able to figure out which `my_report` to use. in that case you would need to do the explicit `ins`
r
@jamie sorry I'm a bit confused (and new to Dagster). my DBT assets have generated Dagster `AssetKeys` like `AssetKey(["dbt_model_subdir", "dbt_model_name"])`. they're being pulled in to the Python module via `load_assets_from_dbt_project`. how do I reference this asset in the Python asset? simply using `dbt_model_name` throws an exception
to clarify ... how do I reference the full input asset key without using `@asset(ins=...)`?
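For reference, the key shape described in this thread (subdirectory components followed by the model name) can be sketched in plain Python. This is illustrative only: `asset_key_path_for_model` is a hypothetical helper, not part of dagster-dbt, and the `models` root directory is an assumption. The actual key derivation inside `load_assets_from_dbt_project` may differ.

```python
from pathlib import PurePosixPath


def asset_key_path_for_model(model_file: str, models_root: str = "models") -> list[str]:
    """Return [subdir..., model_name] for a dbt model file path (hypothetical helper)."""
    relative = PurePosixPath(model_file).relative_to(models_root)
    # Directory components become the key prefix; the file stem is the model name.
    return [*relative.parent.parts, relative.stem]


print(asset_key_path_for_model("models/gold/my_report.sql"))
```

The point is that the Python argument name alone (`my_report`) matches only the last component of such a path, not the whole key.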
j
huh that should work as far as i know. can you try adding the same key_prefix to your downstream python asset? so like
@asset(
    key_prefix=["gold"]
)
def publish_report_to_gcs(my_report):
    ...
also not sure if you’ve seen this but we have a dbt focused tutorial https://docs.dagster.io/integrations/dbt/using-dbt-with-dagster
r
I ran through the tutorial, I don't think this is covered there. and in this case, the downstream Python asset has a different asset key prefix; it belongs to a different asset group. why can't I use this? seems like the perfect use case