#integration-dbt

geoHeil

04/06/2023, 12:24 PM
How can I change the asset key that dagster-dbt creates? I know there is the key prefix, and it is applied nicely (to both the sources and the models of the dbt project). However, for the sources a middle key component is created as well: warehouse (prefix) / middle key (sources only) / table name. How can I force such a middle key (the schema) for the models as well?
For the sources: source_key_prefix
For the models: in principle I could hard-code the dbt target schema here, but I would prefer to retrieve it from dagster somehow rather than specifying it twice. Is there an option for this?
I would rather not use an env var; I would prefer to have dagster extract it from dbt.
The use case is: this target schema might change between normal and branch deployments.
Currently, I am using:
dbt_target_schema = os.environ.get("<<env>>", "<<default>>")

Qwame

04/06/2023, 1:00 PM
I believe you can use the node_info_to_asset_key argument to customize the asset names when loading dbt assets.

Thomas Weit

04/06/2023, 1:04 PM
As Qwame already mentioned, the node_info_to_asset_key parameter is what you are looking for. You can apply a custom function to it, like this:
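from typing import Any, Mapping

from dagster import AssetKey
from dagster_dbt import load_assets_from_dbt_project

# DBT_PROJECT_DIR, DBT_PROFILES_DIR, GOOGLE_CLOUD_PROJECT and DBT_DATASET
# are project-specific constants defined elsewhere.


def get_dbt_source_database_name(node_info: Mapping[str, Any]) -> AssetKey:
    if node_info["resource_type"] == "source":
        # Sources: database / "BigQuery" / source name / table name
        components = [node_info["database"], "BigQuery", node_info["source_name"], node_info["name"]]
    else:
        # Models: use the schema configured in dbt as a middle key, if there is one
        configured_schema = node_info["config"].get("schema")
        if configured_schema is not None:
            components = [configured_schema, node_info["name"]]
        else:
            components = [node_info["name"]]

    return AssetKey(components)


dbt_assets = load_assets_from_dbt_project(
    DBT_PROJECT_DIR,
    DBT_PROFILES_DIR,
    key_prefix=[GOOGLE_CLOUD_PROJECT, "BigQuery", DBT_DATASET],
    node_info_to_asset_key=get_dbt_source_database_name,
)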
In this process, I modify the asset keys for sources by adding node_info["database"] from the sources.yml file, as well as the string BigQuery, to the prefix.
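For the original question (a schema middle key for models without specifying the schema twice), one possible variation is sketched below. It assumes that node_info mirrors the node entries of dbt's manifest.json and that its "schema" field holds the schema dbt resolved for the active target; that assumption should be checked against your own manifest. The function name model_key_components is hypothetical, and it returns a plain list so that it can be tried without dagster installed.

```python
from typing import Any, List, Mapping


def model_key_components(node_info: Mapping[str, Any]) -> List[str]:
    """Build [schema, name] for a dbt model, or [name] if no schema is known."""
    # Assumption: "schema" is the schema dbt resolved for the current target.
    # Fall back to the schema set in the model config if it is missing or empty.
    config = node_info.get("config") or {}
    schema = node_info.get("schema") or config.get("schema")
    if schema:
        return [schema, node_info["name"]]
    return [node_info["name"]]
```

Inside a node_info_to_asset_key function, the returned list would be wrapped as AssetKey(model_key_components(node_info)) in the model branch.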

geoHeil

04/06/2023, 1:24 PM
Interesting. I am connecting to Oracle, but the database seems to be NULL / None / NoneType everywhere in manifest.json: Member of sequence mismatches type. Expected <class 'str'>. Got None of type <class 'NoneType'>.
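The error suggests that a None value ended up in the list of key components. A minimal sketch of one way around it, assuming it is acceptable to simply leave out components that are missing: filter them before building the key. The helper names non_null_components and source_key_components are hypothetical.

```python
from typing import Any, List, Mapping, Optional


def non_null_components(*parts: Optional[str]) -> List[str]:
    """Keep only the components that are non-empty strings."""
    return [part for part in parts if part]


def source_key_components(node_info: Mapping[str, Any]) -> List[str]:
    """Build the key components for a dbt source, skipping a missing database."""
    return non_null_components(
        node_info.get("database"),  # may be None, as seen with Oracle here
        node_info["source_name"],
        node_info["name"],
    )
```

The result can then be passed to AssetKey(...) in the source branch of the function given to node_info_to_asset_key.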