Hello, is it possible to use multiple dbt projects...
# ask-community
t
Hello, is it possible to use multiple dbt projects in one dagster repository? I can call load_assets_from_dbt_project multiple times to create the assets, but how do I configure my dbt resource to use a different project_dir path for each execution of the dbt op?
a
since project_dir is part of the resource configuration, you'd probably just want to create multiple dbt resources and pass different resources to different assets
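A minimal sketch of the "one resource config per project" idea, written in plain Python so it runs without dagster installed. The project names, the paths, and the helper build_dbt_resource_configs are hypothetical; each resulting dict has the shape that is passed to dbt_cli_resource.configured(...) later in this thread.

```python
def build_dbt_resource_configs(project_dirs, profiles_dir):
    """Return one dbt resource config dict per project name."""
    return {
        name: {"project_dir": project_dir, "profiles_dir": profiles_dir}
        for name, project_dir in project_dirs.items()
    }


# Hypothetical layout: two dbt projects that share one profiles directory.
configs = build_dbt_resource_configs(
    {"project_a": "dbt/project_a", "project_b": "dbt/project_b"},
    profiles_dir="dbt/profiles",
)

# Each entry could then be turned into its own resource, for example:
#   dbt_a = dbt_cli_resource.configured(configs["project_a"])
#   dbt_b = dbt_cli_resource.configured(configs["project_b"])
```

This only covers building the two configurations; whether both resources can be attached within one repository is what the rest of the thread works out.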
t
Hello Adam, what would be the trick to pass a different resource? I don't see any parameter to change the resource name on load_assets_from_dbt_project.
a
ah, I haven't migrated to assets from ops/jobs yet - but it looks even easier! profiles_dir is an optional argument to load_assets_from_dbt_project, so you can just pass it directly! https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-dbt/dagster_dbt/asset_defs.py#L377-L379
t
Yes indeed, it's used to invoke dbt ls and generate the manifest. But once the assets are generated, dagster creates the dbt op in the background. That op expects a dbt resource that I can't change (see https://github.com/dagster-io/dagster/blob/9c481e193a1eac31b79cf4b8f8d3c38eac32ef3[…]/python_modules/libraries/dagster-dbt/dagster_dbt/asset_defs.py).
There will be two generated ops (one for each dbt project), but both expect the same resource. I don't see a way to change the parameters that are passed to the resource.
a
how are you passing the resource to the op? at least in a pre-assets world, you'd do something like this:
@job(
    resource_defs={
        "dbt": dbt_configured,
    }
)
and that's where you could have dbt_configured_a and dbt_configured_b, with both being referred to as dbt within the op context
sorry, not super familiar with assets yet myself, but hopefully that's a useful direction to look
t
Yes, this is where I am stuck. I have:
# Configure DBT
dbt_resource_config = dbt_cli_resource.configured(
    {
        "project_dir": DBT_PROJECT_A_DIR,
        "profiles_dir": DBT_PROFILES_DIR,
    }
)
resource_defs = {
    "dbt": dbt_resource_config,
}
@repository
def mds_repo():
    return with_resources(
        load_assets_from_current_module(),
        resource_defs=resource_defs,
    ) + [define_asset_job("all")]
But it won't work for one of the generated ops.
a
I'll defer to someone who knows asset jobs better. alternatively, you could split this into multiple repositories
t
Thanks for the help Adam. Splitting into multiple repositories would be my next step. My concern is that I won't be able to create a job that spans across the two projects (one job to generate all the assets).
c
Hey Timothee - so bad news is that we don't currently have a way to remap the same resource key to use two different resources within the same job. We have an issue to track that here: https://github.com/dagster-io/dagster/issues/2112. Unfortunately, we also currently have the restriction of needing to use all the same resources for a given key in a repo, so you would be forced to use multiple repositories (which does mean a different job for each repo). There's a workaround you can use to get them to run together though; you can have the downstream dbt project trigger when the upstream project runs via an asset sensor: https://docs.dagster.io/concepts/partitions-schedules-sensors/sensors#asset-sensors - and use a SourceAsset in the downstream dbt job to represent the dependency to the upstream dbt job. Does that make sense?
t
Hello @chris, yes it makes sense, I already started to move the second dbt project to another repository. I'll have a look at the asset sensor to synchronise the two jobs. Thanks for the input!