# integration-dbt
d
Hey guys, does `load_assets_from_dbt_manifest` already separate my dbt models with custom schemas as separate assets?
t
It'll depend on what stage the manifest was built at, but if the manifest already has all the namespace stuff like schemas and DB figured out, then yes. Are you experiencing something different?
d
Ok. I have the following situation:
1. I want to load my manifest file only once, with `load_assets_from_dbt_manifest`.
2. I want to create jobs using `define_asset_job`, where I can apply select and exclude statements to the assets loaded from the dbt manifest.
I was trying to do this via `AssetSelection.groups` and `.keys`, but got different results compared to creating a specific asset using `load_assets_from_dbt_manifest`, where I could specify select and exclude statements.
I ended up having to declare multiple `load_assets_from_dbt_manifest` calls, each one with different select and exclude statements, and then loading them into the repo definition. I'm not sure this is the best way to do this.
t
Aaaah, okay thanks for the context! You should only have to load it once. There are a couple of resources to help you define jobs as you'd like. Have you seen the arguments available on the dbt loading functions? You can define the asset key explicitly if you want with the `node_info_to_asset_key` parameter on the `load_assets_from_dbt_*` function. If you'd like to organize the groups, there is also `node_info_to_group_fn`: you map your models to the exact execution groups, and your asset selection will just be that entire group. And finally, have you seen the docs on the asset selection syntax? If not, would this be helpful?
d
Thanks Tim! Do you have examples on how to use the `node_info_to_asset_key` and `node_info_to_group_fn` arguments?
t
Not off the cuff (but I should work on that). I think all of these mapping fn arguments have similar call signatures, so here's an example of one being used to map metadata out of a node: https://github.com/dagster-io/hooli-data-eng-pipelines/blob/4c6a744d51a9510dc9adabcfbfe99f6d066098f4/hooli_data_eng/assets/dbt_assets.py#L13 In this case, `node_info_to_asset_key` returns an asset key, which can be built like these examples. And `node_info_to_group_fn` just expects a string back, and it'll create the group or add the asset to it.
Ignore how that hooli example has so many `load_assets_from_dbt_project` calls that select specific models. 😅 We'd like to avoid that pattern and we're working on the ergonomics to prevent it, but I think in your case, you shouldn't need to.
d
Alright! And when designing these mapping fn arguments, how should I define the keys and values? Would it be something like "dbt_custom_schema": "group"/"asset key" or "dbt_model_tag": "group"/"asset key"?
I'm just not sure I understood properly what keys Dagster is going to look for in my mapping fn.
And also, Tim, is there a difference between `AssetSelection.groups("my_group").upstream()` and `AssetSelection.assets(*my_asset)`, assuming `my_group` and `my_asset` represent the same thing?
t
That'll depend on how you're mapping your models to decide the custom schema. The first one you mention would work, but you might be able to piggyback off of existing metadata from dbt. In this context, Dagster doesn't care what the keys are; they should just be keys that work and make sense for you and that you can select from. To be clear, you'll likely only need to do one or the other: define an explicit, easily selectable asset key, or categorize your models into groups.
The selections would be different if you had more than one unrelated asset in the group.
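To make that difference concrete, here is a toy model in plain Python. It is not Dagster's implementation, and the graph, group names, and helper functions are all invented for illustration. It only mimics the idea that selecting a group and walking upstream can pull in more assets than listing one asset explicitly.

```python
from typing import Dict, List, Set

# Hypothetical asset graph: asset name -> its direct upstream assets.
UPSTREAM: Dict[str, List[str]] = {
    "raw_orders": [],
    "stg_orders": ["raw_orders"],
    "orders": ["stg_orders"],
    "raw_users": [],
    "users": ["raw_users"],
}

# Hypothetical group assignment: "orders" and "users" share a group
# even though they are unrelated in the graph.
GROUPS: Dict[str, str] = {
    "raw_orders": "staging",
    "stg_orders": "staging",
    "raw_users": "staging",
    "orders": "marts",
    "users": "marts",
}


def select_group(group: str) -> Set[str]:
    """Return every asset assigned to the given group."""
    return {asset for asset, g in GROUPS.items() if g == group}


def with_upstream(selected: Set[str]) -> Set[str]:
    """Return the selection plus everything upstream of it (transitively)."""
    result = set(selected)
    stack = list(selected)
    while stack:
        for parent in UPSTREAM[stack.pop()]:
            if parent not in result:
                result.add(parent)
                stack.append(parent)
    return result


group_selection = with_upstream(select_group("marts"))
explicit_selection = {"orders"}
print(sorted(group_selection))
print(sorted(explicit_selection))
```

In this toy graph the group-based selection also includes `users` and its upstream `raw_users`, which the explicit selection of `orders` does not.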
d
Got it. I would probably have to load the manifest's metadata, create key-value pairs based on the criteria I choose, and use that as a mapping fn. Is that it?
t
yesssiir!
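A rough sketch of that approach, assuming the manifest has the usual top-level `nodes` mapping and that each node carries `unique_id`, `resource_type`, `schema`, and `tags`. The inline manifest, the "first tag wins" rule, and the names `build_group_lookup` and `node_info_to_group` are all illustrative choices, not something Dagster requires.

```python
from typing import Any, Dict, Mapping

# Tiny stand-in for a real manifest. In practice you would read the file dbt
# writes (by default target/manifest.json) with json.load.
MANIFEST: Dict[str, Any] = {
    "nodes": {
        "model.demo.orders": {
            "unique_id": "model.demo.orders",
            "resource_type": "model",
            "name": "orders",
            "schema": "marts",
            "tags": ["daily"],
        },
        "model.demo.users": {
            "unique_id": "model.demo.users",
            "resource_type": "model",
            "name": "users",
            "schema": "marts",
            "tags": [],
        },
        "test.demo.not_null_orders_id": {
            "unique_id": "test.demo.not_null_orders_id",
            "resource_type": "test",
            "name": "not_null_orders_id",
            "schema": "marts",
            "tags": [],
        },
    }
}


def build_group_lookup(manifest: Mapping[str, Any]) -> Dict[str, str]:
    """Map each model's unique_id to a group: its first tag, else its schema."""
    lookup: Dict[str, str] = {}
    for unique_id, node in manifest["nodes"].items():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, and other non-model nodes
        tags = node.get("tags") or []
        lookup[unique_id] = tags[0] if tags else node.get("schema", "default")
    return lookup


GROUP_LOOKUP = build_group_lookup(MANIFEST)


def node_info_to_group(node_info: Mapping[str, Any]) -> str:
    """Candidate for node_info_to_group_fn: look the node up in the prebuilt map."""
    return GROUP_LOOKUP.get(node_info["unique_id"], "default")


print(GROUP_LOOKUP)
```

Building the lookup once keeps the selection criteria in one place, so the jobs can then select by group from a single manifest load.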
d
Thanks @Tim Castillo ! And if you find any more examples of this please don’t hesitate to send my way! Have a nice weekend