Hey guys does `load assets from dbt manifest` already separa dagster #integration-dbt

Dennis Gera

04/14/2023, 7:41 PM

Hey guys, does `load_assets_from_dbt_manifest` already separate my dbt models with custom schemas into separate assets?

Tim Castillo

04/14/2023, 7:44 PM

It'll depend on the stage at which the manifest was built, but if the manifest already has all the namespace details, like schemas and databases, figured out, then yes. Are you experiencing something different?

Dennis Gera

04/14/2023, 8:44 PM

OK. I have the following situation:
1. I want to load my manifest file only once, with `load_assets_from_dbt_manifest`.
2. I want to create jobs using `define_asset_job`, where I can apply select and exclude statements to the assets loaded from the dbt manifest.

I was trying to do this via `AssetSelection.groups` and `AssetSelection.keys`, but got different results compared to creating a specific asset with `load_assets_from_dbt_manifest`, where I could specify select and exclude statements.

Dennis Gera

04/14/2023, 8:45 PM

I ended up having to declare multiple `load_assets_from_dbt_manifest` calls, each one with different select and exclude statements, and then loading them into the repo definition. I'm not sure this is the best way to do this.

Tim Castillo

04/14/2023, 8:50 PM

Aaaah, okay, thanks for the context! You should only have to load it once. There are a couple of resources to help you define jobs as you'd like. Have you seen the arguments available on the dbt loading functions? You can define the asset key explicitly with the `node_info_to_asset_key` parameter on the `load_assets_from_dbt_*` functions. If you'd like to organize the groups, there is also `node_info_to_group_fn`: you map your models to the exact execution groups, and your asset selection will just be that entire group. And finally, have you seen the docs on the asset selection syntax? If not, would this be helpful?

Dennis Gera

04/14/2023, 8:53 PM

Thanks Tim! Do you have examples on how to use the `node_info_to_asset_key` and `node_info_to_group_fn` arguments?

Tim Castillo

04/14/2023, 9:03 PM

Not off the cuff (but I should work on that). I think all of these mapping fn arguments have similar call signatures, so here's an example of one being used to map metadata out of a node: https://github.com/dagster-io/hooli-data-eng-pipelines/blob/4c6a744d51a9510dc9adabcfbfe99f6d066098f4/hooli_data_eng/assets/dbt_assets.py#L13 In this case, `node_info_to_asset_key` returns an asset key, which can be built like these examples. And `node_info_to_group_fn` just expects a string back and it'll create/add it to the group.
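To make the two return types concrete, here is a minimal sketch of what such mapping functions might look like. It assumes each function receives one node dictionary from the manifest (with the standard dbt fields `name`, `schema`, and `config`); the function names, the sample node, and the `"default"` fallback are hypothetical. The key function below returns a plain path list so the sketch runs without Dagster installed; in real use it would be wrapped in `AssetKey(...)`.

```python
from typing import Any, Dict, List

# Hypothetical stand-in for one entry of manifest.json["nodes"]; the field
# names ("name", "schema", "config") follow the dbt manifest format.
example_node: Dict[str, Any] = {
    "name": "orders",
    "schema": "analytics_marketing",
    "tags": ["daily"],
    "config": {"schema": "marketing"},
}


def group_from_custom_schema(node_info: Dict[str, Any]) -> str:
    """Return a group name: the model's custom schema, or "default" if unset."""
    custom_schema = (node_info.get("config") or {}).get("schema")
    return custom_schema or "default"


def key_path_from_schema(node_info: Dict[str, Any]) -> List[str]:
    """Return [resolved_schema, model_name] as the path for an asset key."""
    return [node_info["schema"], node_info["name"]]


print(group_from_custom_schema(example_node))  # marketing
print(key_path_from_schema(example_node))      # ['analytics_marketing', 'orders']

# Assumed wiring (not executed here; needs dagster and dagster-dbt installed):
#   load_assets_from_dbt_manifest(
#       manifest_json,
#       node_info_to_group_fn=group_from_custom_schema,
#       node_info_to_asset_key=lambda n: AssetKey(key_path_from_schema(n)),
#   )
```

With groups assigned this way, a job for the marketing models could select `AssetSelection.groups("marketing")` instead of repeating dbt select statements.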

Tim Castillo

04/14/2023, 9:03 PM

Ignore how that hooli example has so many `load_assets_from_dbt_project` calls that select specific models. 😅 We'd like to avoid that pattern and we're working on the ergonomics to prevent it, but I think in your case, you shouldn't need to.

Dennis Gera

04/14/2023, 9:08 PM

Alright! And when designing these mapping fn arguments, how should I define the keys and values? Would it be something like "dbt_custom_schema": "group"/"asset key" or "dbt_model_tag": "group"/"asset key"?

Dennis Gera

04/14/2023, 9:09 PM

just not sure I understood properly what keys dagster is going to look for in my mapping fn

Dennis Gera

04/14/2023, 9:21 PM

And also, Tim, is there a difference between `AssetSelection.groups("my_group").upstream()` and `AssetSelection.assets(*my_asset)`? Assuming `my_group` and `my_asset` represent the same thing.

Tim Castillo

04/14/2023, 9:24 PM

That'll depend on how you're mapping your models to decide the custom schema. The first one you mention would work, but you might be able to piggyback off of existing metadata from dbt. In this context, Dagster doesn't care what the keys are; they should just be keys that work and make sense for you and that you can select from. To be clear, you'll likely only need to do one or the other: define an explicit, easily selectable asset key, or categorize your models into groups.

Tim Castillo

04/14/2023, 9:25 PM

The selections would be different if you had more than one unrelated asset in the group.
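A toy model of that difference, as a sketch only: this is not Dagster's implementation, and the asset names, groups, and helper functions are all hypothetical. It treats a group selection followed by `.upstream()` as "every asset in the group plus everything upstream of them", and an explicit asset selection as exactly the assets listed.

```python
from typing import Dict, List, Set

# Hypothetical asset graph: asset name -> names of its direct upstream assets.
upstream_of: Dict[str, List[str]] = {
    "raw_orders": [],
    "stg_orders": ["raw_orders"],
    "orders": ["stg_orders"],
    "customers": [],  # in the same group as "orders", but unrelated to it
}

# Hypothetical group membership.
group_of: Dict[str, str] = {
    "raw_orders": "staging",
    "stg_orders": "staging",
    "orders": "marts",
    "customers": "marts",
}


def select_group(group: str) -> Set[str]:
    """Every asset that belongs to the given group."""
    return {name for name, g in group_of.items() if g == group}


def with_upstream(selected: Set[str]) -> Set[str]:
    """The selection plus all of its transitive upstream assets."""
    result = set(selected)
    stack = list(selected)
    while stack:
        for parent in upstream_of[stack.pop()]:
            if parent not in result:
                result.add(parent)
                stack.append(parent)
    return result


group_upstream = with_upstream(select_group("marts"))
explicit = {"orders"}

print(sorted(group_upstream))  # ['customers', 'orders', 'raw_orders', 'stg_orders']
print(sorted(explicit))        # ['orders']
```

In this model the group-based selection picks up both the unrelated `customers` asset and the upstream staging assets, while the explicit selection contains only `orders`.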

Dennis Gera

04/14/2023, 9:36 PM

Got it. I would probably have to load the manifest's metadata, create key-value pairs based on the criteria I choose, and use that as a mapping fn. Is that it?

Tim Castillo

04/14/2023, 9:37 PM

yesssiir!
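A rough sketch of that approach, under stated assumptions: the manifest below is a hypothetical, heavily trimmed stand-in for `target/manifest.json`, the "first tag becomes the group" criterion is just one possible choice, and the helper names are made up for illustration. It also assumes the node dictionary handed to the mapping function carries the node's `unique_id`, as manifest nodes do.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical, trimmed manifest; a real one would be read from
# target/manifest.json.
manifest_json = json.loads("""
{
  "nodes": {
    "model.shop.orders": {
      "unique_id": "model.shop.orders", "resource_type": "model",
      "name": "orders", "tags": ["finance"]
    },
    "model.shop.sessions": {
      "unique_id": "model.shop.sessions", "resource_type": "model",
      "name": "sessions", "tags": []
    },
    "test.shop.not_null_orders_id": {
      "unique_id": "test.shop.not_null_orders_id", "resource_type": "test",
      "name": "not_null_orders_id", "tags": []
    }
  }
}
""")


def build_group_lookup(manifest: Dict[str, Any], default: str = "untagged") -> Dict[str, str]:
    """Map each model's unique_id to a group name (here: its first dbt tag)."""
    lookup: Dict[str, str] = {}
    for unique_id, node in manifest["nodes"].items():
        if node["resource_type"] != "model":
            continue  # skip tests, seeds, snapshots, ...
        tags = node.get("tags") or []
        lookup[unique_id] = tags[0] if tags else default
    return lookup


def make_group_fn(lookup: Dict[str, str], default: str = "untagged") -> Callable[[Dict[str, Any]], str]:
    """Wrap the lookup in a function with the shape a group mapping fn expects."""
    def group_fn(node_info: Dict[str, Any]) -> str:
        return lookup.get(node_info.get("unique_id", ""), default)
    return group_fn


group_lookup = build_group_lookup(manifest_json)
group_fn = make_group_fn(group_lookup)

print(group_lookup)
print(group_fn(manifest_json["nodes"]["model.shop.orders"]))
```

The resulting `group_fn` is the kind of callable that could be passed as `node_info_to_group_fn`, so the manifest is still loaded only once.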

Dennis Gera

04/14/2023, 10:26 PM

Thanks @Tim Castillo ! And if you find any more examples of this please don’t hesitate to send my way! Have a nice weekend
