# integration-dbt
g
I had a simple DBT project:
models/
	- foo
	- bar
	- baz
Dagster was nicely extracting foo, bar, and baz into asset groups. However, I now need to refactor this into:
models/
	- snowflake
		- foo
		- bar
		- baz
	- postgres
		- p1
		- p2
		- p3
As you can see, there is one additional level here. How can I parse the DBT project so that the first layer (snowflake/postgres) becomes the `compute_kind` label and the second layer (foo, bar, baz, p1, p2, p3) is parsed into asset groups?
Note: these folders are only asset groups, not parts of the asset key, i.e. the physical location is inside the same schema.
Secondly, I also need to tell DBT to tag all second-level items with their first-level folder, so that Dagster can apply a system-specific IO manager.
s
1. You can provide a function for assigning group names that parses the manifest / node metadata and returns what you want (which I think is `node['fqn'][1]`) when you run `load_assets...`:
def extract_group_name_from_node(node: dict) -> str:
    """Build a Dagster group name from a dbt manifest node's fqn."""
    # fqn is [project_name, *subfolders, model_name]
    fqn = node["fqn"]
    if len(fqn) > 2:
        return f"dbt__{fqn[1]}__{fqn[2]}"
    return f"dbt__{fqn[0]}"
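For the two-level layout in the question, here is a minimal sketch of how the first and second folder levels could be pulled apart. It assumes each node's `fqn` has the shape `[project, system, group, ..., model]`. The helper name `split_fqn` and the sample values `my_project` and `orders` are hypothetical, not part of dbt or Dagster:

```python
from typing import Any, Mapping, Optional, Tuple


def split_fqn(node: Mapping[str, Any]) -> Tuple[Optional[str], Optional[str]]:
    """Return (system, group) taken from the first two folder levels of a node.

    Assumes fqn looks like [project, system, group, ..., model].
    """
    fqn = node["fqn"]
    # At least project, system, group and model are needed; otherwise the
    # node does not live two folders deep and there is nothing to split.
    if len(fqn) < 4:
        return None, None
    return fqn[1], fqn[2]


# Hypothetical node for models/snowflake/foo/orders.sql
sample = {"fqn": ["my_project", "snowflake", "foo", "orders"]}
system, group = split_fqn(sample)
print(system, group)
```

The first value could feed whatever sets the compute kind, and the second could be returned from a group-name function like the one above.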
2. What you're really asking with the two separate io_managers is how to load two separate dbt projects using only one project directory (because dbt itself doesn't have a concept of multi-compute models within a single project initialization, unless I'm mistaken). What comes to mind is to create two separate `dbt_project.yml` files and have them select different paths: `model-paths: ["models/snowflake"]` and `model-paths: ["models/postgres"]`. Then you call the load function two separate times. You'll also need to configure a different `profiles.yml` adapter for each and pass that in at load time.
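The "load twice" idea can be sketched as a small table of per-system settings, one entry per load call. This is only an illustration: the `DbtTarget` class, the `build_targets` helper, the one-directory-per-system layout, and the `<system>_io_manager` key naming are all assumptions, not anything dbt or Dagster prescribes:

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass(frozen=True)
class DbtTarget:
    """Settings for one load call (hypothetical structure)."""

    system: str
    project_dir: Path    # assumed to hold this system's dbt_project.yml
    profiles_dir: Path   # assumed to hold the profiles.yml with the matching adapter
    io_manager_key: str  # assumed naming scheme for the system-specific IO manager


def build_targets(root: Path, systems: List[str]) -> List[DbtTarget]:
    """Create one DbtTarget per system under an assumed root/<system> layout."""
    return [
        DbtTarget(
            system=name,
            project_dir=root / name,
            profiles_dir=root / name / "profiles",
            io_manager_key=f"{name}_io_manager",
        )
        for name in systems
    ]


targets = build_targets(Path("dbt"), ["snowflake", "postgres"])
for target in targets:
    # Each entry would be passed to its own call of the load function.
    print(target.system, target.project_dir, target.io_manager_key)
```

Keeping the settings in one place like this makes it harder for the two load calls to drift apart, for example by pointing at the wrong profiles directory.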
g
thanks!
I think it is then better to create a second top-level folder, since similar switches would also be required for sqlfluff.