https://dagster.io/ logo
#integration-dbt
Title
# integration-dbt
b

Brendan Jackson

05/31/2023, 12:10 PM
Hi - perhaps a naive question, but is there any way to allow for a software defined asset to depend on a dbt asset that is materialised as a view only? The motivation here is to remove much of the complexity of the IO manager into simply
select * from {table}
, as in one of the documentation examples. I imagine this is not done, because it would require the loading of the view to happen within a dbt run (where the view still exists)?
dagster bot responded by community 1
g

Guy McCombe

05/31/2023, 12:12 PM
Yeah I don’t see why not
b

Brendan Jackson

05/31/2023, 12:14 PM
Adding the dependency is fine. But would an IO manager work to load data from the dbt view?
For instance, this example -
Copy code
def load_input(self, context) -> pd.DataFrame:
        """Load the contents of a table as a pandas DataFrame."""
        table_name = context.asset_key.path[-1]
        return pd.read_sql(f"SELECT * FROM {table_name}", con=self.connection_str)
But
{table_name}
will not exist.
g

Guy McCombe

05/31/2023, 12:16 PM
Yeah that should work no problem. As long as you can run SQL against the view, you can define an io_manager like that
Is the
table_name
not just the name of your view?
b

Brendan Jackson

05/31/2023, 12:16 PM
Well that's from the documentation example, so I assume so.
Ah, perhaps you are right - the view is persisted.
How does this function if the asset is partitioned?
g

Guy McCombe

05/31/2023, 12:18 PM
You can define an io manager that handles partitions if that’s what you’re after
If you give me a bit of context on the partitions and how you want to partition your view I can see if I can give you a kick start
b

Brendan Jackson

05/31/2023, 12:20 PM
Thanks! I think I see what you mean. I have a dbt asset A partitioned into days, produced via incremental loading. I would like to create a downstream dbt asset B that does some joins/transformations, also partitioned in this way. Then I have a software defined asset C that depends on B. I would like B to be a view, for the sake of storage space mainly.
👍 1
Currently I am creating B via a dbt view using the partition fn in the dagster:
Copy code
where date > '{{ var("start_date") }}' and date <= '{{ var("end_date") }}'
Where those parameters are set by the partition in dagster.
partition_key_to_vars_fn
Were
B
just a regular table, I would adopt incremental materialisation. I can't do that here. I could specify the view without where clause, and use the IO manager to do that part?
g

Guy McCombe

05/31/2023, 12:23 PM
Yeah I’d say do that ^
👍 1
b

Brendan Jackson

05/31/2023, 12:23 PM
Thanks!
I think I was assuming I would need to make the dbt view specific to the partition, but that isn't needed at all as it's just a view.
🎯 1
g

Guy McCombe

05/31/2023, 12:27 PM
You could probably do something similar to this for your IO manager:
Copy code
def load_input(self, context) -> pd.DataFrame:
    """Load the contents of a table as a pandas DataFrame."""
    table_name = context.asset_key.path[-1]
    if context.has_partition_key:
        return pd.read_sql(f"SELECT * FROM {table_name} WHERE date={context.asset_partition_key}",
                           con=self.connection_str)
    return pd.read_sql(f"SELECT * FROM {table_name}", con=self.connection_str)
👍 1
2 Views