Sam Werbalowsky

01/11/2023, 6:20 PM
I’m working on a Dagster POC and having some trouble adapting to the asset-based model. I’m trying to just run a simple Fivetran sync and then run a SQL query to create a table based on this data. Is there a way to define an asset as a SQL query to run (in this case it’s a merge statement)? I got it to run successfully as an op, but am unsure how I can pass that downstream to something.
specifically, I’d also like to define a snowflake asset as a table but not have to use pandas

rex

01/11/2023, 7:06 PM
Hey Sam, there are a couple of ways to do this:
1. If you want the SQL transformation to happen in your Snowflake warehouse, you can have dbt run downstream of your Fivetran sync. We have a guide for setting up dbt here: https://docs.dagster.io/integrations/dbt
2. If you want to do the transformation without dbt, you can configure the `dagster-snowflake` resource and define an asset that runs the one-off SQL query in your data warehouse.
We’d probably recommend (1) if you’re planning on implementing more transformations of your Fivetran-synced data in the future.

Sam Werbalowsky

01/11/2023, 7:07 PM
due to a few process-oriented things, we actually can’t use dbt to do this, so I need to go with option 2.
so is it ok to have the asset just return the results of the query that is essentially `update my_table where….`
follow up question - are there best practices for requiring assets upstream, i.e. what would the code look like to have my one-off sql script depend on the fivetran asset? Is there some FivetranOutput that I put into the downstream asset?

rex

01/11/2023, 7:14 PM
Re: requiring assets upstream: https://docs.dagster.io/concepts/assets/software-defined-assets#non-argument-dependencies
In this case, you are not passing data between your assets, since the transformation happens entirely in your Snowflake warehouse. So you should be using `non_argument_deps` in your downstream asset.
> so is it ok to have the asset just return the results of the query
Sure - you could even just have it return `None` if you’re not planning on using the results of the asset in a downstream computation.