# integration-snowflake

Serge Gekker

03/20/2023, 11:29 PM
Curious to hear whether anyone has used Snowflake's Snowpark as part of a Dagster pipeline. I'm wondering whether it would be well suited to batch inference when the data is already in Snowflake.

Tim Castillo

03/20/2023, 11:44 PM
I personally haven't, but I believe it should be doable. You'll have to find a way to manage the sessions yourself, but there is nothing about Dagster that would prevent you from creating a Snowpark session. One thing to keep in mind is that if you are also using `dagster-snowflake`, the package does install `snowflake-connector-python`, so you'll have to be cautious of the version installed; luckily, we don't pin the version. You'll be able to take a look at the dependencies here. If you're not ready to jump into Snowpark yet, the Snowflake I/O manager supports PySpark for in-memory compute.
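For anyone reading along, "managing the sessions yourself" could look roughly like the sketch below. It is only an illustration, not something from Dagster or `dagster-snowflake`: `snowpark_session` and `session_factory` are hypothetical names, and the default factory assumes `snowflake-snowpark-python` is installed and uses its `Session.builder.configs(...).create()` call.

```python
from contextlib import contextmanager
from typing import Any, Callable, Dict, Iterator, Optional


def _default_factory(connection_parameters: Dict[str, Any]) -> Any:
    # Imported lazily so this module still loads where Snowpark is not installed.
    from snowflake.snowpark import Session

    return Session.builder.configs(connection_parameters).create()


@contextmanager
def snowpark_session(
    connection_parameters: Dict[str, Any],
    session_factory: Optional[Callable[[Dict[str, Any]], Any]] = None,
) -> Iterator[Any]:
    """Open a session, hand it to the caller, and always close it afterwards."""
    # session_factory is a hypothetical hook so a fake session can be used in tests.
    factory = session_factory or _default_factory
    session = factory(connection_parameters)
    try:
        yield session
    finally:
        # Runs even if the caller's block raises, so sessions are not leaked.
        session.close()
```

Inside an asset or op body you would then write `with snowpark_session(params) as session:` and do the Snowpark work inside the block. The session lives only for that one execution.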

Jason

03/21/2023, 7:18 PM
I've been thinking about this as well - another gotcha is that Snowpark only supports Python 3.8. So if you're already on something higher (like me) then you may have to consider downgrading if possible or using multiple code locations (adding complexity to your project if not already there)

Serge Gekker

03/23/2023, 1:44 PM
@Tim Castillo managing the session resource seems to be the problem, because a new session_id gets spun up for each asset execution. Does Dagster have a way of using a singleton pattern for ConfigurableResources?
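The thread has no answer to this, but one generic way to get singleton-like behaviour is a process-level cache, sketched below. This is plain Python rather than a Dagster feature: `get_shared_session` and `create_session` are hypothetical names. It also only shares a session within one Python process. Dagster's default multiprocess executor runs each step in its own process, so the cache would help only when the assets run in the same process, for example with the in-process executor.

```python
import threading
from typing import Any, Callable, Dict, Hashable

# Module-level cache: one entry per key, shared by everything in this process.
_sessions: Dict[Hashable, Any] = {}
_lock = threading.Lock()


def get_shared_session(key: Hashable, create_session: Callable[[], Any]) -> Any:
    """Return the cached session for `key`, creating it on first use only."""
    # The lock prevents two threads from both creating a session for one key.
    with _lock:
        if key not in _sessions:
            _sessions[key] = create_session()
        return _sessions[key]
```

A resource method could call `get_shared_session("snowpark", make_session)`, where `make_session` is whatever callable builds the Snowpark session, so repeated asset executions in the same process reuse one session instead of opening a new one each time.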