clay
03/22/2023, 10:03 PM
import pandas as pd
from dagster import OpExecutionContext, asset

@asset(required_resource_keys={"snowflake"}, compute_kind="Python")
def calc_jira_kpis(context: OpExecutionContext, jira_tickets_source_asset: pd.DataFrame) -> None:
    """
    Calculate Jira-derived KPIs.
    """
    # ... stuff
    return None
clay
03/22/2023, 10:04 PM
dagster._core.errors.DagsterInvalidDefinitionError: Input asset '["jira_tickets_source_asset"]' for asset '["calc_jira_kpis"]' is not produced by any of the provided asset ops and is not one of the provided sources
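(A minimal sketch of the kind of wiring that resolves this error, assuming the upstream table already exists in Snowflake: the upstream key has to be registered, e.g. as a SourceAsset, alongside the downstream asset. The io_manager key and connection values here are placeholders.)

from dagster import Definitions, SourceAsset
from dagster_snowflake import snowflake_resource
from dagster_snowflake_pandas import snowflake_pandas_io_manager

# register the existing table under the key the downstream asset expects,
# so Dagster can load it as an input instead of raising the error above
jira_tickets_source_asset = SourceAsset(
    key="jira_tickets_source_asset",
    io_manager_key="io_manager",
)

defs = Definitions(
    assets=[jira_tickets_source_asset, calc_jira_kpis],
    resources={
        "snowflake": snowflake_resource.configured(
            {"account": "...", "user": "...", "password": "..."}
        ),
        "io_manager": snowflake_pandas_io_manager.configured(
            {"account": "...", "user": "...", "password": "...", "database": "..."}
        ),
    },
)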
clay
03/22/2023, 10:05 PM

clay
03/22/2023, 10:40 PM

clay
03/24/2023, 1:58 PM
I need to run USE SECONDARY ROLES ALL before executing queries.

clay
03/24/2023, 2:01 PMcontext.resources.snowflake.execute_queries(sql_queries=queries)
because it doesn't first USE SECONDARY ROLES ALL
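(A possible workaround, as a sketch — it assumes execute_queries runs every statement on a single connection, so the session setting carries over to the later queries; the query list is a placeholder.)

from dagster import asset

@asset(required_resource_keys={"snowflake"})
def my_asset(context):
    queries = ["SELECT 1", "SELECT 2"]  # placeholder queries
    # prepend the session statement so it runs before the real queries
    context.resources.snowflake.execute_queries(
        sql_queries=["USE SECONDARY ROLES ALL", *queries]
    )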
Dagster Jarred
04/05/2023, 9:50 PM

Neil
04/07/2023, 9:15 PM

Joel Olazagasti
04/14/2023, 3:07 PM
Is there a way to stop assets from using key_prefix for schema information, and only use the schema value from the passed Snowflake IO manager? We run and test our pipelines in different schema configurations based on local dev, branch QA, and production, but ideally we'd like the visual organization of the assets in the UI to stay consistent between environments. So the goal would be to have static key_prefixes and dynamic io_manager schema configuration that won't overlap in values.
jamie
04/14/2023, 7:45 PM
New this week: SnowflakePandasIOManager (api docs) and SnowflakePySparkIOManager (api docs), which follow the new Pythonic Config and Resources system. The existing snowflake_pandas_io_manager and snowflake_pyspark_io_manager are still supported and will continue to be. However, we think the new Pythonic system has huge ergonomic benefits, so we recommend you check it out! You can read more about the Pythonic Config and Resources system here, and expect to see even more resources (including updated guides) in the coming weeks!
clay
04/17/2023, 4:17 PM

clay
04/19/2023, 2:38 AM
When the resource is set up with snowflake_resource.configured, are the subsequent asset materializations using the same Snowflake session?
I'm having a problem trying to run two jobs in different warehouses simultaneously, when I know I have access to both warehouses. I'm getting the Object does not exist, or operation cannot be performed. error when I try to call USE WAREHOUSE .., even though I can easily use that warehouse in DBeaver, for instance.
Is there a way to print the connection URL that the snowflake connector is using?
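(One way to keep the two jobs isolated — a sketch with placeholder config; it assumes each configured resource opens its own connection with the warehouse baked in, so nothing has to call USE WAREHOUSE at runtime.)

from dagster_snowflake import snowflake_resource

base_config = {
    "account": {"env": "SNOWFLAKE_ACCOUNT"},
    "user": {"env": "SNOWFLAKE_USER"},
    "password": {"env": "SNOWFLAKE_PASSWORD"},
}

# one resource per warehouse, bound at config time
snowflake_wh_a = snowflake_resource.configured({**base_config, "warehouse": "WAREHOUSE_A"})
snowflake_wh_b = snowflake_resource.configured({**base_config, "warehouse": "WAREHOUSE_B"})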
jamie
04/20/2023, 8:20 PM
dagster-snowflake 0.19.0 ❄️
Happy 1.3 release everyone!
• The Snowflake tutorial and reference page have been updated to use the Pythonic resources and config versions of the Snowflake I/O managers. If you are still using the pre-0.19.0 versions of these I/O managers, you can still see the older versions of the tutorial and reference page by selecting a prior version of the documentation in the dropdown at the top of the page.
• This release includes a breaking change for the SnowflakePandasIOManager and any other I/O managers using the SnowflakePandasTypeHandler. Due to a longstanding issue with the Snowflake Python connector, timestamp data cannot be directly stored in Snowflake (the data gets garbled and stored as nonsensical dates). Prior to this release, we converted all timestamp data to strings when storing dataframes, and did the reverse conversion when loading dataframes. However, the issue can also be avoided by ensuring that timestamp data has a timezone attached to it. We think this is a less invasive change, so the SnowflakePandasIOManager will now attach the UTC timezone to timestamp data that does not already have a timezone. If you would like to continue storing timestamp data as strings, or you have already materialized timestamp data and do not want to migrate the corresponding Snowflake table to a table with a TIMESTAMP column, you can set the configuration value store_timestamps_as_strings=True. If you have materialized an asset with timestamp data that was stored as strings and would like to migrate the corresponding table so that you can store timestamp data as timestamps, you can follow the instructions in the migration guide.
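(Both options from the bullet above, sketched with placeholder connection values:)

import pandas as pd
from dagster_snowflake_pandas import SnowflakePandasIOManager

# option 1: keep the pre-0.19 behavior of storing timestamps as strings
io_manager = SnowflakePandasIOManager(
    account="...", user="...", password="...",
    database="MY_DB",
    store_timestamps_as_strings=True,
)

# option 2: attach a timezone yourself before the I/O manager stores the frame
df = pd.DataFrame({"ts": pd.to_datetime(["2023-01-01", "2023-01-02"])})
df["ts"] = df["ts"].dt.tz_localize("UTC")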
Stephen Bailey
04/21/2023, 1:06 PM
from dagster import asset
from dagster_snowflake import SnowflakeResource

@asset
def my_snowflake_thing(snowflake: SnowflakeResource):
    results = snowflake.execute_query("select 1 as foo")
    results_as_df = snowflake.execute_query("select 2 as bar", as_df=True)
    # do things
Joel Olazagasti
04/25/2023, 6:28 PM
Is there a way to surface table freshness (an updated_at field or similar) from an observable_source_asset that has a Snowflake IO manager?
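(A sketch of one way to do this, using the Snowflake resource directly rather than the IO manager; the table and column names are hypothetical.)

from dagster import DataVersion, observable_source_asset

@observable_source_asset(required_resource_keys={"snowflake"})
def my_source_table(context):
    # use a high-water-mark column as the data version
    with context.resources.snowflake.get_connection() as conn:
        (last_updated,) = conn.cursor().execute(
            "SELECT MAX(updated_at) FROM my_schema.my_source_table"
        ).fetchone()
    return DataVersion(str(last_updated))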
clay
04/25/2023, 10:14 PM

clay
04/25/2023, 10:14 PM
ALTER SESSION SET query_tag = 'whatever'
clay
04/26/2023, 1:17 PM

jamie
04/27/2023, 8:46 PM

jamie
05/05/2023, 4:29 PM
New this week: SnowflakeResource, which follows the new Pythonic Config and Resources system. The existing snowflake_resource will continue to be supported. This SnowflakeResource works slightly differently than the snowflake_resource: using the new SnowflakeResource, you create a SnowflakeConnection (snowflake docs) and use that to run queries:

from dagster import asset
from dagster_snowflake import SnowflakeResource

@asset
def my_snowflake_asset(snowflake: SnowflakeResource):
    with snowflake.get_connection() as conn:
        return (
            conn.cursor()
            .execute(
                'SELECT * FROM IRIS_DATASET WHERE "petal_length_cm" < 1 AND'
                ' "petal_width_cm" < 1'
            )
            .fetch_pandas_all()
        )

Also, if you haven't yet taken a look at the GitHub discussion about handling timestamp data, please let us know your thoughts!

Dan Meyer
05/08/2023, 12:49 AM

jamie
05/12/2023, 2:34 PM
You can now base64 encode your key, and the resource and I/O manager will decode the key and use it for authentication. If you are already successfully supplying a private key, then your code is still supported and no changes need to be made.
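(For example, producing the encoded value — a sketch; the key path and env var name are hypothetical.)

import base64

# encode a PKCS#8 private key file for use as, e.g., a SNOWFLAKE_PRIVATE_KEY env var
with open("rsa_key.p8", "rb") as f:
    print(base64.b64encode(f.read()).decode())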
Joe
05/12/2023, 6:35 PM

Sebastian Charrier
05/15/2023, 12:10 AM

clay
05/16/2023, 4:10 PM

Simrun Basuita
05/17/2023, 10:32 AM
Is it possible to get the snowflake_pandas_io_manager to only load part of a table for a downstream asset? i.e. add a WHERE clause

Simrun Basuita
05/17/2023, 1:47 PM
If an asset creates its table in Snowflake itself (by running CREATE TABLE ....), how can I get downstream assets to load that using snowflake_pandas_io_manager? i.e. how do I declare that the asset lives in Snowflake?
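(One pattern that fits this, as a sketch — the key and table names are hypothetical: declare the externally created table as a SourceAsset whose key maps to the schema and table, and let the Snowflake IO manager handle loading it.)

import pandas as pd
from dagster import SourceAsset, asset

# resolves to the my_schema.my_table table when loaded by the Snowflake IO manager
my_table = SourceAsset(key=["my_schema", "my_table"])

@asset
def downstream(my_table: pd.DataFrame) -> pd.DataFrame:
    # my_table arrives as a DataFrame loaded by snowflake_pandas_io_manager
    return my_table.head()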
Rohan Meringenti
05/25/2023, 4:02 PM
I'm using the snowflake_pandas_io_manager as the docs say, along with time partitions, but it looks like an @asset always recreates the table. I was wondering what the suggested route is for updating a table on a daily basis in Snowflake?
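(For reference, the partitioned setup looks roughly like this sketch, with hypothetical names. With partition_expr in the asset metadata, the Snowflake IO manager replaces only the rows for the partition being materialized instead of recreating the whole table.)

import pandas as pd
from dagster import DailyPartitionsDefinition, asset

@asset(
    partitions_def=DailyPartitionsDefinition(start_date="2023-01-01"),
    metadata={"partition_expr": "updated_at"},  # column the IO manager filters on
)
def daily_metrics(context) -> pd.DataFrame:
    day = context.partition_key  # e.g. "2023-05-25"
    # hypothetical: compute just this day's rows
    return pd.DataFrame({"updated_at": [pd.Timestamp(day, tz="UTC")], "value": [1]})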
Stephen Bailey
05/26/2023, 11:26 AM
I expected to be able to run foo1 from here, but it failed; then I expected to run foo2 like here, but it also failed. Then I tried foo3 from here, and that worked for me.
The pattern I'd prefer to encourage others to use is the first -- is that supported?

from dagster import Definitions, asset, OpExecutionContext
from dagster_snowflake import SnowflakeResource

@asset
def foo(context: OpExecutionContext, snowflake: SnowflakeResource):
    result = snowflake.execute_query("select 1")
    context.log.info(result)

@asset
def foo2(context: OpExecutionContext, snowflake: SnowflakeResource):
    with snowflake.get_connection() as conn:
        result = conn.execute_query("select 1")
        context.log.info(result)

@asset
def foo3(context: OpExecutionContext, snowflake: SnowflakeResource):
    with snowflake.get_connection() as conn:
        result = conn.cursor().execute("select 1")
        context.log.info(result)

defs = Definitions(
    assets=[foo, foo2, foo3],
    resources={
        "snowflake": SnowflakeResource(
            account="whatnot-prod",
            user=SNOWFLAKE_USER,
            password=SNOWFLAKE_PASSWORD,
            role="STAGE_ADMIN_ROLE",
            warehouse="STAGE_LOAD_WH",
            database="STAGE",
        )
    },
)
Matt Clarke
05/31/2023, 12:30 PM