Seth Kimmel
02/06/2023, 5:57 PM
clay
02/06/2023, 6:19 PM
non_argument_deps to make sure dependencies were properly tracked.
clay
02/06/2023, 6:19 PM
Seth Kimmel
02/06/2023, 7:31 PM
clay
02/06/2023, 7:35 PM
Stephen Bailey
02/06/2023, 7:35 PM
jamie
02/06/2023, 7:40 PM
> it’s no longer “tracked” within the asset ecosystem
can you elaborate a bit more on what you mean by this? keeping the snowflake query as an op would lose some of the additional features around assets, but I just want to make sure that’s what you’re referring to
jamie
02/06/2023, 7:44 PM
non_argument_deps and executing the query directly in snowflake seems like the way to go. that’ll allow you to integrate the asset in with the rest of your assets and set it as an upstream dependency, take advantage of freshness policies, asset reconciliation, etc
Seth Kimmel
02/06/2023, 7:57 PM
Stephen Bailey
02/06/2023, 8:18 PM
{"name": "..."}, etc. can be useful for downstream operations to take advantage of, especially when you get into cross-system dependencies -- for example, an upstream asset that is going to need to know the name of the snowflake table, but not the actual df of its contents.
Seth Kimmel
02/06/2023, 8:21 PM
Stephen Bailey
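02/06/2023, 8:22 PM
from dagster import asset

# `sagemaker` stands for a hypothetical client wrapper; its methods are illustrative.

@asset(non_argument_deps={"snowflake_table_1_key", "snowflake_table_2_key"})
def training_job():
    # Depends on the Snowflake tables by key; returns job metadata, not the data itself.
    job_id = sagemaker.execute_training_job(...)
    results_dict = sagemaker.get_training_job(job_id)
    return results_dict

@asset
def model(training_job):
    # Only the upstream job's name is needed to create the model.
    model_id = sagemaker.create_model(training_job_id=training_job["name"])
    results_dict = sagemaker.get_model(model_id)
    return results_dict

@asset
def endpoint(model):
    ...
    return results_dict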
keeping the lineage clean is really useful, as it lets you build cross-system lineage, which is where assets really pay off imo
Seth Kimmel
02/06/2023, 8:26 PM
sandy
02/07/2023, 12:17 AM
None, we consider it a software-defined asset because it defines how to produce a particular data asset. By using the @asset decorator with a None output, the developer is kind of agreeing to a "contract" with the framework that they will materialize the asset when the decorated function is invoked.
IO managers make it more ergonomic to write the code that materializes that data asset, but they're not fundamental to the paradigm.
Here's a little more on this subject: https://docs.dagster.io/tutorial/assets/non-argument-deps#assets-without-arguments-and-return-values
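
To make the "contract" idea concrete, here is a minimal sketch in plain Python. It does not use Dagster at all: sqlite3 stands in for Snowflake, and the table and function names (orders, daily_totals, build_daily_totals, largest_day, demo) are made up for illustration. The point it tries to show is that a function can return None and still "produce" a data asset, as long as it materializes the table as a side effect, and that a downstream step can depend on that table by name only. In Dagster, the first function would presumably carry the @asset decorator and the second would declare the dependency through non_argument_deps, as in the snippet earlier in the thread.

```python
import sqlite3


def build_daily_totals(conn: sqlite3.Connection) -> None:
    """Materialize the daily_totals table inside the database and return nothing."""
    conn.execute("DROP TABLE IF EXISTS daily_totals")
    conn.execute(
        """
        CREATE TABLE daily_totals AS
        SELECT day, SUM(amount) AS total
        FROM orders
        GROUP BY day
        """
    )


def largest_day(conn: sqlite3.Connection) -> str:
    """Downstream step: relies on daily_totals by name, not on a return value."""
    row = conn.execute(
        "SELECT day FROM daily_totals ORDER BY total DESC LIMIT 1"
    ).fetchone()
    return row[0]


def demo() -> str:
    # In-memory database standing in for the warehouse (an assumption of this sketch).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (day TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("mon", 10.0), ("mon", 5.0), ("tue", 12.0)],
    )
    # The upstream step returns None; the table itself is the output.
    result = build_daily_totals(conn)
    assert result is None
    return largest_day(conn)
```

Calling demo() runs the upstream step first and then the downstream query, which mirrors the ordering a dependency declaration would enforce. No data is passed between the two functions in Python; the query result stays in the database, which is the trade-off discussed above.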