Michael Hood
07/13/2023, 2:16 PM
col1, col2, my_df) and then the Op log_sums logs the results of a simple transformation.
My question is if there is a way to not have to restate in the job definition that my_df takes col1 and col2 as upstream dependencies before I can pass the result of my_df into log_sums?
It seems to me that I have already specified the dependencies between the assets. In this case, it is not that big of a deal since we are only talking about a small number of assets, but this could be rather tedious to do with a much larger DAG of assets.
I figure there might be a way to do something succinct like:
@job
def log_sums_job():
    df = do_something_to_materialize_result(my_df)
    log_sums(df)
Anyway, this might just be a conceptual misunderstanding on my part, but I appreciate any suggestions or pointers.
Zach
07/13/2023, 3:21 PM
define_asset_job, which will handle inferring the inputs to each asset. I also don't think you can really mix ops and assets - https://docs.dagster.io/concepts/ops-jobs-graphs/jobs#from-software-defined-assets
Michael Hood
07/13/2023, 3:23 PM
"I also don't think you can really mix ops and assets"
I was beginning to suspect something like this, but I hadn't seen it explicitly stated.
Michael Hood
07/13/2023, 3:25 PM
Zach
07/13/2023, 3:29 PM
chris
07/13/2023, 4:48 PM