# ask-community
m
Hi, all. I have a question that I've been trying to get answered on my own for the last couple of days. I created this example to better illustrate the situation I'm thinking about. I have a very simple job that produces some Assets (`col1`, `col2`, `my_df`) and then the Op `log_sums` logs the results of a simple transformation. My question is if there is a way to not have to restate in the job definition that `my_df` takes `col1` and `col2` as upstream dependencies before I can pass the result of `my_df` into `log_sums`? It seems to me that I have already specified the dependencies between the assets. In this case, it is not that big of a deal since we are only talking about a small number of assets, but this could be rather tedious to do with a much larger DAG of assets. I figure there might be a way to do something succinct like:
@job
def log_sums_job():
    df = do_something_to_materialize_result(my_df)
    log_sums(df)
Anyways, this might just be a conceptual misunderstanding on my part. But I appreciate any suggestions or pointers.
z
Dependency inference does indeed work the way you think it should, but you're not supposed to mix assets into `@job` definitions. `@job` definitions are for defining op-based DAGs. To define an asset job you use `define_asset_job`, which will handle inferring the inputs to each asset. I also don't think you can really mix ops and assets - https://docs.dagster.io/concepts/ops-jobs-graphs/jobs#from-software-defined-assets
m
"I also don't think you can really mix ops and assets"
I was beginning to suspect something like this, but I hadn't seen it explicitly stated.
Somehow, the distinction between an op-based job and an asset-based one had escaped me.
z
Yeah, I agree that this could be better highlighted. It comes up pretty often in this channel: folks try to mix assets and ops, since nothing really prevents you from doing so until you get weird errors, and it's not really documented.
c
You actually kind of can mix ops and assets; zach is totally right regarding the way dependencies work, but you can also specify a dependency to your downstream op job on my_df: https://github.com/dagster-io/dagster/discussions/10802