
Philipp

04/04/2023, 7:49 AM
I have a question regarding assets, dependencies and jobs. I thought that if I have a DAG A->B->C and I schedule the materialization of C, Dagster would infer the dependencies and materialize A and B as well, since C depends on them. What I observe instead is that C fails because B is unavailable. I see that I can overcome this by passing selection="*" to define_asset_job, but I wonder whether this is the right approach: doing so also includes assets that are not required, whereas being more selective means resolving the DAG by hand, which couples the job definition tightly to the DAG. By specifying a job, I really only want to say "I want C, and I don't care what C depends on, do the needful". Is my conclusion correct, or am I using Dagster incorrectly?

Vinnie

04/04/2023, 10:08 AM
You can use the AssetSelection syntax with define_asset_job (https://docs.dagster.io/_apidocs/assets#dagster.AssetSelection). Something like the following should work:
define_asset_job(..., selection=AssetSelection.keys("C").upstream())

Philipp

04/05/2023, 1:39 PM
Looks great!
My mental model was somewhat different: I thought that in a graph only the terminal nodes are of interest. But within a job I can materialize any node, and therefore I need to specify which ones I want.
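That mental model can be sketched in plain Python, independent of Dagster: any node may be the target, and the selection is the target plus all of its ancestors. The function name upstream_closure and the deps mapping are hypothetical, and this is only a conceptual illustration, not how Dagster resolves selections internally.

```python
def upstream_closure(deps, target):
    """Return the target together with every node it transitively depends on.

    deps maps each node to the list of nodes it directly depends on.
    """
    seen = set()
    stack = [target]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        # Walk towards the roots by following direct dependencies.
        stack.extend(deps.get(node, ()))
    return seen


# The A->B->C chain from the thread, plus a made-up node D that nothing needs.
deps = {"A": [], "B": ["A"], "C": ["B"], "D": []}
```

Asking for C yields A, B and C; asking for the non-terminal node B yields only A and B. In both cases D stays out, which is the difference from selecting everything.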