Hi all. I came across this issue regarding incremental re-execution and was wondering how it relates to intermediates. Isn't the functionality requested in the issue already covered by them?
12/10/2020, 9:44 PM
Thanks @Dimitris Stafylarakis for asking this question! CC-ing @sandy who can share more
12/10/2020, 9:50 PM
Hi @Dimitris Stafylarakis - intermediates enable runs to use objects as input that were produced as output by prior runs.
If I understand what stratospark is asking for in that issue, it's the ability to have Dagster automatically decide which steps to run based on what objects exist.
Which of those are you interested in?
12/10/2020, 10:22 PM
Not sure I follow entirely.
Is it about re-using objects within the same pipeline (intermediates) versus using them in different pipelines?
12/11/2020, 12:07 AM
The difference is: how do we decide what steps to run?
1: The user directly tells Dagster which steps should run.
2: The user tells Dagster "decide which steps to run". Dagster then looks on the filesystem and runs only the steps whose outputs don't already exist as files there.
We support (1), but only have experimental support for (2). In both cases, steps can use the files on the filesystem as inputs, even if they came from different runs.
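To make the two modes concrete, here is a rough sketch in plain Python. It is not Dagster's API: `steps_to_run`, `output_path`, and the `.out` file naming are hypothetical names made up for illustration, and it assumes each step writes exactly one output file.

```python
import os


def output_path(base_dir, step):
    # Hypothetical convention: each step stores its output at <base_dir>/<step>.out
    return os.path.join(base_dir, step + ".out")


def steps_to_run(all_steps, base_dir, selected=None):
    """Return the steps to execute, preserving pipeline order.

    Mode (1): the user passes `selected`, and exactly those steps run.
    Mode (2): `selected` is None, so we check the filesystem and run only
    the steps whose output file does not exist yet.
    """
    if selected is not None:
        # Mode (1): the user directly tells us which steps should run.
        return [step for step in all_steps if step in selected]
    # Mode (2): decide automatically based on which outputs already exist.
    return [
        step
        for step in all_steps
        if not os.path.exists(output_path(base_dir, step))
    ]
```

In both modes a step that does run could still read the `.out` files left by earlier runs as its inputs; the only difference is who picks the list of steps.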
Does that make sense?