#ask-community

Diego Fabiano

08/25/2023, 8:07 PM
I was wondering if there's a way to change the default behavior of `upstream_op().map(downstream_op)`, where Dagster waits to collect everything `upstream_op` yields before it starts queuing the parallel `downstream_op` runs for each output?

jamie

08/25/2023, 8:12 PM
Just to make sure I’m understanding: do you want all of upstream_op to finish before any of the downstream ops start? Or do you want a downstream_op to start as soon as its corresponding upstream_op Output has been yielded, even if other outputs are still being yielded?

Diego Fabiano

08/25/2023, 8:19 PM
The latter one for sure! (Oversimplified, but) I have an op that collects documents from a source and yields them as dynamic outputs. I would want the corresponding downstream op for each document to start right away. Right now it seems to wait until all the outputs have been collected by the io_manager.

jamie

08/25/2023, 8:23 PM
Yeah, it will wait for all of the outputs to be yielded. I don’t think there’s really a way around this right now. It comes down to how the execution graph is constructed: basically, we can’t create the downstream_op steps until we know how many outputs upstream_op has yielded.
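
To make the ordering described above concrete, here is a rough plain-Python analogy. It is not Dagster code and not how Dagster's executor is actually implemented. The function names mirror the ones in the thread, while `run_collect_then_map` and the `events` log are hypothetical helpers invented for this sketch. It only shows the "drain everything first, then fan out" order that the thread describes.

```python
from typing import Iterator, List, Tuple


def upstream_op() -> Iterator[Tuple[str, str]]:
    """Hypothetical stand-in for an op that yields dynamic outputs.

    Each item is a (mapping_key, value) pair.
    """
    for i in range(3):
        yield f"doc_{i}", f"document {i}"


def downstream_op(doc: str) -> str:
    """Hypothetical per-document processing step."""
    return doc.upper()


def run_collect_then_map(events: List[str]) -> List[str]:
    """Drain upstream_op fully, then run downstream_op once per output."""
    # Phase 1: collect every output. Until the generator is exhausted,
    # the number of downstream steps is unknown.
    outputs = []
    for key, value in upstream_op():
        events.append(f"yielded {key}")
        outputs.append((key, value))

    # Phase 2: the output count is now known, so one downstream step
    # per output can be created and run.
    results = []
    for key, value in outputs:
        events.append(f"started downstream for {key}")
        results.append(downstream_op(value))
    return results


events: List[str] = []
results = run_collect_then_map(events)
```

After running this, `events` lists all three "yielded" entries before the first "started downstream" entry, which is the behavior Diego observed. In this toy version the downstream steps run one after another rather than in parallel.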