Hi! Is there a way to tell dagster to parallelize ...
# ask-community
k
Hi! Is there a way to tell dagster to parallelize graphs within a job, but run each graph in one process? Let's say I have 4 graphs bundled into a job that are supposed to be run in parallel and have (nothing-)dependencies between them. However, each graph X handles an external program session that cannot be serialized and passed around between nodes of graph X. Because of that, if I want to use the multiprocess executor, I'm forced to pack all work in graph X into literally one `op`, although I'd like to still divide it into multiple units of computation. Thanks! P.S.: In addition, if there's a way to control concurrency of graphs/operators as suggested here and here, please let me know 🙂
o
Hi @Kobroli! This is definitely something we've had people ask for in the past (and are thinking about on our end), but I couldn't find an exact issue for it, so I created a new one here: https://github.com/dagster-io/dagster/issues/7251. I think for now, your best bet is the solution you suggested (despite its drawbacks 😞). Similarly, we don't currently support concurrency limits for specific ops/graphs in the multiprocess_executor.
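For what it's worth, the single-op workaround can still be organized into separate steps inside that op. Below is a minimal, dependency-free sketch of the idea, not an official Dagster pattern: `ExternalSession`, `open_session`, the step functions, and `run_graph_x` are all hypothetical names standing in for the real session and work. The function `run_graph_x` is what would become the body of the one op, so the session never has to leave its process.

```python
from contextlib import contextmanager


class ExternalSession:
    """Hypothetical stand-in for a non-serializable external program session."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def send(self, command):
        # A real session would talk to the external program here.
        self.log.append(f"{self.name}:{command}")

    def close(self):
        self.log.append(f"{self.name}:closed")


@contextmanager
def open_session(name):
    """Open the session and make sure it is closed even if a step fails."""
    session = ExternalSession(name)
    try:
        yield session
    finally:
        session.close()


# Each unit of computation is a plain function that receives the live session.
def prepare(session):
    session.send("prepare")


def compute(session):
    session.send("compute")


def export(session):
    session.send("export")


def run_graph_x(name="graph_x"):
    # Assumption: in Dagster, this function body would be the single op for
    # graph X, so all steps share one process and one session.
    with open_session(name) as session:
        for step in (prepare, compute, export):
            step(session)
    return session.log
```

Calling `run_graph_x()` runs the three steps in order against one session and closes it at the end. The trade-off is the one mentioned above: Dagster sees a single op, so per-step retries and per-step visibility in the UI are lost.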