Kalyan katamreddi

02/09/2022, 3:27 AM
Hi all, I have an op that extracts a large amount of data from an external data source by running a SQL query. I tried to speed this up by splitting it into multiple partition queries and yielding each query result as a dynamic output. We want to run these queries in parallel to reduce the execution time. We tried multiprocess execution via graph.to_job, but the queries still execute sequentially. Is there a way to run these queries in parallel using Dagster?
daniel

02/09/2022, 3:34 AM
Hi, the multiprocess executor should allow multiple ops to run in parallel if they don't have any dependencies on each other. Is each of the queries that you mention happening in its own op? If they're all within the same op, the multiprocess executor won't help
Kalyan katamreddi

02/09/2022, 4:04 AM
Yes, they are part of the same op. Thanks for your quick response.
I just want to confirm: is this true even if the op is a generator (dynamic_output)?
daniel

02/09/2022, 2:46 PM
That's correct. It sounds like you would be interested in the feature request here, though: https://github.com/dagster-io/dagster/issues/4041
Kalyan katamreddi

02/10/2022, 3:23 AM
Sure, we will check this. Thank you.