I have an op that extracts a large amount of data from an external data source by running a SQL query. I tried solving this by creating multiple partition queries and yielding each query result as a dynamic output. We want to run these queries in parallel to reduce the execution time. We have tried multiprocess execution via graph.to_job, but the queries still execute sequentially. Is there a way to run these queries in parallel using Dagster?
02/09/2022, 3:34 AM
Hi, the multiprocess executor should allow multiple ops to run in parallel if they don't have any dependencies on each other. Is each of the queries that you mention happening in its own op? If they're all within the same op, the multiprocess executor won't help, since it parallelizes across ops rather than within a single op.
02/09/2022, 4:04 AM
Yes, they are part of the same op.
Thanks for your quick response.
I just want to confirm whether this is true even if the op is a generator that yields dynamic outputs?