Linton Ye
02/20/2024, 7:58 PM
Scout
02/20/2024, 7:59 PM
Linton Ye
02/20/2024, 8:00 PM
Scout
02/20/2024, 8:00 PM
`multiprocess_executor` for your job, which allows each run to execute in a separate process. However, within each run, the steps will still execute sequentially in the same process. Here's how you can configure your job to use the `multiprocess_executor`:

```python
from dagster import job, op, multiprocess_executor


@op
def my_op():
    # Your op logic here
    pass


@job(executor_def=multiprocess_executor)
def my_job():
    my_op()
```

When you execute `my_job`, each run will be in its own process, but the steps within that run will execute in the same process, one after the other. This setup allows you to have parallelism at the run level while maintaining sequential execution of steps within each run.
Linton Ye
02/20/2024, 8:13 PM
Scout
02/20/2024, 8:13 PM
`multiprocess_executor` in Dagster allows for running each step of a pipeline in a separate process, which is not what you're looking for since you want all steps of a run to execute in the same process. If you want to start multiple runs in different processes while keeping all steps of a run in one process, you can use the `subprocess_executor`. This executor is not part of the standard executors provided by Dagster, but you can implement a custom executor that starts each run in a separate subprocess. However, as of my last update, Dagster does not provide a built-in `subprocess_executor` that directly supports this behavior.
You would typically manage separate runs in different processes at a higher level, outside of Dagster's execution environment, such as by using a workflow orchestrator, a job scheduler, or a script that invokes Dagster runs in separate processes.
If you want to ensure that multiple runs are executed in parallel, each in its own process, you might consider using Dagster's support for run launchers and executors that integrate with systems like Dask, Kubernetes, or Celery, which can manage the distribution of runs across multiple processes or nodes.
For the most accurate and up-to-date information, please refer to the official Dagster documentation or check for any recent updates that might have introduced new features or changes in behavior.
Linton Ye
02/20/2024, 8:16 PM
Scout
02/20/2024, 8:16 PM
`MultiprocessExecutor`, it can start multiple runs in parallel, each in its own process. Within each of those runs, the steps will execute sequentially when using the in-process executor.
To achieve parallel execution of multiple runs, each with steps executing sequentially within the run, you would configure your Dagster instance to use a run launcher that supports parallel execution of runs. The actual execution of steps within each run would still be managed by the in-process executor, ensuring that steps do not run in parallel within a single run.
For more details on how to configure run launchers and executors to achieve the desired parallelism at the run level, please refer to the Dagster documentation on run launchers and executors.
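
As a rough illustration of the "script that invokes Dagster runs in separate processes" idea from this thread, here is a minimal sketch that starts several runs through the `dagster job execute` CLI, one OS process per run. It is only a sketch under assumptions that are not from the conversation: the `dagster` command is on the PATH, the job lives in a hypothetical file such as `jobs.py`, and the job is defined with `executor_def=in_process_executor` so that its steps stay in one process (the default executor may otherwise spawn a process per step). The helper names `build_run_command` and `launch_runs` are made up for this example.

```python
import subprocess
from typing import List


def build_run_command(python_file: str, job_name: str) -> List[str]:
    """Build the `dagster job execute` command line for a single run.

    Assumes the `dagster` CLI is installed and on the PATH.
    """
    return ["dagster", "job", "execute", "-f", python_file, "-j", job_name]


def launch_runs(python_file: str, job_name: str, num_runs: int) -> List[int]:
    """Start `num_runs` runs concurrently, one OS process per run.

    Returns the exit code of each process, in launch order.
    """
    # Popen returns immediately, so every run is started before any is awaited.
    processes = [
        subprocess.Popen(build_run_command(python_file, job_name))
        for _ in range(num_runs)
    ]
    # Block until each child process has finished and collect its exit code.
    return [process.wait() for process in processes]
```

Calling `launch_runs("jobs.py", "my_job", 3)` would then start three runs side by side, with file and job names being placeholders. For anything beyond a quick experiment, a run launcher or run coordinator configured on the Dagster instance is likely the more robust route, as the answers above suggest.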