#ask-ai

Linton Ye

02/20/2024, 7:58 PM
If I start two runs with in-process executor, will they run sequentially?

Scout

02/20/2024, 7:59 PM
Yes, if you start two runs with the in-process executor, they will run sequentially. The in-process executor is designed to execute ops serially within a single process. This means that if you submit multiple runs, each run will wait for the previous one to complete before starting. This behavior ensures that runs do not execute concurrently when using the in-process executor, which is suitable for scenarios where you want to avoid parallel execution and the associated overheads.

Linton Ye

02/20/2024, 8:00 PM
Is there a way to start two runs in different processes while keeping all steps in a run in one process? @Scout

Scout

02/20/2024, 8:00 PM
Yes, you can start two runs in different processes while keeping all steps within each run in a single process. To achieve this, you can use the `multiprocess_executor` for your job, which allows each run to execute in a separate process. However, within each run, the steps will still execute sequentially in the same process. Here's how you can configure your job to use the `multiprocess_executor`:

```python
from dagster import job, op, multiprocess_executor


@op
def my_op():
    # Your op logic here
    pass


@job(executor_def=multiprocess_executor)
def my_job():
    my_op()
```

When you execute `my_job`, each run will be in its own process, but the steps within that run will execute in the same process, one after the other. This setup allows you to have parallelism at the run level while maintaining sequential execution of steps within each run.

Linton Ye

02/20/2024, 8:13 PM
this seems wrong @Scout

Scout

02/20/2024, 8:13 PM
I apologize for any confusion. To clarify, the `multiprocess_executor` in Dagster allows for running each step of a pipeline in a separate process, which is not what you're looking for since you want all steps of a run to execute in the same process.

If you want to start multiple runs in different processes while keeping all steps of a run in one process, you can use the `subprocess_executor`. This executor is not part of the standard executors provided by Dagster, but you can implement a custom executor that starts each run in a separate subprocess. However, as of my last update, Dagster does not provide a built-in `subprocess_executor` that directly supports this behavior.

You would typically manage separate runs in different processes at a higher level, outside of Dagster's execution environment, such as by using a workflow orchestrator, a job scheduler, or a script that invokes Dagster runs in separate processes. If you want to ensure that multiple runs are executed in parallel, each in its own process, you might consider using Dagster's support for run launchers and executors that integrate with systems like Dask, Kubernetes, or Celery, which can manage the distribution of runs across multiple processes or nodes.

For the most accurate and up-to-date information, please refer to the official Dagster documentation or check for any recent updates that might have introduced new features or changes in behavior.

Linton Ye

02/20/2024, 8:16 PM
In fact, jobs that use the in-process executor can indeed run in parallel; you are wrong again @Scout

Scout

02/20/2024, 8:16 PM
I apologize for the confusion earlier. You are correct; while the in-process executor runs steps sequentially within a single process for a given run, multiple runs can indeed execute in parallel if they are managed by a system that supports concurrency, such as a run launcher that dispatches each run to a separate process.

In Dagster, the run launcher is responsible for determining how runs are initiated. If you use a run launcher that supports concurrency, such as the built-in `DefaultRunLauncher`, it can start multiple runs in parallel, each in its own process. Within each of those runs, the steps will execute sequentially when using the in-process executor.

To achieve parallel execution of multiple runs, each with steps executing sequentially within the run, you would configure your Dagster instance to use a run launcher that supports parallel execution of runs. The actual execution of steps within each run would still be managed by the in-process executor, ensuring that steps do not run in parallel within a single run.

For more details on how to configure run launchers and executors to achieve the desired parallelism at the run level, please refer to the Dagster documentation on run launchers and executors.