question about running jobs in parallel: To store ...
# ask-community
j
question about running jobs in parallel: To store intermediate values in memory, instead of on-disk, you can use `mem_io_manager`. For `mem_io_manager` to work, all the ops within an execution need to share the same process, which requires using the `in_process_executor`. but what happens is that i have parallel ops which end up running in series. is there a way to have the mem_io_manager work with multiple processes?
t
I do not fully understand the question. Could you rephrase it? As I understand it: no, there is no way, because each new process gets a fresh instance of mem_io_manager, and you cannot swap the io_manager or executor based on which ops would run in parallel.
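To illustrate the "fresh instance per process" point, here is a minimal sketch in plain Python rather than Dagster. The module-level `STORE` dict is a hypothetical stand-in for an in-memory IO manager, and `producer` / `consumer` stand in for an upstream and a downstream op. It assumes a POSIX system, since it uses the `fork` start method.

```python
import multiprocessing as mp

# Hypothetical stand-in for an in-memory IO manager: a plain dict that
# lives in the memory of whichever process touches it.
STORE = {}


def producer():
    # "Upstream op": writes its output to the in-memory store.
    STORE["upstream"] = 42


def consumer(queue):
    # "Downstream op": reads the upstream value from its own copy of the
    # store and reports what it found back to the parent.
    queue.put(STORE.get("upstream"))


def run_in_one_process():
    # Both steps share the same process, so they share the same dict.
    STORE.clear()
    producer()
    return STORE.get("upstream")


def run_in_separate_processes():
    # Each step runs in its own child process (assumption: POSIX "fork").
    STORE.clear()
    ctx = mp.get_context("fork")

    p1 = ctx.Process(target=producer)
    p1.start()
    p1.join()  # the write happened only in the first child's memory

    queue = ctx.Queue()
    p2 = ctx.Process(target=consumer, args=(queue,))
    p2.start()
    result = queue.get(timeout=30)
    p2.join()
    return result


if __name__ == "__main__":
    print(run_in_one_process())         # 42
    print(run_in_separate_processes())  # None
```

The second call returns `None` because the value written by the first child never leaves that child's memory, which is the same reason an in-memory IO manager cannot hand values between steps that run in different processes.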
j
this is what i have currently, i was hoping to run the op on the left at the same time as the ops on the right
@Tobias Pankrath would i need to set up a different process for the ops that i want to run together?
s
would it be accurate to say that you'd like a subset of ops to run in the same process, but other ones to run in a different process? this isn't currently possible (unless you write your own custom Executor, which is not simple), but it's something that we've discussed
j
@sandy i guess that would be the description, although to be more specific, i would like the ability for ops to run concurrently when i use in-memory io managers. the way it works now, the ops on the right wait for the op on the left to finish
@sandy which doesn't make sense, because the one on the left is not dependent on those on the right
s
> although to be more specific, i would like the ability for ops to run concurrently when i use in-memory io managers
this is ultimately kind of a limitation of Python - it doesn't have great support for concurrency with shared memory
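As a rough illustration of that trade-off, the sketch below uses plain Python threads, not a Dagster executor. Threads do share memory, so an independent "left" op and a chain of "right" ops can both write to one in-memory dict while running concurrently. All names here (`STORE`, `left_op`, `right_op_1`, `right_op_2`, `right_chain`) are hypothetical. Note that in standard CPython builds the global interpreter lock keeps CPU-bound threads from executing Python code in parallel, which is part of why this is not a full answer.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory store shared by every thread in this process.
STORE = {}


def left_op():
    # Independent op: does not read anything produced by the right side.
    STORE["left"] = sum(range(1000))


def right_op_1():
    STORE["right_1"] = 10


def right_op_2():
    # Depends on right_op_1, so it must run after it.
    STORE["right_2"] = STORE["right_1"] + 1


def right_chain():
    # The dependent ops run in order inside a single thread.
    right_op_1()
    right_op_2()


def run_concurrently():
    STORE.clear()
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(left_op), pool.submit(right_chain)]
        for future in futures:
            future.result()  # re-raise any exception from the worker
    return dict(STORE)


if __name__ == "__main__":
    print(run_concurrently())
```

Both branches see the same `STORE` because they live in one process; the cost is that only I/O-bound work benefits much from this kind of concurrency.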
j
@sandy @Tobias Pankrath ok so given the possibility of using k8s or docker executors, is there a way to let's say group certain steps into one container and then run them in parallel that way?
t
I am new to dagster myself, so sandy will have to answer this, but my guess is: only at the level of jobs
s
that's not currently possible, but we'd like to ultimately add this capability