# ask-community
Carlos Sanoja:
Hello team, how are you? I have been working on a pipeline that generates subtasks to process blocks of information through the same solids, so the process would be:
info 1 --> solid1 --> solid2 --> solid3
info 2 --> solid1 --> solid2 --> solid3
info 3 --> solid1 --> solid2 --> solid3
...
and so on. To do this I am generating a dynamic output, since I don't know beforehand how many subtasks will be generated (it can be 3, 7, or any number). The problem I have is that when several subtasks are processed in parallel the machine runs out of memory and Dagster kills the processes, because solid1 runs a machine learning model.

My question: looking at the attached image, and taking into account the tests I have done, I notice that Dagster automatically spawns 4 subprocesses and launches the next ones as those finish. In this example I have 7 inputs, so it runs 4 of them and then the remaining 3 once the previous ones are finished. Is there any way to limit or control this parallelization from the top level? Currently I generate the dynamic outputs and use the .map() function to apply the solids to each output. I hope you can help me.
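For reference, a minimal sketch of the fan-out pattern described above, assuming the legacy solid API; the `fan_out` solid, the block values, and the solid bodies here are illustrative, not the actual code:

```python
from dagster import DynamicOutput, DynamicOutputDefinition, pipeline, solid

# Fan out: yield one DynamicOutput per block of information.
# In practice the number of blocks is not known beforehand.
@solid(output_defs=[DynamicOutputDefinition()])
def fan_out(context):
    blocks = ["info1", "info2", "info3"]
    for i, block in enumerate(blocks):
        yield DynamicOutput(block, mapping_key=f"block_{i}")

@solid
def solid1(context, block):
    # Stand-in for the memory-hungry ML model step.
    return block

@solid
def solid2(context, block):
    return block

@solid
def solid3(context, block):
    return block

@pipeline
def process_blocks():
    # .map() applies the solid1 -> solid2 -> solid3 chain to each dynamic output.
    fan_out().map(lambda block: solid3(solid2(solid1(block))))
```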
George Pearse:
Hi @Carlos Sanoja, I think you're after max_concurrent, which you can set in the config: https://docs.dagster.io/_apidocs/execution#dagster.multiprocess_executor
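A minimal sketch of how that config might be supplied, assuming the legacy execute_pipeline API; the value 2 and the process_blocks pipeline from the sketch above are illustrative:

```python
from dagster import DagsterInstance, execute_pipeline

run_config = {
    "execution": {
        "multiprocess": {
            "config": {
                # At most 2 solid subprocesses alive at once.
                # The default of 0 falls back to the machine's CPU count,
                # which would explain the 4-at-a-time behaviour observed above.
                "max_concurrent": 2,
            }
        }
    }
}

# The multiprocess executor needs a persistent DagsterInstance; depending
# on the Dagster version, you may also need filesystem-based IO
# (e.g. fs_io_manager) so subprocesses can exchange outputs.
execute_pipeline(
    process_blocks,
    run_config=run_config,
    instance=DagsterInstance.get(),
)
```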
Carlos Sanoja:
Thank you very much @George Pearse!