Carlos Sanoja
09/07/2021, 3:19 PM
info 1 --> solid1 --> solid2 --> solid3
info 2 --> solid1 --> solid2 --> solid3
info 3 --> solid1 --> solid2 --> solid3
.
.
And so on. For this I am generating a dynamic output, since in principle I don't know beforehand how many subtasks I will generate (it could be 3, 7, or any number).
The problem I have is that when several subtasks are processed in parallel, the computer runs out of resources (memory) and Dagster kills the processes. This is because solid1 runs a machine learning model.
Looking at the attached image and the tests I have done, I notice that Dagster automatically starts 4 subprocesses and, as they finish, starts the next ones. In this example I have 7 inputs: it runs 4 first, then the remaining 3 once the previous ones have finished.
My question: is there any way to limit or control how many tasks are generated or run in parallel at the pipeline level? Currently I generate the dynamic outputs and use the .map() function to apply the requests to each output.
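To make the question concrete, here is a minimal sketch of the kind of setting that would cap this. It assumes the pipeline runs with Dagster's multiprocess executor and the legacy pipeline/solid config layout, where the executor accepts a `max_concurrent` option. The helper name `multiprocess_run_config` and the value 2 are illustrative, not part of Dagster.

```python
# Sketch only: builds a run_config fragment that caps how many solids
# run in parallel. Assumes the multiprocess executor and the legacy
# pipeline/solid config layout.

def multiprocess_run_config(max_concurrent: int) -> dict:
    """Return a run_config fragment limiting concurrent subprocesses."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    return {
        "execution": {
            "multiprocess": {
                "config": {"max_concurrent": max_concurrent},
            }
        }
    }


# Hypothetical value: allow at most 2 mapped subtasks to run at once.
run_config = multiprocess_run_config(2)
```

A dict like this would be passed as `run_config` when launching the pipeline. Newer job-based Dagster versions may nest these keys differently, so the layout should be checked against the installed version.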
I hope you can help me.
George Pearse
09/07/2021, 4:03 PM
Carlos Sanoja
09/07/2021, 4:11 PM