# ask-community
We have a couple of jobs that run regularly and fan out into a number of ops. Lately we've been seeing an op fail catastrophically with a SIGKILL. We've checked the memory usage of the container that is spun up, and that doesn't seem to be the issue. Is there an easy way to identify why the op is getting killed?
```
Multiprocess executor: child process for step refreshed_data_set.ingest_data_and_score[1] was terminated by signal 9 (SIGKILL). This usually indicates that the process was killed by the operating system due to running out of memory. Possible solutions include increasing the amount of memory available to the run, reducing the amount of memory used by the ops in the run, or configuring the executor to run fewer ops concurrently.
```
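Even if the container as a whole isn't hitting its limit, the kernel can still OOM-kill an individual child process, so it's worth checking `dmesg` or `journalctl -k` on the host for `oom-killer` entries to confirm where the signal is coming from. If memory pressure does turn out to be the cause, one lever the error message points at is capping how many op subprocesses the multiprocess executor runs at once. A minimal sketch, assuming a job/op layout like the one in the error (the op body and names here are placeholders, not your actual code):

```python
from dagster import job, op, multiprocess_executor

@op
def ingest_data_and_score():
    # Placeholder body -- your real ingestion/scoring logic goes here.
    ...

# Cap the number of concurrent op subprocesses so fewer run at once.
# This lowers the run worker's peak memory footprint, which helps when
# the fanned-out steps together exceed what the host can give them.
@job(executor_def=multiprocess_executor.configured({"max_concurrent": 2}))
def refreshed_data_set():
    ingest_data_and_score()
```

The same `max_concurrent` setting can also be supplied through the executor's run config instead of being baked into the job definition, which makes it easier to tune per run while you're narrowing down the cause.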