#ask-community

Sanidhya Singh

08/22/2022, 9:18 AM
Hi Team! I’m running into an issue when I execute a Job that uses a Hugging Face model. See the error message below:
Multiprocess executor: child process for step job_1[op_1] unexpectedly exited with code 3221225477
dagster._core.executor.child_process_executor.ChildProcessCrashException

Stack Trace:
  File "c:\programdata\anaconda3\lib\site-packages\dagster\_core\executor\multiprocess.py", line 210, in execute
    event_or_none = next(step_iter)
  File "c:\programdata\anaconda3\lib\site-packages\dagster\_core\executor\multiprocess.py", line 324, in execute_step_out_of_process
    for ret in execute_child_process_command(multiproc_ctx, command):
  File "c:\programdata\anaconda3\lib\site-packages\dagster\_core\executor\child_process_executor.py", line 163, in execute_child_process_command
    raise ChildProcessCrashException(exit_code=process.exitcode)
Could this be a memory issue? Does Dagster set a default max limit on the memory that a Job run can use? I’m not using ECS (I’m running locally). I’d appreciate any help!
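For readers decoding the exit code above: the stack trace paths show the job ran on Windows, where a crashed process reports an unsigned 32-bit NTSTATUS value. The sketch below converts the code to hex and looks it up in a small hand-picked table. The helper name `describe_exit_code` and the table are illustrative, not part of Dagster. An access violation is consistent with native code failing after an allocation problem, but on its own it does not prove the process ran out of memory.

```python
# A few well-known Windows NTSTATUS crash codes (a hand-picked subset,
# not an exhaustive list).
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000017: "STATUS_NO_MEMORY",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}


def describe_exit_code(exit_code: int) -> str:
    """Render a process exit code as hex plus a known NTSTATUS name.

    Hypothetical helper for illustration only.
    """
    # Mask to 32 bits so signed representations of the same code also match.
    code = exit_code & 0xFFFFFFFF
    name = KNOWN_NTSTATUS.get(code, "unknown")
    return f"0x{code:08X} ({name})"


print(describe_exit_code(3221225477))
```

The exit code from the error message maps to `0xC0000005`, the access-violation status.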

chris

08/22/2022, 8:09 PM
Looks like it could be a memory issue: https://stackoverflow.com/questions/70243762/childprocesscrashexception-in-dagster-multiprocess-execution-in-multi-container Dagster doesn't explicitly set a max limit on the memory that a job can use, and neither does Python, so your system might not be able to handle the concurrent load it's being given. Do you have several high-memory steps running concurrently?
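If concurrent high-memory steps are the cause, one mitigation is to cap how many steps the multiprocess executor runs at once. The sketch below builds a run config dict that sets `max_concurrent`. It assumes the job uses Dagster's default executor and that the config layout matches the Dagster 1.0-era docs; the helper name `multiprocess_run_config` is hypothetical. This limits parallelism only and does not set a memory limit.

```python
def multiprocess_run_config(max_concurrent: int) -> dict:
    """Build a run config that caps concurrent steps (hypothetical helper).

    Assumes the `execution -> config -> multiprocess` layout used by
    Dagster's default executor; check the docs for your Dagster version.
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    return {
        "execution": {
            "config": {
                "multiprocess": {
                    "max_concurrent": max_concurrent,
                }
            }
        }
    }


# Run one step at a time so only one model is loaded in memory at once.
run_config = multiprocess_run_config(1)
print(run_config)
```

A dict like this would typically be supplied as the job's config or as the run config when launching the job.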

Sanidhya Singh

08/23/2022, 4:14 AM
Hi @chris! It was a memory issue. I updated the Op and the error went away. However, I wonder if we could/should set memory limits similar to how Dagster supports it for ECS? https://docs.dagster.io/deployment/guides/aws

chris

08/23/2022, 9:59 PM
I don't think it's straightforward to limit memory usage in Python in general. Memory limits on the multiprocess executor aren't a feature we've seriously considered before, and I think we'd likely leave it to the deployment environment to handle that kind of thing.