
Hiroki Hayama

10/13/2022, 8:23 PM
Hi dagster folks. I’m running into an issue where I’m able to run a graph locally, but when I push to production and run it in Dagster Cloud I get a multiprocess executor error:
Multiprocess executor: child process for step get_community_updates unexpectedly exited with code -11
dagster._core.executor.child_process_executor.ChildProcessCrashException

Stack Trace:
  File "/usr/local/lib/python3.8/site-packages/dagster/_core/executor/multiprocess.py", line 214, in execute
    event_or_none = next(step_iter)
,  File "/usr/local/lib/python3.8/site-packages/dagster/_core/executor/multiprocess.py", line 330, in execute_step_out_of_process
    for ret in execute_child_process_command(multiproc_ctx, command):
,  File "/usr/local/lib/python3.8/site-packages/dagster/_core/executor/child_process_executor.py", line 163, in execute_child_process_command
    raise ChildProcessCrashException(exit_code=process.exitcode)
The get_community_updates op is pulling down fewer than 100 rows from Snowflake and trying to return a pandas DataFrame using snowflake.connector and the fetch_pandas_all() function. I’ve tried increasing the ecs/cpu tags but no luck there. Any tips/pointers would be appreciated
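For reference, a minimal sketch of what an op like get_community_updates might look like, assuming the Snowflake credentials come from environment variables (the table name, query, and variable names here are illustrative, not taken from the thread):

import os

import pandas as pd
import snowflake.connector
from dagster import op


@op
def get_community_updates() -> pd.DataFrame:
    # fetch_pandas_all() depends on pyarrow; a pyarrow build that doesn't match
    # the container image can be one source of segfaults (exit code -11).
    conn = snowflake.connector.connect(
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        warehouse=os.environ["SNOWFLAKE_WAREHOUSE"],
    )
    try:
        cur = conn.cursor()
        cur.execute("SELECT * FROM community_updates LIMIT 100")
        # Materialize the result set as a pandas DataFrame.
        return cur.fetch_pandas_all()
    finally:
        conn.close()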

yuhan

10/13/2022, 9:00 PM
hey @Hiroki Hayama, this channel is primarily monitored by the open source team. I’ve cross-posted your message to the #dagster-cloud channel

Hiroki Hayama

10/13/2022, 9:01 PM
ahh okay, in the #dagster-cloud channel.. thank you!

yuhan

10/13/2022, 10:09 PM
are you able to get the logs from your ecs tasks that provide more information about this crash?

Hiroki Hayama

10/13/2022, 10:19 PM
sorry, I’m not an eng and the person who built this out left the company a few weeks ago so trying to troubleshoot 😕

daniel

10/14/2022, 1:31 AM
Hey Hiroki - if you have access to the task in the ECS console for your cluster (the run should be tagged with something like ecs/task_arn:arn:aws:ecs:us-west-2:968703565975:task/7b377a4f1a954ac8985248c670d5fd72/e87aa33c3fa84266980d556964e3311d), the raw logs from that task in AWS CloudWatch might have more hints about why it's crashing. A segfault like that is very likely something about the code within the op - dunno if it's possible for us to get access to that code or have a way to reproduce this ourselves. I remember this coming up on a previous thread - did you ever have a chance to try this with the in-process executor?
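One way to try the in-process executor is to build the job with executor_def set to in_process_executor, so every step runs in the parent process instead of a forked child. A minimal sketch, assuming the graph is defined with @graph (the graph and job names here are hypothetical):

from dagster import graph, in_process_executor


@graph
def community_graph():
    # The op from earlier in the thread.
    get_community_updates()


# Run all steps in a single process rather than per-step child processes,
# which removes the multiprocess executor from the picture when debugging.
community_job = community_graph.to_job(executor_def=in_process_executor)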