Hello, I'm part of a team running a Dagit instance on dagster 0.12.9 with an associated daemon, and a couple of times the daemon has crashed unexpectedly with the error in the attached image. I just wanted to check whether this is an error people are familiar with, or one with a known cause. The instance runs on a Linux machine with fairly limited memory, if that's relevant.
daniel
10/26/2021, 2:24 PM
Hi Carter - I haven't seen this specific error before, what version of Python is this?
The most likely culprit here would be something related to gRPC, which dagster uses quite a bit. I see some similar-ish issues here: https://github.com/googleapis/python-pubsub/issues/414 and https://github.com/grpc/grpc/issues/24897
If it's possible to send us the results of 'pip freeze' and your Python version, we can try to reproduce on our end. Roughly how often does it happen? Do you have exact repro steps, or is it too random for that?
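If running 'pip freeze' inside the right virtualenv is awkward, a rough equivalent can be produced from within the interpreter itself. This is only a sketch: `environment_report` is a made-up helper name, and it assumes it is run with the same Python interpreter the daemon uses (3.8 or newer, for `importlib.metadata`).

```python
import platform
from importlib import metadata


def environment_report():
    """Return the Python version, the platform string, and one
    'name==version' line per installed distribution (pip-freeze style)."""
    header = [
        f"python=={platform.python_version()}",
        f"platform=={platform.platform()}",
    ]
    packages = sorted(
        {
            f"{dist.metadata['Name']}=={dist.version}"
            for dist in metadata.distributions()
            if dist.metadata["Name"]  # skip distributions with broken metadata
        },
        key=str.lower,
    )
    return header + packages


if __name__ == "__main__":
    print("\n".join(environment_report()))
```

The output only reflects the environment of the interpreter that runs it, so it should be launched the same way the daemon is.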
Separately, the daemon should be able to restart cleanly and pick up where it left off while we get this sorted out. Depending on how you're running it, there are various ways to do that (e.g. if it's in Docker, the container can be set to automatically restart on failure).
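If the daemon is not running under Docker or another process manager, a small wrapper can do the restarting. The following is a minimal sketch, not something dagster ships: `supervise`, `max_restarts`, and `delay_seconds` are hypothetical names, and it assumes the daemon is started with `dagster-daemon run` and exits with a non-zero code when it crashes.

```python
import subprocess
import time


def supervise(cmd, max_restarts=5, delay_seconds=10.0):
    """Run cmd and restart it after every non-zero exit.

    Gives up after max_restarts restarts and returns the last exit code.
    """
    restarts = 0
    while True:
        exit_code = subprocess.call(cmd)  # blocks until the process exits
        if exit_code == 0 or restarts >= max_restarts:
            return exit_code
        restarts += 1
        print(
            f"process exited with code {exit_code}; "
            f"restart {restarts}/{max_restarts} in {delay_seconds}s"
        )
        time.sleep(delay_seconds)


# Example use (assumes dagster-daemon is on PATH and DAGSTER_HOME is set):
# supervise(["dagster-daemon", "run"])
```

A real process manager (systemd, Docker restart policies, supervisord) is usually the better choice for a long-running service; the wrapper is only meant as a stopgap.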
Carter
10/26/2021, 2:37 PM
We are using Python 3.9.5 actually, which I notice is used in #414 that you linked
It doesn't happen super often - it has happened maybe once a week, and we have the daemon running 24/7 generally
I'm not sure about any exact repro steps because as you say it has been fairly random - we have only had this issue on this particular machine
We are using poetry so that impacts the pip freeze, but our pip freeze is