I recently ran into an issue with Dagster Cloud where a process that normally takes 3-10 minutes hung for 77 hours. What process should I use to figure out why this happened? Also, is there any way to add alerting around this?
01/03/2023, 6:02 PM
Hi Sean - are you able to post or DM a link to the run in cloud? We can take a look at our logs and see if there are any clues about why it would be hanging. Do you have a sense of whether it was your op code or Dagster code that was hanging?
01/03/2023, 6:04 PM
It seems to be Dagster, based on the fact that the last message was a