#deployment-kubernetes

Steven Murphy

02/12/2024, 1:51 PM
Cross-posting here: https://github.com/dagster-io/dagster/discussions/19734 Anyone had experience with monitoring K8s code locations, and when they fail to start? I had a transient issue which resulted in a code-location not deploying successfully, and only noticed it when I logged in today. Would like to be more proactive.

Andrea Giardini

02/14/2024, 11:25 AM
Are you using liveness/readiness probes in Kubernetes?

Steven Murphy

02/14/2024, 11:28 AM
Not at the moment, though it's food for thought. Do you mean setting them here within the dagster-user-deployments section of the values.yaml?
^^ That's the default one I'm looking at. Edit: 1.5.6 anyway, not pulled the latest yet.

Andrea Giardini

02/14/2024, 11:35 AM
yeah. liveness and readiness probes are useful tools in k8s to make sure a pod stays healthy and that no broken pod gets deployed
(nothing to do with dagster, pure k8s)
liveness probes would restart the user-code in case of failures
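For reference, a rough sketch of what this could look like under dagster-user-deployments in values.yaml. It assumes the chart exposes per-deployment readinessProbe, livenessProbe, and startupProbe keys and that the `dagster api grpc-health-check` command is available in the user-code image. The deployment name, image, file path, and thresholds are hypothetical placeholders, so check everything against the values.yaml shipped with your chart version.

```yaml
# Sketch only: verify key names against the values.yaml of your chart version.
dagster-user-deployments:
  enabled: true
  deployments:
    - name: "my-code-location"                  # hypothetical name
      image:
        repository: "example.com/my-user-code"  # hypothetical image
        tag: "latest"
        pullPolicy: Always
      dagsterApiGrpcArgs:
        - "--python-file"
        - "/app/definitions.py"                 # hypothetical path
      port: 3030
      # Mark the pod ready only once the gRPC server answers health checks,
      # so a broken new version does not replace a working one.
      readinessProbe:
        enabled: true
        periodSeconds: 20
        timeoutSeconds: 10
        successThreshold: 1
        failureThreshold: 15
      # Give slow-loading code time to start before liveness checks begin.
      startupProbe:
        enabled: true
        initialDelaySeconds: 1
        periodSeconds: 10
        timeoutSeconds: 10
        successThreshold: 1
        failureThreshold: 30
      # Restart the container if the gRPC server stops responding.
      livenessProbe:
        exec:
          command: ["dagster", "api", "grpc-health-check", "-p", "3030"]
        periodSeconds: 20
        timeoutSeconds: 10
        failureThreshold: 3
```

The startup probe is there so that a code location with heavy imports is not killed by the liveness probe while it is still loading. Tune the thresholds to how long your code takes to start.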

Steven Murphy

02/14/2024, 11:42 AM
Cool I'll give that a shot. As for monitoring/alerting: if there happens to be a problem that's not transient, presumably that would require some monitoring solution outwith of Dagster? I can chat with our K8s infra guys. Reason I ask is because on the Dagster UI, you get a little warning triangle next to 'Deployments' along the top, was wondering if there's an event that accompanies that.

Andrea Giardini

02/14/2024, 11:43 AM
Kubernetes should suffice. With the proper probes, Kubernetes will not roll out a new version of the user-code unless all probes are successful.

Steven Murphy

02/14/2024, 11:44 AM
Appreciate the insight, thank you

Andrea Giardini

02/14/2024, 11:44 AM
Happy to help 🙌