https://dagster.io/ logo
Title
t

Tyler Eason

02/26/2023, 9:00 AM
Good evening! short / question only: how can I debug the health check for a user code deployment? long / context: I have deployed the Dagster helm chart (1.1.14) with its default values on a GKE autopilot cluster, and the example user code deployment pod is failing its readiness check.
Readiness probe errored: command "dagster api grpc-health-check -p 3030" timed out
BUT the twist is, if I manually run the check command myself, it succeeds!
k exec -it <POD NAME> -- dagster api grpc-health-check -p 3030
gRPC connection successful
Has anyone seen similar behavior? EDIT: Turns out the health check was just taking longer than 3 seconds (the default helm value). New question: what affects the speed of the health check?
d

daniel

02/27/2023, 3:33 PM
Hi tyler - the health check does a pretty simple API call out to the server running on the user code deployment pod. You're welcome to tweak the value here if the API call is taking longer than that in your cluster for some reason: https://github.com/dagster-io/dagster/blob/master/helm/dagster/values.yaml#L182
:thankewe: 1