Arun Kumar
04/05/2022, 7:59 PM
What does the /dagit_info endpoint in dagit's liveness probe do, and why can it become slow? We have recently been seeing a lot of dagit restarts due to timeouts on this probe. We are on 0.12.12.
rex
04/05/2022, 9:40 PM
Setting this in your values.yaml should do the trick:
livenessProbe: {}
Arun Kumar
04/05/2022, 9:45 PM
After setting this in values.yaml, I still see the liveness and startup probes being set on the dagit pods. Do you know why this could happen?
rex
04/05/2022, 11:41 PM
Using null instead of {} will not work because of our helm schema.
For the startupProbe, there is an enabled flag that you can set instead. This will disable the startup probe (https://artifacthub.io/packages/helm/dagster/dagster/0.12.12?modal=values&path=dagit.startupProbe.enabled).
For the livenessProbe, it looks like we didn’t add a similar flag in this old version. So you have a couple of options here:
1. manually remove the liveness probe from your deployments using kubectl edit or something similar
2. upgrade to 0.14.0, where this is disabled by default
3. you could try to override the livenessProbe so that it’s still enabled, but make the liveness check always return success. This way, it will never fail and you should not see restarts for your dagit.
helm template dagster/dagster -g -s templates/deployment-dagit.yaml --version 0.12.12 --values ./values.yaml
dagit:
  livenessProbe:
    httpGet: ~
    exec:
      command:
        - true
  startupProbe:
    enabled: false
This renders with the startupProbe removed, and the livenessProbe should always return success. Here’s a truncated snippet of that:
...
          volumeMounts:
            - name: dagster-instance
              mountPath: "/opt/dagster/dagster_home/dagster.yaml"
              subPath: dagster.yaml
            - name: dagster-workspace-yaml
              mountPath: "/dagster-workspace/workspace.yaml"
              subPath: workspace.yaml
          ports:
            - name: http
              containerPort: 80
              protocol: TCP
          resources:
            {}
          livenessProbe:
            exec:
              command:
                - true
            failureThreshold: 3
            periodSeconds: 20
            successThreshold: 1
            timeoutSeconds: 3
      volumes:
        - name: dagster-instance
          configMap:
            name: RELEASE-NAME-dagster-instance
        - name: dagster-workspace-yaml
          configMap:
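For option 1 in the list above (removing the probe by hand), one possible alternative to an interactive kubectl edit is a JSON Patch written as YAML. This is only a sketch: the file name is hypothetical, and it assumes dagit is the first container (index 0) in the deployment's pod template.

```yaml
# remove-liveness-probe.yaml (hypothetical file name)
# A JSON Patch (RFC 6902) expressed as YAML.
# Assumption: dagit is container index 0 in the pod template.
- op: remove
  path: /spec/template/spec/containers/0/livenessProbe
```

On kubectl versions that support --patch-file, it could be applied with something like kubectl patch deployment <your-dagit-deployment> --type=json --patch-file remove-liveness-probe.yaml, where the deployment name is a placeholder. A later helm upgrade would likely render the probe back from the chart, so the patch may need to be reapplied.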
Arun Kumar
04/05/2022, 11:58 PM
helmVersion=v3 error="dry-run upgrade for comparison failed: values don't meet the specifications of the schema(s) in the following chart(s):\n
dagster:dagit.livenessProbe.exec.command.0: Invalid type. Expected: string, given: boolean\n- dagit.livenessProbe.httpGet: Invalid type. Expected: object, given: null\n" phase=dry-run-compare
rex
04/07/2022, 1:56 AM
true -> "true"
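As a sketch of what that change might look like in values.yaml: an unquoted true is parsed by YAML as a boolean, while the schema expects each command entry to be a string, so quoting it should address the first schema error. The httpGet override is deliberately left out here, since the schema rejected ~ for it and the right value is not settled in this thread; this fragment assumes the chart's default httpGet handler is dealt with separately.

```yaml
dagit:
  livenessProbe:
    exec:
      command:
        - "true"   # quoted, so YAML yields the string "true" rather than a boolean
  startupProbe:
    enabled: false
```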
Arun Kumar
04/12/2022, 7:24 PM
What about httpGet? It looks like ~ does not work. Sorry, I'm not very familiar with helm. Do you recommend some other value?