Installation set-up: I'm trying to run dagster o...
# deployment-kubernetes

Installation set-up:

I'm trying to run Dagster on Kubernetes. I use the default values.yaml file, and the only change I'm making is replacing "docker.io" with "registry.hub.docker.com", as my network doesn't allow access to the "docker.io" registry. The 4 pods are created:

NAME                                                              READY   STATUS     RESTARTS   AGE
dagster-daemon-5b77cb897f-x5hwn                                   0/1     Init:0/2   0          106s
dagster-dagster-user-deployments-k8s-example-user-code-1-54dhhk   1/1     Running    0          106s
dagster-dagster-webserver-5bd447d567-4btqc                        0/1     Init:0/2   0          106s
dagster-postgresql-0                                              1/1     Running    0          106s

As you can see, the daemon and webserver pods are stuck in the Init state, and they remain in this status indefinitely.

Error messages and logs:

There are no failure events at the bottom of the output when I run 'kubectl describe' on the daemon pod:
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  3m36s  default-scheduler  Successfully assigned dagster/dagster-daemon-5b77cb897f-btj4k to server
  Normal  Pulling    3m35s  kubelet            Pulling image "registry.hub.docker.com/library/postgres:14.6"
  Normal  Pulled     3m25s  kubelet            Successfully pulled image "registry.hub.docker.com/library/postgres:14.6" in 9.028s
  Normal  Created    3m25s  kubelet            Created container check-db-ready
  Normal  Started    3m25s  kubelet            Started container check-db-ready
But the initContainers seem to be working:
Init Containers:
  check-db-ready:
    Container ID:  cri-o://75cd07830cbcf6e7e6872e105e3eee8a267a5223353c93798bad0a69352e9b99
    Image:         registry.hub.docker.com/library/postgres:14.6
    Image ID:      f4d4e40f6b50987185528556a7adb03e8ef5b4ce6c21c4eef0030a025b46bdf4
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
      until pg_isready -h dagster-postgresql -p 5432 -U test; do echo waiting for database; sleep 2; done;
    State:          Running
      Started:      Mon, 19 Feb 2024 10:39:04 -0800
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-dg7vm (ro)
  init-user-deployment-k8s-example-user-code-1:
    Container ID:
    Image:         registry.hub.docker.com/busybox:1.28
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
      until nslookup k8s-example-user-code-1; do echo waiting for user service; sleep 2; done
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-dg7vm (ro)
Containers:
  dagster:
    Container ID:
    Image:         registry.hub.docker.com/dagster/dagster-celery-k8s:1.5.11
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/bash
      -c
      dagster-daemon run -w /dagster-workspace/workspace.yaml
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment Variables from:
      dagster-daemon-env  ConfigMap  Optional: false
    Environment:
      DAGSTER_PG_PASSWORD:                 <set to the key 'postgresql-password' in secret 'dagster-postgresql-secret'>  Optional: false
      DAGSTER_DAEMON_HEARTBEAT_TOLERANCE:  1800
    Mounts:
      /dagster-workspace/ from dagster-workspace-yaml (rw)
      /opt/dagster/dagster_home/dagster.yaml from dagster-instance (rw,path="dagster.yaml")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-dg7vm (ro)
Conditions:
  Type              Status
  Initialized       False
  Ready             False
  ContainersReady   False
  PodScheduled      True
When I look for more specific messages for each container by running:
kubectl logs dagster-daemon-5b77cb897f-x5hwn  -c check-db-ready -n dagster
I get a long list of:
dagster-postgresql:5432 - no response
waiting for database
...
Isn't this weird, given that the dagster-postgresql-0 pod shows as running? Checking the other initContainer by running:
kubectl logs dagster-daemon-5b77cb897f-x5hwn  -c init-user-deployment-k8s-example-user-code-1  -n dagster
I get this message:
Error from server (BadRequest): container "init-user-deployment-k8s-example-user-code-1" in pod "dagster-daemon-5b77cb897f-x5hwn" is waiting to start: PodInitializing
I'm new to Kubernetes and your help to shed some light on this issue is much appreciated!
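
For anyone reading along, a note on the second error: init containers run one at a time, in order, so init-user-deployment-k8s-example-user-code-1 cannot start until check-db-ready exits. Its PodInitializing state is therefore expected while the first loop keeps printing "no response". The open question is why pg_isready cannot reach the dagster-postgresql Service even though the database pod is Running. Below is a rough sketch of read-only checks that might narrow this down. The function name diagnose_db_wait is made up for illustration and is not part of Dagster or its Helm chart. The namespace, Service name, and pod name are taken from the output above. It also assumes the Postgres image ships pg_isready and that your account is allowed to run kubectl exec and read NetworkPolicies.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Dagster): runs a few read-only checks on
# why the check-db-ready init container gets "no response" from the database.
# Defaults come from the pod listing and describe output in the post.
diagnose_db_wait() {
  local ns="${1:-dagster}"
  local svc="${2:-dagster-postgresql}"
  local pod="${3:-dagster-postgresql-0}"

  # 1. Does the Service exist, and does it have endpoints? An empty
  #    ENDPOINTS column means no ready pod is backing the Service.
  kubectl get service "$svc" -n "$ns"
  kubectl get endpoints "$svc" -n "$ns"

  # 2. Is Postgres accepting connections from inside its own pod?
  #    If this succeeds, the problem is more likely on the network path.
  kubectl exec -n "$ns" "$pod" -- pg_isready -h 127.0.0.1 -p 5432

  # 3. Are there NetworkPolicies that could block pod-to-pod traffic?
  kubectl get networkpolicy -n "$ns"

  # 4. What does the database log say about startup?
  kubectl logs "$pod" -n "$ns" --tail=50
}
```

After sourcing the file, calling diagnose_db_wait with no arguments uses the names from this post; pass a namespace, Service, and pod name to override them. If step 2 succeeds while the init container still reports "no response", that would point toward something between the pods (a NetworkPolicy, the cluster network, or DNS) rather than Postgres itself, though that is only a guess from the output shown here.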