# ask-community
Hi team, I have a question on dagit init containers (dagster 0.15.8). It seems to create an init container for each user repo. If one fails, it prevents dagit from starting up properly.
until nslookup metrics-repo; do echo waiting for user service; sleep 2; done
Does it mean that if one user repo is problematic, dagit won't boot up?
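For reference, here is a minimal sketch of how the `until ...; do ...; done` loop in the command above behaves: it keeps retrying until the wrapped command exits with status 0, and only then does the shell (and so the init container) finish. `fake_lookup`, `wait_for`, and `WAIT_INTERVAL` are hypothetical names used only for illustration; `fake_lookup` stands in for `nslookup` and does not reproduce the behavior of any real nslookup build.

```shell
#!/bin/sh
# Hypothetical stand-in for nslookup: fails on the first two calls,
# then succeeds (exit status 0) on the third.
attempts=0
fake_lookup() {
  attempts=$((attempts + 1))
  [ "$attempts" -ge 3 ]
}

# Same shape as the init container command: retry until the command exits 0.
# WAIT_INTERVAL is an assumed knob for this sketch; the original command uses 2.
wait_for() {
  until "$@"; do
    echo "waiting for user service"
    sleep "${WAIT_INTERVAL:-2}"
  done
}

WAIT_INTERVAL=0
wait_for fake_lookup
echo "lookup succeeded after $attempts attempts"
```

Under this reading, a container stuck in Running means the lookup command has never exited 0, while a Completed state with exit code 0 means it exited 0 at least once, whatever it printed.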
It seems the init container for one repo got stuck in the Running state, and the rest of them are stuck in PodInitializing. This leaves the dagit instances stuck in the Init state. I have also observed in the past that this init container doesn't need to succeed: in a separate dagster deployment that worked, it never seemed to get a response but still went to the Completed state.
kubectl describe po
State:          Terminated
      Reason:       Completed
      Exit Code:    0
kubectl logs
waiting for user service
Server:		192.168.0.1
Address:	192.168.0.1:53


*** Can't find metrics-repo.svc.cluster.local: No answer
*** Can't find metrics-repo.cluster.local: No answer
*** Can't find metrics-repo.us-west-2.compute.internal: No answer
*** Can't find metrics-repo.dagster.svc.cluster.local: No answer
*** Can't find metrics-repo.svc.cluster.local: No answer
*** Can't find metrics-repo.cluster.local: No answer
*** Can't find metrics-repo.us-west-2.compute.internal: No answer
In the staging env, where we are having the problems, the logs seem to be slightly different:
waiting for user service
Server:		192.168.0.1
Address:	192.168.0.1:53

** server can't find metrics-repo.svc.cluster.local: NXDOMAIN

** server can't find metrics-repo.svc.cluster.local: NXDOMAIN

** server can't find metrics-repo.cluster.local: NXDOMAIN

Name:	metrics-repo.dagster.svc.cluster.local
Address: 10.4.251.151

** server can't find metrics-repo.us-west-2.compute.internal: NXDOMAIN


** server can't find metrics-repo.us-west-2.compute.internal: NXDOMAIN

** server can't find metrics-repo.cluster.local: NXDOMAIN