Anyone ever see a google kubernetes cluster “repai...
# deployment-kubernetes
l
Anyone ever see a google kubernetes cluster “repairing” and unavilable for > 24 hours? My runs started failing since 9:55am EST yesterday. I don’t think this is due to dagster, but hoping some GKE experts can help
1
a
kubectl get nodes
?
l
I ran
kubectl config current-context
to make sure I was in the right context
kubectl get nodes
hanging ):
a
weird... are the masters repairing or what?
kubectl version
?
l
Unable to connect to the server: dial tcp <IP>  i/o timeout
^ that was from
kubectl get nodes
kubectl version
also hanging
a
did you try to redo the
gcloud clusters get-credentials
command?
l
Copy code
Fetching cluster endpoint and auth data.
WARNING: cluster dagster-cloud-prod is not RUNNING. The kubernetes API may or may not be available. Check the cluster status for more information.
kubeconfig entry generated for dagster-cloud-prod.
a
ok well looks like something is deeply broken. what do the logs say in the console?
l
not sure what these logs are saying but I ended up just creating a new cluster, revoking the agent token for the old cluster, and starting a new agent on the new cluster I can’t delete the old cluster but will probably contact GCS support so I don’t get billed for it