I'm trying to set up an additional code server with Kubernetes - dagster #ask-ai

lorenzo tenti

02/15/2024, 11:23 AM

I'm trying to set up an additional code server with Kubernetes/Helm. The code server starts successfully and is visible from the UI, but then I get:

2024-02-15 11:16:52 +0000 - dagster.code_server - INFO - Starting Dagster code server for package sample_etl on port 3030 in process 1
2024-02-15 11:16:53 +0000 - dagster.code_server - INFO - Started Dagster code server for package sample_etl on port 3030 in process 1
2024-02-15 11:17:28 +0000 - dagster.code_server - INFO - Stopping server once all current RPC calls terminate or 60 seconds pass
2024-02-15 11:17:28 +0000 - dagster.code_server - INFO - Code server was interrupted
2024-02-15 11:17:28 +0000 - dagster.code_server - INFO - Shutting down Dagster code server for package sample_etl on port 3030 in process 1

The code server then stops and is no longer visible from the UI.

Scout

02/15/2024, 11:23 AM

Based on the documentation, when deploying with Helm on Kubernetes, the Dagster webserver communicates with user code deployments via gRPC to fetch the information needed to populate the UI. The webserver doesn't load or execute user-written code, which allows the UI to remain available even when user code contains errors. The webserver frequently checks whether the user code deployment has been updated and, if so, fetches the new information.

The logs you've provided indicate that the Dagster code server for your package sample_etl starts successfully but then receives an interrupt signal, leading to its shutdown. This behavior is not typical and suggests that something external to the code server process is causing it to shut down. Here are a few steps you can take to debug this issue:

1. Check Kubernetes Pod Events: Look at the events for the pod where the code server is running to see if there are any indications of why it might be shutting down. You can use the following command:

kubectl describe pod <pod-name>

Replace <pod-name> with the actual name of your pod.

2. Inspect Kubernetes Pod Logs: Review the logs of the pod for any error messages or warnings that could indicate a problem. You can use the following command:

kubectl logs <pod-name>

Again, replace <pod-name> with the actual name of your pod.

3. Check Resource Limits: Ensure that the pod has sufficient resources (CPU and memory) and that it's not being terminated due to resource constraints.

4. Review Helm Chart Configuration: Verify that your Helm chart configuration for the code server is correct, especially the dagster-user-deployments.deployments section in the values.yaml file. Ensure that dagsterApiGrpcArgs and other related settings are set appropriately.

5. Check Readiness and Liveness Probes: If you have readiness or liveness probes configured for your code server, ensure that they are not failing. A failing liveness probe causes Kubernetes to restart the container, while a failing readiness probe only removes the pod from Service endpoints.

6. Review Network Policies: If you have network policies in place, make sure they allow traffic to and from the code server on the specified port.

If, after checking these items, you still cannot identify the cause of the issue, you may need to look into more advanced Kubernetes debugging techniques or consider reaching out to the community for support with specific details about your deployment configuration and environment.
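The checks above can be bundled into a small shell sketch. This is only an illustration, not an official Dagster or Helm tool: the function names, the argument order, and the "default" namespace fallback are assumptions made for the example, and the pod and release names you pass in are your own.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the "default" namespace fallback are
# hypothetical; pass your own pod, release, and namespace.

# Steps 1, 2, 3 and 5: events, logs, and last termination state of one pod.
inspect_code_server_pod() {
  local pod="$1"
  local ns="${2:-default}"

  # Step 1: pod description, including recent events and probe settings.
  kubectl describe pod "$pod" -n "$ns"

  # Step 2: logs of the current container, then of the previous container
  # instance (only present if the container has already been restarted).
  kubectl logs "$pod" -n "$ns"
  kubectl logs "$pod" -n "$ns" --previous || true

  # Steps 3 and 5: container name, restart count, and the recorded reason
  # for the last termination, one container per line.
  kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.restartCount}{"\t"}{.lastState.terminated.reason}{"\n"}{end}'

  # Namespace-wide events sorted by timestamp.
  kubectl get events -n "$ns" --sort-by=.lastTimestamp
}

# Steps 4 and 6: user-supplied Helm values and network policies.
show_release_config() {
  local release="$1"
  local ns="${2:-default}"

  # Step 4: the values supplied to the Helm release, to review the
  # dagster-user-deployments section.
  helm get values "$release" -n "$ns"

  # Step 6: network policies defined in the namespace.
  kubectl get networkpolicy -n "$ns"
}
```

For example, after sourcing the file you could run inspect_code_server_pod <pod-name> <namespace> followed by show_release_config <release-name> <namespace>. The functions only read cluster state, so they are safe to run repeatedly while the pod is restarting.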
