Yevhen Samoilenko

05/31/2022, 2:23 PM
Hi! I'm having trouble adding my code to a workspace using Docker Agent. Here is the error:
Exception: Timed out waiting for server user_code_52fa4f:4000. Most recent connection error: dagster.core.errors.DagsterUserCodeUnreachableError: Could not reach user code server

Stack Trace:
  File "/dagster-cloud/dagster_cloud/workspace/user_code_launcher/user_code_launcher.py", line 687, in _wait_for_server
    server_id = sync_get_server_id(client)
  File "/dagster/dagster/api/get_server_id.py", line 15, in sync_get_server_id
    result = check.inst(api_client.get_server_id(), (str, SerializableErrorInfo))
  File "/dagster/dagster/grpc/client.py", line 152, in get_server_id
    res = self._query("GetServerId", api_pb2.Empty, timeout=timeout)
  File "/dagster/dagster/grpc/client.py", line 115, in _query
    raise DagsterUserCodeUnreachableError("Could not reach user code server") from e

The above exception was caused by the following exception:
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "DNS resolution failed for user_code_52fa4f:4000: C-ares status is not ARES_SUCCESS qtype=A name=user_code_52fa4f is_balancer=0: Could not contact DNS servers"
        debug_error_string = "{"created":"@1654005039.486441814","description":"DNS resolution failed for user_code_52fa4f:4000: C-ares status is not ARES_SUCCESS qtype=A name=user_code_52fa4f is_balancer=0: Could not contact DNS servers","file":"src/core/lib/transport/error_utils.cc","file_line":165,"grpc_status":14}"
>

Stack Trace:
  File "/dagster/dagster/grpc/client.py", line 112, in _query
    response = getattr(stub, method)(request_type(**kwargs), timeout=timeout)
  File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)

  File "/dagster-cloud/dagster_cloud/workspace/user_code_launcher/user_code_launcher.py", line 548, in _reconcile
    new_updated_endpoint = self._create_new_server_endpoint(
  File "/dagster-cloud/dagster_cloud/workspace/docker/__init__.py", line 197, in _create_new_server_endpoint
    return self._launch(
  File "/dagster-cloud/dagster_cloud/workspace/docker/__init__.py", line 286, in _launch
    server_id = self._wait_for_server(
  File "/dagster-cloud/dagster_cloud/workspace/user_code_launcher/user_code_launcher.py", line 693, in _wait_for_server
    raise Exception(
How can I debug the process and find out the reason it occurs?

daniel

05/31/2022, 2:25 PM
Hi Yevhen - sorry for the trouble here. In the logs for your agent container, does it say something like "Started container <...>" before that error? Is it possible to pull the logs from that container? ('docker logs <container ID>')
The most common reason I've seen for that error is the Dockerfile not having the 'dagster' package installed (we'd like to start automatically including the container logs in the error so that problems like this are easier to debug, though).
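
The check suggested above (find the "Started container <...>" line in the agent log, then pull that container's logs) can be sketched in Python. This is only an illustration: the helper names `started_container_ids` and `docker_logs_command` are hypothetical, and the regular expression assumes the agent prints the container ID as a hex string right after the words "Started container".

```python
import re
import shlex
from typing import List

# Assumption: the agent logs "Started container <hex id>" for each
# user code container it launches.
STARTED_RE = re.compile(r"Started container ([0-9a-f]{12,64})")


def started_container_ids(agent_log: str) -> List[str]:
    """Return every container ID the agent reports as started, in log order."""
    return STARTED_RE.findall(agent_log)


def docker_logs_command(container_id: str, tail: int = 100) -> str:
    """Build the shell command that prints the last `tail` log lines of a container."""
    return shlex.join(["docker", "logs", "--tail", str(tail), container_id])


if __name__ == "__main__":
    sample = "INFO - Removed container aaaaaaaaaaaa\nINFO - Started container 5a65691d12d4\n"
    for cid in started_container_ids(sample):
        print(docker_logs_command(cid))
```

Running the printed command on the Docker host shows whether the user code container started its code server or crashed on startup.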

Yevhen Samoilenko

05/31/2022, 2:32 PM
Here are the logs:
2022-05-31 14:28:40 +0000 - dagster_cloud - INFO - Removed container 6022b23fae53fda82c10d587d766cc3d2242995ec870be221e89a2c18268cf94
2022-05-31 14:28:40 +0000 - dagster_cloud - INFO - Starting Dagster Cloud agent...
2022-05-31 14:28:40 +0000 - dagster_cloud - INFO - Loading Dagster repositories...
2022-05-31 14:28:41 +0000 - dagster_cloud - INFO - Updating server for location user_code
2022-05-31 14:28:41 +0000 - dagster_cloud - INFO - Starting a new container for location user_code with image registry/dagster_cloud_image:latest: user_code_afb457
2022-05-31 14:28:42 +0000 - dagster_cloud - INFO - Started container 5a65691d12d44b763fb6b4be4a8edb14f9dec00185d9948e6a5769039f0f6e1c
2022-05-31 14:29:42 +0000 - dagster_cloud - ERROR - Error while updating server for user_code: Exception: Timed out waiting for server user_code_afb457:4000. Most recent connection error: dagster.core.errors.DagsterUserCodeUnreachableError: Could not reach user code server

Stack Trace:
  File "/dagster-cloud/dagster_cloud/workspace/user_code_launcher/user_code_launcher.py", line 687, in _wait_for_server
    server_id = sync_get_server_id(client)
  File "/dagster/dagster/api/get_server_id.py", line 15, in sync_get_server_id
    result = check.inst(api_client.get_server_id(), (str, SerializableErrorInfo))
  File "/dagster/dagster/grpc/client.py", line 152, in get_server_id
    res = self._query("GetServerId", api_pb2.Empty, timeout=timeout)
  File "/dagster/dagster/grpc/client.py", line 115, in _query
    raise DagsterUserCodeUnreachableError("Could not reach user code server") from e

The above exception was caused by the following exception:
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "DNS resolution failed for user_code_afb457:4000: C-ares status is not ARES_SUCCESS qtype=A name=user_code_afb457 is_balancer=0: Could not contact DNS servers"
        debug_error_string = "{"created":"@1654007382.496599775","description":"DNS resolution failed for user_code_afb457:4000: C-ares status is not ARES_SUCCESS qtype=A name=user_code_afb457 is_balancer=0: Could not contact DNS servers","file":"src/core/lib/transport/error_utils.cc","file_line":165,"grpc_status":14}"
>

Stack Trace:
  File "/dagster/dagster/grpc/client.py", line 112, in _query
    response = getattr(stub, method)(request_type(**kwargs), timeout=timeout)
  File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)


Stack Trace:
  File "/dagster-cloud/dagster_cloud/workspace/user_code_launcher/user_code_launcher.py", line 548, in _reconcile
    new_updated_endpoint = self._create_new_server_endpoint(
  File "/dagster-cloud/dagster_cloud/workspace/docker/__init__.py", line 197, in _create_new_server_endpoint
    return self._launch(
  File "/dagster-cloud/dagster_cloud/workspace/docker/__init__.py", line 286, in _launch
    server_id = self._wait_for_server(
  File "/dagster-cloud/dagster_cloud/workspace/user_code_launcher/user_code_launcher.py", line 693, in _wait_for_server
    raise Exception(

2022-05-31 14:29:45 +0000 - dagster_cloud - INFO - Started polling for requests from <https://organization.agent.dagster.cloud>
dagster is installed (this is a slightly modified version of the Dockerfile we use with open-source Dagster, and everything works fine there).
The user code container is up and running:
2022-05-31 14:46:49 +0000 - dagster.code_server - INFO - Started Dagster code server for package user_code on port 4000 in process 37
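
One detail worth pulling out of the trace: the gRPC error is a DNS resolution failure, meaning the agent could not resolve the container's name at all. Since the code server itself reports that it started, this points toward container networking and away from the image contents. A minimal sketch for spotting that case in an error message follows; `dns_failure_target` is a hypothetical helper, and the pattern is based only on the message text shown in this thread.

```python
import re
from typing import Optional, Tuple

# Pattern taken from the gRPC error text quoted in this thread:
# "DNS resolution failed for <host>:<port>: ..."
DNS_FAILURE_RE = re.compile(r"DNS resolution failed for ([\w.-]+):(\d+)")


def dns_failure_target(error_text: str) -> Optional[Tuple[str, int]]:
    """Return (host, port) if the error is a DNS resolution failure, else None."""
    match = DNS_FAILURE_RE.search(error_text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


if __name__ == "__main__":
    details = (
        "DNS resolution failed for user_code_afb457:4000: "
        "C-ares status is not ARES_SUCCESS qtype=A name=user_code_afb457"
    )
    print(dns_failure_target(details))
```

A result other than `None` suggests checking whether the agent and the user code container share a Docker network before looking at the Dockerfile.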

daniel

05/31/2022, 2:50 PM
Has this worked before with other images? Or is this the first time getting your docker agent set up?

Yevhen Samoilenko

05/31/2022, 2:50 PM
this is the first time

daniel

05/31/2022, 2:51 PM
Got it - and you followed the instructions here? https://docs.dagster.cloud/agents/docker/setup#running-the-agent
(i.e. your dagster.yaml includes that
networks:
  - dagster_cloud_agent
bit?)
:planet-daggy: 1
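
The `networks` setting matters because Docker resolves container names through its embedded DNS only between containers attached to the same user-defined network. One way to verify this is to compare the networks listed in the `docker inspect` output for the agent container and the user code container. The sketch below assumes the usual `docker inspect` JSON layout (an array with one object per container, with network names as keys under `NetworkSettings.Networks`); the function names are hypothetical.

```python
import json
from typing import Set


def container_networks(inspect_json: str) -> Set[str]:
    """Extract network names from the JSON printed by `docker inspect <container>`."""
    data = json.loads(inspect_json)
    # docker inspect prints a JSON array with one object per inspected container.
    return set(data[0]["NetworkSettings"]["Networks"].keys())


def shared_networks(agent_inspect: str, code_inspect: str) -> Set[str]:
    """Return the networks that both containers are attached to."""
    return container_networks(agent_inspect) & container_networks(code_inspect)


if __name__ == "__main__":
    agent = json.dumps([{"NetworkSettings": {"Networks": {"bridge": {}}}}])
    code = json.dumps([{"NetworkSettings": {"Networks": {"dagster_cloud_agent": {}}}}])
    # An empty set means the agent cannot resolve the code container by name.
    print(shared_networks(agent, code))
```

If the result is empty, or contains only the default `bridge` network, the agent container was probably not started on the network named in dagster.yaml.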

Yevhen Samoilenko

05/31/2022, 3:12 PM
Yeah, it was there, but I had messed up the networks a bit. Thank you!

daniel

05/31/2022, 3:12 PM
ah ok - you mean it's working now?

Yevhen Samoilenko

05/31/2022, 3:16 PM
Actually, yes! It's working! Unbelievable! Thank you so much!
:condagster: 1