# dagster-plus
q
Hi team, I got this error in Dagster Cloud. Do you happen to know why I might be getting it?
Exception: Timed out after waiting 180s for server prod-2f0.dagster:4000. Most recent connection error: dagster._core.errors.DagsterUserCodeUnreachableError: Could not reach user code server. gRPC Error code: UNAVAILABLE

Stack Trace:
  File "/dagster-cloud/dagster_cloud/workspace/user_code_launcher/user_code_launcher.py", line 1597, in _wait_for_server_process
    client.ping("")
  File "/dagster/dagster/_grpc/client.py", line 192, in ping
    res = self._query("Ping", api_pb2.PingRequest, echo=echo)  # type: ignore
  File "/dagster/dagster/_grpc/client.py", line 159, in _query
    self._raise_grpc_exception(
  File "/dagster/dagster/_grpc/client.py", line 142, in _raise_grpc_exception
    raise DagsterUserCodeUnreachableError(

The above exception was caused by the following exception:
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "failed to connect to all addresses"
    debug_error_string = "{"created":"@1679151694650","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1670","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
>

Stack Trace:
  File "/dagster/dagster/_grpc/client.py", line 157, in _query
    return self._get_response(method, request=request_type(**kwargs), timeout=timeout)
  File "/dagster/dagster/_grpc/client.py", line 132, in _get_response
    return getattr(stub, method)(request, metadata=self._metadata, timeout=timeout)
  File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
j
Is this hybrid or serverless? If hybrid, the logs of the user code server that the agent tried to spin up should reveal something.
q
Hybrid
a
I had a similar issue. In my case, the security group for the “code location” services didn’t have an ingress rule on port 4000 from the agents.
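If you want to rule out that kind of network block, a rough check is to try a plain TCP connection to the code server's port from wherever the agent runs. The sketch below is only an illustration: `can_connect` is a hypothetical helper, not part of Dagster, and the host and port in the usage comment are copied from the error message above. A successful connection only shows the port is reachable, not that the gRPC server behind it is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example usage from the agent's environment (host and port taken from the error):
# print(can_connect("prod-2f0.dagster", 4000))
```

If this returns False from the agent but True from inside the code server's own network, a missing ingress rule or similar firewall setting is a likely suspect.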