# dagster-serverless
z
I’ve been having a lot of issues with the Dagster agent over the past two days. It seems a bit off and on. Lots of `dagster._core.errors.DagsterUserCodeUnreachableError: Could not reach user code server.` gRPC errors, with either `Error Code: Unreachable` or `Error Code: Unavailable`. Is there some service issue going on? I’ll post more of the errors in the thread.
```
dagster._core.errors.DagsterUserCodeUnreachableError: Could not reach user code server. gRPC Error code: UNKNOWN

Stack Trace:
  File "/dagster-cloud/dagster_cloud/agent/dagster_cloud_agent.py", line 807, in _process_api_request
    api_result = self._handle_api_request(

  File "/dagster-cloud/dagster_cloud/agent/dagster_cloud_agent.py", line 602, in _handle_api_request
    serialized_subset_result_or_error = client.external_pipeline_subset(

  File "/dagster/dagster/_grpc/client.py", line 291, in external_pipeline_subset
    res = self._query(

  File "/dagster/dagster/_grpc/client.py", line 157, in _query
    self._raise_grpc_exception(

  File "/dagster/dagster/_grpc/client.py", line 140, in _raise_grpc_exception
    raise DagsterUserCodeUnreachableError(
```
```
dagster._core.errors.DagsterUserCodeUnreachableError: Could not reach user code server. gRPC Error code: UNAVAILABLE

Stack Trace:
  File "/dagster-cloud/dagster_cloud/agent/dagster_cloud_agent.py", line 807, in _process_api_request
    api_result = self._handle_api_request(

  File "/dagster-cloud/dagster_cloud/agent/dagster_cloud_agent.py", line 602, in _handle_api_request
    serialized_subset_result_or_error = client.external_pipeline_subset(

  File "/dagster/dagster/_grpc/client.py", line 291, in external_pipeline_subset
    res = self._query(

  File "/dagster/dagster/_grpc/client.py", line 157, in _query
    self._raise_grpc_exception(

  File "/dagster/dagster/_grpc/client.py", line 140, in _raise_grpc_exception
    raise DagsterUserCodeUnreachableError(
```
j
Hey Zach, what’s your organization ID?
z
locallogic
These seem to be happening when trying to re-execute an asset that previously failed — larger stack trace here for that:
```
dagster._core.errors.DagsterUserCodeUnreachableError: dagster._core.errors.DagsterUserCodeUnreachableError: Could not reach user code server. gRPC Error code: UNKNOWN

Stack Trace:
  File "/dagster-cloud/dagster_cloud/agent/dagster_cloud_agent.py", line 807, in _process_api_request
    api_result = self._handle_api_request(
  File "/dagster-cloud/dagster_cloud/agent/dagster_cloud_agent.py", line 602, in _handle_api_request
    serialized_subset_result_or_error = client.external_pipeline_subset(
  File "/dagster/dagster/_grpc/client.py", line 291, in external_pipeline_subset
    res = self._query(
  File "/dagster/dagster/_grpc/client.py", line 157, in _query
    self._raise_grpc_exception(
  File "/dagster/dagster/_grpc/client.py", line 140, in _raise_grpc_exception
    raise DagsterUserCodeUnreachableError(

The above exception was caused by the following exception:
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.UNKNOWN
	details = "Exception calling application: No server created with the given handle"
	debug_error_string = "{"created":"@1680804532.410026591","description":"Error received from peer ipv4:10.0.133.188:4000","file":"src/core/lib/surface/call.cc","file_line":966,"grpc_message":"Exception calling application: No server created with the given handle","grpc_status":2}"
>

Stack Trace:
  File "/dagster/dagster/_grpc/client.py", line 155, in _query
    return self._get_response(method, request=request_type(**kwargs), timeout=timeout)
  File "/dagster/dagster/_grpc/client.py", line 130, in _get_response
    return getattr(stub, method)(request, metadata=self._metadata, timeout=timeout)
  File "/usr/local/lib/python3.10/site-packages/grpc/_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/usr/local/lib/python3.10/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
```
Could also be caused by other issues, not sure.
d
Hey Zach, the second half that starts with "The above exception" in the last one you posted is helpful for diagnosing this. Do you happen to have that part for the first two that you posted?
z
The first two were from the Dagster Cloud > Deployment > Agents > Errors tab; I’m not sure how to get the entire stack trace there.
d
Are these errors on your prod deployment or branch deployments? (Or both)
z
Haven’t seen them outside of a branch deployment yet.
We had a successful run in prod on another asset earlier this morning.
Could this be fixed by 1.2.6?
```
- Fixed a GraphQL resolution error which occurred when retrieving metadata for step failures in the event log.
```
d
Unlikely, I think. We have some theories that we’re running down, though.
Any chance you could send us a link to the asset in the branch deployment that previously failed? We can use its ID to check some things in our logs.
z
I’ll DM it to you.