# ask-community
Any thoughts here on this error? Running k8s executor dagit/dagster version 1.2.7
Operation name: JobMetadataQuery

Message: Failure loading edgeshare: dagster._core.errors.DagsterUserCodeUnreachableError: Could not reach user code server

Stack Trace:
  File "/usr/local/lib/python3.7/site-packages/dagster/_core/workspace/", line 535, in _load_location
    location = self._create_location_from_origin(origin)
  File "/usr/local/lib/python3.7/site-packages/dagster/_core/workspace/", line 460, in _create_location_from_origin
    return origin.create_location()
  File "/usr/local/lib/python3.7/site-packages/dagster/_core/host_representation/", line 329, in create_location
    return GrpcServerRepositoryLocation(self)
  File "/usr/local/lib/python3.7/site-packages/dagster/_core/host_representation/", line 606, in __init__
  File "/usr/local/lib/python3.7/site-packages/dagster/_api/", line 29, in sync_get_streaming_external_repositories_data_grpc
  File "/usr/local/lib/python3.7/site-packages/dagster/_grpc/", line 336, in streaming_external_repository
  File "/usr/local/lib/python3.7/site-packages/dagster/_grpc/", line 166, in _streaming_query
    raise DagsterUserCodeUnreachableError("Could not reach user code server") from e

The above exception was caused by the following exception:
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1682543922.819565004","description":"Deadline Exceeded","file":"src/core/ext/filters/deadline/","file_line":81,"grpc_status":4}"

Stack Trace:
  File "/usr/local/lib/python3.7/site-packages/dagster/_grpc/", line 163, in _streaming_query
    method, request=request_type(**kwargs), timeout=timeout
  File "/usr/local/lib/python3.7/site-packages/dagster/_grpc/", line 152, in _get_streaming_response
    yield from getattr(stub, method)(request, metadata=self._metadata, timeout=timeout)
  File "/usr/local/lib/python3.7/site-packages/grpc/", line 426, in __next__
    return self._next()
  File "/usr/local/lib/python3.7/site-packages/grpc/", line 826, in _next
    raise self

Path: ["assetNodes"]

Locations: [{"line":10,"column":3}]
fwiw: a fresh resolves this but then it keeps on happening.
DEADLINE_EXCEEDED means it took longer than 60 seconds for the dagit webserver to fetch the workspace snapshot (a representation of the definitions) from the code server via gRPC. Do you have a very large workspace in one code location? Many, many jobs/ops/assets? Otherwise it's possible limited resources are slowing things down. You can set the env var DAGSTER_GRPC_TIMEOUT_SECONDS to change the timeout.
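As an illustration only (the variable name DAGSTER_GRPC_TIMEOUT_SECONDS is confirmed later in this thread, but 180 is an arbitrary value, and in a k8s deployment you would normally set this on the webserver container's env rather than in Python):

```python
import os

# Hypothetical sketch: Dagster reads DAGSTER_GRPC_TIMEOUT_SECONDS at process
# start, so the webserver process must see it before launching. The default
# gRPC timeout is 60 seconds; raise it if snapshot loads are legitimately slow.
os.environ["DAGSTER_GRPC_TIMEOUT_SECONDS"] = "180"

print(os.environ["DAGSTER_GRPC_TIMEOUT_SECONDS"])
```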
Okay, got it. I have about 150 jobs and 10 sensors running. I also make calls to the server via a GraphQL query to refresh the repo, yet I see this frequently:
Dagster Reload Response: {'data': {'reloadRepositoryLocation': {'__typename': 'WorkspaceLocationEntry', 'name': 'edgeshare', 'locationOrLoadError': {'__typename': 'PythonError', 'message': 'dagster._core.errors.DagsterUserCodeUnreachableError: Could not reach user code server
would this be the same thing going on?
You may need to fetch more of the error object to see the chained exception and what the gRPC status code is, but I'd speculate there's a good chance it's the same thing.
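Something like this sketch could fetch the full error object, including the chained cause where the gRPC status code would appear. The endpoint URL is a placeholder, and the `stack`/`cause` selections are assumed from Dagster's GraphQL `PythonError` type; the rest of the query shape matches the response pasted above:

```python
import json
import urllib.request

DAGIT_URL = "http://localhost:3000/graphql"  # placeholder address

# Query shape follows the reload response shown earlier in this thread;
# "cause" is the chained exception (e.g. the DEADLINE_EXCEEDED rendezvous).
QUERY = """
mutation ReloadLocation($name: String!) {
  reloadRepositoryLocation(repositoryLocationName: $name) {
    __typename
    ... on WorkspaceLocationEntry {
      name
      locationOrLoadError {
        __typename
        ... on PythonError {
          message
          stack
          cause { message stack }
        }
      }
    }
  }
}
"""


def reload_location(name: str) -> dict:
    """POST the reload mutation and return the parsed JSON response."""
    body = json.dumps({"query": QUERY, "variables": {"name": name}}).encode()
    req = urllib.request.Request(
        DAGIT_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Example usage (requires a reachable dagit instance):
# print(json.dumps(reload_location("edgeshare"), indent=2))
```

Printing the `cause` chain rather than just `message` is what would surface a status like DEADLINE_EXCEEDED in the reload response.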
If you have the right settings, you can use a profiler to see what's taking the user code server so long.
Okay, yeah, I'm only looking at response["errors"]. I'll expose the whole body.
How many ops/assets are in the 150 jobs? Is there any very large metadata attached to them?
There are some performance improvements in 1.3.2, coming out today/tomorrow, that may help.
Eh, ~500 ops, no metadata, and only a description.
I'm going to push a build and look at more of the response body.
Increasing DAGSTER_GRPC_TIMEOUT_SECONDS may have helped, but I'm still not 100% sure yet.