# dagster-plus
a
I'm attempting to do a backfill, which was working fine for a bit until I started hitting the error in the thread, which I see whenever I try to resume the backfill or start a new one.
```
dagster._core.errors.DagsterUserCodeUnreachableError: dagster._core.errors.DagsterUserCodeUnreachableError: User code server request timed out due to taking longer than 60 seconds to complete.

Stack Trace:
  File "/dagster-cloud/dagster_cloud/agent/dagster_cloud_agent.py", line 814, in _process_api_request
    api_result = self._handle_api_request(
  File "/dagster-cloud/dagster_cloud/agent/dagster_cloud_agent.py", line 639, in _handle_api_request
    client.external_partition_set_execution_params(
  File "/dagster/dagster/_grpc/client.py", line 276, in external_partition_set_execution_params
    chunks = list(
  File "/dagster/dagster/_grpc/client.py", line 186, in _streaming_query
    self._raise_grpc_exception(
  File "/dagster/dagster/_grpc/client.py", line 137, in _raise_grpc_exception
    raise DagsterUserCodeUnreachableError(

The above exception was caused by the following exception:
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1682024204.371008147","description":"Deadline Exceeded","file":"src/core/ext/filters/deadline/deadline_filter.cc","file_line":81,"grpc_status":4}"
>

Stack Trace:
  File "/dagster/dagster/_grpc/client.py", line 182, in _streaming_query
    yield from self._get_streaming_response(
  File "/dagster/dagster/_grpc/client.py", line 171, in _get_streaming_response
    yield from getattr(stub, method)(request, metadata=self._metadata, timeout=timeout)
  File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 426, in __next__
    return self._next()
  File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 826, in _next
    raise self

  File "/dagster/dagster/_daemon/backfill.py", line 36, in execute_backfill_iteration
    yield from execute_job_backfill_iteration(
  File "/dagster/dagster/_core/execution/job_backfill.py", line 74, in execute_job_backfill_iteration
    for _run_id in submit_backfill_runs(
  File "/dagster/dagster/_core/execution/job_backfill.py", line 197, in submit_backfill_runs
    result = code_location.get_external_partition_set_execution_param_data(
  File "/dagster-cloud-backend/dagster_cloud_backend/user_code/workspace.py", line 605, in get_external_partition_set_execution_param_data
    result = self.api_call(
  File "/dagster-cloud-backend/dagster_cloud_backend/user_code/workspace.py", line 382, in api_call
    return dagster_cloud_api_call(
  File "/dagster-cloud-backend/dagster_cloud_backend/user_code/workspace.py", line 131, in dagster_cloud_api_call
    for result in gen_dagster_cloud_api_call(
  File "/dagster-cloud-backend/dagster_cloud_backend/user_code/workspace.py", line 280, in gen_dagster_cloud_api_call
    raise DagsterUserCodeUnreachableError(error_infos[0].to_string())
```
j
Can you check the logs of your grpc server and describe its pods? The server is usually unreachable if it's having trouble initializing or if it's exhausting its allotted resources and frequently getting killed by k8s.
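For reference, here is a rough sketch of how to do that inspection with `kubectl`. The `dagster-cloud` namespace and the `POD_NAME` placeholder are assumptions; substitute the namespace your agent runs in and an actual pod name from the listing.

```shell
# List the agent and code-server pods. The namespace below is an
# assumption -- substitute the one your deployment actually uses.
kubectl get pods -n dagster-cloud

# Describe a code-server pod and look for restarts or OOMKilled
# terminations, which would explain the server being unreachable.
# POD_NAME is a placeholder for a pod from the listing above.
kubectl describe pod POD_NAME -n dagster-cloud | grep -E 'Restart Count|OOMKilled|Reason'

# If the pod has restarted, tail the logs of the previous container
# instance to see why it died.
kubectl logs POD_NAME -n dagster-cloud --previous --tail=100
```

If `describe` shows a nonzero restart count with reason `OOMKilled`, raising the container's memory limits is usually the next step.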
a
I was originally seeing a graphql max retries exceeded error, but it seems to be working now.