Ahmed Slaoui

11/07/2022, 2:12 PM
Hi guys! I'm stuck with a DagsterUserCodeUnreachableError: Could not reach user code server with a "failed to connect to all addresses" whenever I run backfills that generate a lot of job runs. I have a daily partitioned job with 13 independent assets, and I'm attempting to backfill 1-2 years of daily runs (a few hundred partitions). The asset materializations take a couple of minutes to execute. A few minutes after launching the backfill, the Backfill status shows Failed with the following error in the thread. Only the runs that managed to get queued before the backfill failed get executed (maybe 10% of the partitions). The error points to a timeout issue, so I tried increasing local_startup_timeout to 600 seconds, with no effect. Note: we're running Dagster as a service locally with Postgres storage, but the issue was also present with the default SQLite storage. Any ideas?
Error message when clicking on Backfill status: Failed
dagster._core.errors.DagsterUserCodeUnreachableError: Could not reach user code server
  File "E:\code\tasks-runner\venv\lib\site-packages\dagster\_daemon\backfill.py", line 95, in execute_backfill_iteration
    for _run_id in submit_backfill_runs(
  File "E:\code\tasks-runner\venv\lib\site-packages\dagster\_core\execution\backfill.py", line 196, in submit_backfill_runs
    pipeline_run = create_backfill_run(
  File "E:\code\tasks-runner\venv\lib\site-packages\dagster\_core\execution\backfill.py", line 289, in create_backfill_run
    external_execution_plan = repo_location.get_external_execution_plan(
  File "E:\code\tasks-runner\venv\lib\site-packages\dagster\_core\host_representation\repository_location.py", line 706, in get_external_execution_plan
    execution_plan_snapshot_or_error = sync_get_external_execution_plan_grpc(
  File "E:\code\tasks-runner\venv\lib\site-packages\dagster\_api\snapshot_execution_plan.py", line 46, in sync_get_external_execution_plan_grpc
    api_client.execution_plan_snapshot(
  File "E:\code\tasks-runner\venv\lib\site-packages\dagster\_grpc\client.py", line 159, in execution_plan_snapshot
    res = self._query(
  File "E:\code\tasks-runner\venv\lib\site-packages\dagster\_grpc\client.py", line 115, in _query
    raise DagsterUserCodeUnreachableError("Could not reach user code server") from e
The above exception was caused by the following exception:
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "failed to connect to all addresses"
debug_error_string = "{"created":"@1667821417.237000000","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3261,"referenced_errors":[{"created":"@1667821417.237000000","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
>
  File "E:\code\tasks-runner\venv\lib\site-packages\dagster\_grpc\client.py", line 112, in _query
    response = getattr(stub, method)(request_type(**kwargs), timeout=timeout)
  File "E:\code\tasks-runner\venv\lib\site-packages\grpc\_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "E:\code\tasks-runner\venv\lib\site-packages\grpc\_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
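One possible workaround (untested here and not confirmed anywhere in this thread) is to launch several smaller backfills instead of a single one covering every partition, so that each backfill has fewer runs to submit. The sketch below only computes the batch boundaries for a daily partitioned job. The helper names, the date range, and the batch size of 30 are hypothetical, and the resulting ranges would still have to be launched by hand or by whatever tooling you already use.

```python
from datetime import date, timedelta
from typing import Iterator, List, Tuple


def daily_partition_keys(start: date, end: date) -> List[str]:
    """Return daily partition keys (YYYY-MM-DD) from start to end, inclusive."""
    days = (end - start).days
    return [(start + timedelta(days=i)).isoformat() for i in range(days + 1)]


def batches(keys: List[str], size: int) -> Iterator[Tuple[str, str]]:
    """Yield (first_key, last_key) for consecutive batches of at most `size` keys."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for i in range(0, len(keys), size):
        chunk = keys[i:i + size]
        yield chunk[0], chunk[-1]


if __name__ == "__main__":
    # Hypothetical range; replace with the partitions you actually need.
    keys = daily_partition_keys(date(2021, 1, 1), date(2022, 10, 31))
    for first, last in batches(keys, size=30):
        print(f"backfill range: {first} .. {last}")
```

Each printed range can then be submitted as its own backfill, waiting for one to finish queuing before starting the next.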

Alexis Manuel

11/29/2022, 1:38 PM
@Ahmed Slaoui Did you manage to solve this? I am encountering the same issue.

Ahmed Slaoui

12/01/2022, 5:30 PM
Nope, still encountering the same issue, even after we moved to PostgreSQL storage 😕

Arthur

01/08/2023, 2:17 AM
Got the same message. It's even worse in 1.1.9, since you can't resume a failed backfill.
Apparently there's a gRPC timeout that can be set, though.
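The thread does not say which setting is meant. As far as I know, Dagster's gRPC client reads its default timeout from the DAGSTER_GRPC_TIMEOUT_SECONDS environment variable, but treat that name as an assumption and check it against your installed version. The sketch below builds an environment with that variable set and starts the daemon with it; the helper names and the 300-second value are hypothetical. Note that the error above is StatusCode.UNAVAILABLE, a connection failure, so a longer timeout may not change anything.

```python
import os
import subprocess
from typing import Dict, Mapping, Optional

# Assumption: this is the variable name read by the installed Dagster version.
GRPC_TIMEOUT_ENV_VAR = "DAGSTER_GRPC_TIMEOUT_SECONDS"


def env_with_grpc_timeout(
    seconds: int, base: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Return a copy of the environment with the gRPC timeout variable set."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    env = dict(os.environ if base is None else base)
    env[GRPC_TIMEOUT_ENV_VAR] = str(seconds)
    return env


def launch_daemon(seconds: int) -> None:
    """Run `dagster-daemon run` in the foreground with the timeout applied."""
    subprocess.run(
        ["dagster-daemon", "run"],
        env=env_with_grpc_timeout(seconds),
        check=True,
    )
```

Calling `launch_daemon(300)` starts the daemon with the variable set. If Dagster runs as a service, the variable would instead need to be set in the service's own environment.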

Rafael Gomes

01/10/2023, 7:29 PM
Any updates on this?

Arthur

01/10/2023, 7:40 PM
Nope. But if you haven't updated to the latest version, make sure you keep it that way, otherwise it gets worse on backfills since you can't resume.

Rafael Gomes

01/10/2023, 7:45 PM
Thanks for the heads up @Arthur. I'm still running on 1.1.3 and was planning to upgrade to the latest version.

Arthur

01/10/2023, 7:45 PM
I'd go to 1.1.7 at most if you're looking for new features.