Hi All, I am getting the following error with GRPC...
# ask-community
v
Hi All, I am getting the following error with GRPC when trying to start dagster daemon and dagit. Anyone faced this before? Or any known fixes/workarounds?
C:\Users\<username>\.conda\envs\de2\lib\site-packages\dagster\core\workspace\context.py:558: UserWarning: Error loading repository location repository_local.py:Exception: Timed out waiting for gRPC server to start with arguments: "C:\Users\<username>\.conda\envs\de2\python.exe -m dagster api grpc --lazy-load-user-code --port 62852 --heartbeat --heartbeat-timeout 45 --fixed-server-id b5a3bd23-2d42-42b1-bc52-55ecab363695 --log-level WARNING --use-python-environment-entry-point -f kh_dagster/repository_local.py -d D:\Projects\DE2". Most recent connection error: dagster.core.errors.DagsterUserCodeUnreachableError: Could not reach user code server

Stack Trace:
  File "C:\Users\<username>\.conda\envs\de2\lib\site-packages\dagster\grpc\server.py", line 936, in wait_for_grpc_server
    client.ping("")
  File "C:\Users\<username>\.conda\envs\de2\lib\site-packages\dagster\grpc\client.py", line 123, in ping
    res = self._query("Ping", api_pb2.PingRequest, echo=echo)
  File "C:\Users\<username>\.conda\envs\de2\lib\site-packages\dagster\grpc\client.py", line 110, in _query
    raise DagsterUserCodeUnreachableError("Could not reach user code server") from e

The above exception was caused by the following exception:
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses"
        debug_error_string = "{"created":"@1647868799.976000000","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3129,"referenced_errors":[{"created":"@1647868799.976000000","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}"
>

Stack Trace:
  File "C:\Users\<username>\.conda\envs\de2\lib\site-packages\dagster\grpc\client.py", line 107, in _query
    response = getattr(stub, method)(request_type(**kwargs), timeout=timeout)
  File "C:\Users\<username>\.conda\envs\de2\lib\site-packages\grpc\_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "C:\Users\<username>\.conda\envs\de2\lib\site-packages\grpc\_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)


Stack Trace:
  File "C:\Users\<username>\.conda\envs\de2\lib\site-packages\dagster\core\host_representation\grpc_server_registry.py", line 207, in _get_grpc_endpoint
    server_process = GrpcServerProcess(
  File "C:\Users\<username>\.conda\envs\de2\lib\site-packages\dagster\grpc\server.py", line 1082, in __init__
    self.server_process, self.port = open_server_process_on_dynamic_port(
  File "C:\Users\<username>\.conda\envs\de2\lib\site-packages\dagster\grpc\server.py", line 1030, in open_server_process_on_dynamic_port
    server_process = open_server_process(
  File "C:\Users\<username>\.conda\envs\de2\lib\site-packages\dagster\grpc\server.py", line 1007, in open_server_process
    wait_for_grpc_server(server_process, client, subprocess_args, timeout=startup_timeout)
  File "C:\Users\<username>\.conda\envs\de2\lib\site-packages\dagster\grpc\server.py", line 942, in wait_for_grpc_server
    raise Exception(

  warnings.warn(
C:\Users\<username>\.conda\envs\de2\lib\site-packages\dagster\core\execution\compute_logs.py:42: UserWarning: WARNING: Compute log capture is disabled for the current environment. Set the environment variable `PYTHONLEGACYWINDOWSSTDIO` to enable.

  warnings.warn(WIN_PY36_COMPUTE_LOG_DISABLED_MSG)
2022-03-21 18:50:00 +0530 - dagit - INFO - Serving dagit on <http://127.0.0.1:3000> in process 25660
d
Hi Vignesh - if you run that command yourself that dagster tried to run, do you get any errors?
C:\Users\<username>\.conda\envs\de2\python.exe -m dagster api grpc --lazy-load-user-code --port 62852 --heartbeat --heartbeat-timeout 45 --fixed-server-id b5a3bd23-2d42-42b1-bc52-55ecab363695 --log-level WARNING --use-python-environment-entry-point -f kh_dagster/repository_local.py -d D:\Projects\DE2
v
This just loads the plugins and then the process closes. The daemon stops after some time.
d
The intended behavior of that command is that it spins up a long-running process that dagster can use to load your code. When you say it 'closes the process', that sounds unexpected. Could you say more about that?
Is it possible to paste or DM the raw output of the process when you run that command?
Sometimes we see this error when the machine is so overloaded that it takes more than a minute to spin up a new server. I've also seen it in environments where the networking setup was so restricted that servers weren't allowed to run even on localhost.
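To rule out the restricted-networking case, a quick check along these lines might help. This is only a minimal sketch using the Python standard library, not anything dagster provides: the helper name `can_serve_on_localhost` is made up for illustration, and it only tests plain TCP on 127.0.0.1, which is an assumption about what a local gRPC server needs.

```python
import socket


def can_serve_on_localhost(host="127.0.0.1", timeout=5.0):
    """Return True if a TCP server can listen on localhost and accept a connection.

    Hypothetical helper for illustration; it is not part of dagster.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
            # Port 0 asks the OS for any free port, similar in spirit to
            # a server being started on a dynamically chosen port.
            server.bind((host, 0))
            server.listen(1)
            port = server.getsockname()[1]
            # The connection completes against the listen backlog,
            # so no explicit accept() is needed for this check.
            with socket.create_connection((host, port), timeout=timeout):
                return True
    except OSError:
        # Covers refused connections, timeouts, and bind failures.
        return False


if __name__ == "__main__":
    print("localhost loopback OK:", can_serve_on_localhost())
```

If this reports a failure, the problem is more likely in the machine's firewall or network policy than in the repository code.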
v
Hi @daniel, please find attached the output when I run the command. Control is returned to the prompt after the last line; the process doesn't keep running.
d
One thing that jumps out from that output is the long gap between 141723 and 141834 (more than a minute), which is much longer than it typically takes to load your jobs and would account for the timeout. I also see the "Azure Active Directory" error, which looks suspicious. Are there any clues in the code that might explain why it's taking so long to start up? Any idea what it might be doing between 141723 and 141834? Typically, loading your modules wouldn't take more than a few seconds.
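One way to see how long the repository file itself takes to load, outside of dagster, is to time the import directly. The following is a rough sketch with the standard library only; `time_file_import` is a hypothetical helper name, and it assumes the file can be executed as a standalone module from the current working directory (which may not hold if it relies on relative imports).

```python
import importlib.util
import sys
import time


def time_file_import(path, module_name="timed_module"):
    """Execute the Python file at `path` as a module and return the seconds taken.

    Hypothetical helper for illustration; it is not part of dagster.
    """
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    start = time.perf_counter()
    # Runs all top-level code in the file, including any slow
    # network or authentication calls made at import time.
    spec.loader.exec_module(module)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Pass the file to time as the first argument,
    # e.g. kh_dagster/repository_local.py
    elapsed = time_file_import(sys.argv[1])
    print(f"Loading {sys.argv[1]} took {elapsed:.2f} s")
```

If the number is anywhere near a minute, something that runs at import time (for example, a remote login or a large data load) is the likely cause of the startup timeout.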
v
Sorry for not mentioning it: the Azure Active Directory error keeps repeating between those times. I ran into a file size limit, so I trimmed those lines out.
d
The expected behavior is that it starts up a server and sits, with a final line that looks something like this:
2022-03-24 09:17:36 -0500 - dagster.code_server - INFO - Started Dagster code server for module dagster_test.toys.repo on port 4000 in process 2048
I suspect that if you tried it with a simple hello world repo, that's what you'd see, and that something specific about your job code is causing the code loading to be very slow and/or crash.
v
OK, I will try with a sample project.
d
In particular, the Azure Active Directory error is very likely to be something specific to your code/files rather than something in dagster.
v
Hi @daniel, I have fixed the Azure AD error and it seems to be working now for the most part. It still crashes occasionally, but when I reload the repo from workspaces, it starts working again. Thanks 🙂