#ask-community

Abhinav Dhulipala

07/11/2023, 6:25 PM
Hi guys, we have Dagster deployed as a systemd service. Currently our sensors are running into the following problem:
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "failed to connect to all addresses; last error: UNKNOWN: unix:/tmp/tmp0m3vq4n8: No such file or directory"
	debug_error_string = "UNKNOWN:failed to connect to all addresses; last error: UNKNOWN: unix:/tmp/tmp0m3vq4n8: No such file or directory {created_time:"2023-07-11T11:19:44.573357058-07:00", grpc_status:14}"
Here is our sensor config:
run_launcher:
  module: dagster
  class: DefaultRunLauncher
  config: {}
sensors:
  use_threads: true
  num_workers: 4
Does this mean that we are running out of temporary storage during sensor evaluation? Should I soft-link the tmp dir, or do something even more drastic? From the looks of it, our /tmp storage isn't close to full, so I'm not even sure whether it's tmp storage or something else. We are on the latest Dagster btw (1.3.13).
$ df -h /tmp
Filesystem           Size  Used Avail Use% Mounted on
<root fs>            250G   26G  213G  11% /

alex

07/11/2023, 7:53 PM
Do you have more of the stack trace? Are you just using dagster dev for what systemd is launching, or something else?

Abhinav Dhulipala

07/12/2023, 5:57 PM
I believe I'm using the recommended setup, with dagit and a dagster daemon running independently. In terms of a stack trace, here it is:
dagster._core.errors.DagsterUserCodeUnreachableError: Could not reach user code server. gRPC Error code: UNAVAILABLE

  File "/etc/home/dagster/dagster/.venv/lib/python3.10/site-packages/dagster/_daemon/sensor.py", line 536, in _process_tick_generator
    yield from _evaluate_sensor(
  File "/etc/home/dagster/dagster/.venv/lib/python3.10/site-packages/dagster/_daemon/sensor.py", line 841, in _evaluate_sensor
    for run_request_result in gen_run_request_results:
  File "/etc/home/dagster/dagster/.venv/lib/python3.10/site-packages/dagster/_daemon/sensor.py", line 826, in <lambda>
    submit_run_request = lambda run_request: _submit_run_request(
  File "/etc/home/dagster/dagster/.venv/lib/python3.10/site-packages/dagster/_daemon/sensor.py", line 612, in _submit_run_request
    run = _get_or_create_sensor_run(
  File "/etc/home/dagster/dagster/.venv/lib/python3.10/site-packages/dagster/_daemon/sensor.py", line 958, in _get_or_create_sensor_run
    return _create_sensor_run(
  File "/etc/home/dagster/dagster/.venv/lib/python3.10/site-packages/dagster/_daemon/sensor.py", line 973, in _create_sensor_run
    external_execution_plan = code_location.get_external_execution_plan(
  File "/etc/home/dagster/dagster/.venv/lib/python3.10/site-packages/dagster/_core/host_representation/code_location.py", line 735, in get_external_execution_plan
    execution_plan_snapshot_or_error = sync_get_external_execution_plan_grpc(
  File "/etc/home/dagster/dagster/.venv/lib/python3.10/site-packages/dagster/_api/snapshot_execution_plan.py", line 47, in sync_get_external_execution_plan_grpc
    api_client.execution_plan_snapshot(
  File "/etc/home/dagster/dagster/.venv/lib/python3.10/site-packages/dagster/_grpc/client.py", line 221, in execution_plan_snapshot
    res = self._query(
  File "/etc/home/dagster/dagster/.venv/lib/python3.10/site-packages/dagster/_grpc/client.py", line 157, in _query
    self._raise_grpc_exception(
  File "/etc/home/dagster/dagster/.venv/lib/python3.10/site-packages/dagster/_grpc/client.py", line 140, in _raise_grpc_exception
    raise DagsterUserCodeUnreachableError(

The above exception was caused by the following exception:
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "failed to connect to all addresses; last error: UNKNOWN: unix:/tmp/tmp41yew5k4: No such file or directory"
	debug_error_string = "UNKNOWN:failed to connect to all addresses; last error: UNKNOWN: unix:/tmp/tmp41yew5k4: No such file or directory {created_time:"2023-07-12T10:55:46.411615564-07:00", grpc_status:14}"
>

  File "/etc/home/dagster/dagster/.venv/lib/python3.10/site-packages/dagster/_grpc/client.py", line 155, in _query
    return self._get_response(method, request=request_type(**kwargs), timeout=timeout)
  File "/etc/home/dagster/dagster/.venv/lib/python3.10/site-packages/dagster/_grpc/client.py", line 130, in _get_response
    return getattr(stub, method)(request, metadata=self._metadata, timeout=timeout)
  File "/etc/home/dagster/dagster/.venv/lib/python3.10/site-packages/grpc/_channel.py", line 1030, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/etc/home/dagster/dagster/.venv/lib/python3.10/site-packages/grpc/_channel.py", line 910, in _end_unary_response_blocking
    raise _InactiveRpcError(state)  # pytype: disable=not-instantiable
/cc @Bryce Arden

alex

07/12/2023, 6:37 PM
OK, I believe I have a work-in-progress PR that will fix this: https://github.com/dagster-io/dagster/pull/14910
The cause is that the daemon is updating its code servers to pick up new changes, and the reference that the threads hold has gone stale, causing this error (the /tmp path referenced is the communication channel for a previous server).
One workaround you could employ would be to manually run the code servers as part of your systemd setup and have dagit and the daemon reference those servers.
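A minimal sketch of that workaround, under stated assumptions: the file path, port, and location name below are hypothetical placeholders and do not come from this thread. The idea is that a separate systemd unit keeps one long-lived code server running (command shown in the comment), and the workspace.yaml read by both dagit and the daemon points at that server instead of letting each process spawn its own on a temporary unix socket.

```yaml
# workspace.yaml (shared by dagit and the dagster daemon)
#
# Assumed to be started by its own systemd unit, for example:
#   dagster api grpc --python-file /path/to/definitions.py --host 127.0.0.1 --port 4266
# The path, host, and port are illustrative; adjust them to your deployment.
load_from:
  - grpc_server:
      host: 127.0.0.1          # must match the --host passed to the code server
      port: 4266               # must match the --port passed to the code server
      location_name: "my_code_location"  # hypothetical name shown in the UI
```

With this layout the daemon connects over a fixed TCP address, so there is no per-process /tmp socket path to go stale when code servers are refreshed.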

Abhinav Dhulipala

07/12/2023, 8:43 PM
This is awesome thank you!

Chang Hai Bin

09/25/2023, 6:50 AM
@alex Thanks for the PR change. Was this released for Dagster 1.3.14? Or 1.4.0? We are seeing some similar issues with Dagster accessing the "tmp" files

alex

09/25/2023, 2:10 PM
1.4.0