
George Pearse

01/21/2022, 8:45 AM
I'm receiving this error:
dagster.core.errors.DagsterInvariantViolationError: research_and_development not found at module scope in file repo.py.
It only happens when my pipeline is triggered by a schedule; it runs fine otherwise. I have absolutely no idea what code change has caused this. It's been working smoothly for months (I am still assuming I've broken something with a code change, of course).
Works fine when executed with
dagit -p 4000 -f repo.py -h 0.0.0.0
But fails when scheduled within my docker-compose set-up
Works when executed via the UI
Can I run dagit from inside the container for testing?
Trying with a pin to 0.13.13, as discussed in another thread.
I keep getting version-skew warnings, but my version is definitely pinned across all the dagster dependencies:
2022-01-21 13:30:16 - SchedulerDaemon - ERROR - Scheduler caught an error for schedule cxr_lake_schedule : dagster.serdes.errors.DeserializationError: Attempted to deserialize class "TickData" which is not in the whitelist. This error can occur due to version skew, verify processes are running expected versions.
docker_dev_dagster_daemon       | Descent path: <root:dict>
docker_dev_dagster_daemon       | 
docker_dev_dagster_daemon       | Stack Trace:
docker_dev_dagster_daemon       |   File "/usr/local/lib/python3.7/site-packages/dagster/scheduler/scheduler.py", line 96, in launch_scheduled_runs
docker_dev_dagster_daemon       |     (debug_crash_flags.get(schedule_state.job_name) if debug_crash_flags else None),
docker_dev_dagster_daemon       |   File "/usr/local/lib/python3.7/site-packages/dagster/scheduler/scheduler.py", line 121, in launch_scheduled_runs_for_schedule
docker_dev_dagster_daemon       |     latest_tick = instance.get_latest_job_tick(schedule_state.job_origin_id)
docker_dev_dagster_daemon       |   File "/usr/local/lib/python3.7/site-packages/dagster/core/instance/__init__.py", line 1579, in get_latest_job_tick
docker_dev_dagster_daemon       |     return self._schedule_storage.get_latest_job_tick(job_origin_id)
docker_dev_dagster_daemon       |   File "/usr/local/lib/python3.7/site-packages/dagster/core/storage/schedules/sql_schedule_storage.py", line 136, in get_latest_job_tick
docker_dev_dagster_daemon       |     return JobTick(rows[0][0], deserialize_json_to_dagster_namedtuple(rows[0][1]))
docker_dev_dagster_daemon       |   File "/usr/local/lib/python3.7/site-packages/dagster/serdes/serdes.py", line 339, in deserialize_json_to_dagster_namedtuple
docker_dev_dagster_daemon       |     check.str_param(json_str, "json_str"), whitelist_map=_WHITELIST_MAP
docker_dev_dagster_daemon       |   File "/usr/local/lib/python3.7/site-packages/dagster/serdes/serdes.py", line 362, in _deserialize_json
docker_dev_dagster_daemon       |     return unpack_inner_value(value, whitelist_map=whitelist_map, descent_path=_root(value))
docker_dev_dagster_daemon       |   File "/usr/local/lib/python3.7/site-packages/dagster/serdes/serdes.py", line 393, in unpack_inner_value
docker_dev_dagster_daemon       |     f'Attempted to deserialize class "{klass_name}" which is not in the whitelist. '
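One way to check for version skew like this is to print the installed version of each dagster package from inside every container and compare the results. The following is a minimal sketch, not an official dagster tool: the package list and the expected pin are assumptions for illustration, and `installed_versions` and `find_skew` are hypothetical helper names.

```python
try:
    from importlib import metadata  # Python 3.8+
except ImportError:  # Python 3.7, as in the containers in the traces above
    import importlib_metadata as metadata  # third-party backport, must be installed


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def find_skew(versions, expected):
    """Return only the packages whose installed version differs from `expected`."""
    return {name: found for name, found in versions.items() if found != expected}


if __name__ == "__main__":
    # Assumed package list and pin; adjust both to match your own requirements file.
    packages = ["dagster", "dagit", "dagster-graphql", "dagster-postgres"]
    print(find_skew(installed_versions(packages), expected="0.13.13"))
```

An empty dict means every listed package matches the pin in that environment. Running the same script in the dagit and daemon containers makes any mismatch between them visible.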
I can execute via the UI but not via the schedule defined in repo.py
I think I must have been failing to properly rebuild the docker-compose service, resulting in a mixture of versions across successive builds? It's the only thing I can think of. I've been throwing everything at it, but the most significant change is probably running
docker-compose down
before
docker-compose up --build
And now:
docker_dev_dagster_daemon       | 2022-01-21 14:15:16 +0000 - dagster.daemon.SchedulerDaemon - ERROR - Scheduler caught an error for schedule cxr_lake_schedule : dagster.core.errors.DagsterUserCodeUnreachableError: Could not reach user code server
docker_dev_dagster_daemon       | 
docker_dev_dagster_daemon       | Stack Trace:
docker_dev_dagster_daemon       |   File "/usr/local/lib/python3.7/site-packages/dagster/scheduler/scheduler.py", line 131, in launch_scheduled_runs
docker_dev_dagster_daemon       |     repo_location = workspace.get_location(origin)
docker_dev_dagster_daemon       |   File "/usr/local/lib/python3.7/site-packages/dagster/core/workspace/dynamic_workspace.py", line 36, in get_location
docker_dev_dagster_daemon       |     location = existing_location if existing_location else origin.create_location()
docker_dev_dagster_daemon       |   File "/usr/local/lib/python3.7/site-packages/dagster/core/host_representation/origin.py", line 266, in create_location
docker_dev_dagster_daemon       |     return GrpcServerRepositoryLocation(self)
docker_dev_dagster_daemon       |   File "/usr/local/lib/python3.7/site-packages/dagster/core/host_representation/repository_location.py", line 528, in __init__
docker_dev_dagster_daemon       |     list_repositories_response = sync_list_repositories_grpc(self.client)
docker_dev_dagster_daemon       |   File "/usr/local/lib/python3.7/site-packages/dagster/api/list_repositories.py", line 14, in sync_list_repositories_grpc
docker_dev_dagster_daemon       |     deserialize_json_to_dagster_namedtuple(api_client.list_repositories()),
docker_dev_dagster_daemon       |   File "/usr/local/lib/python3.7/site-packages/dagster/grpc/client.py", line 162, in list_repositories
docker_dev_dagster_daemon       |     res = self._query("ListRepositories", api_pb2.ListRepositoriesRequest)
docker_dev_dagster_daemon       |   File "/usr/local/lib/python3.7/site-packages/dagster/grpc/client.py", line 108, in _query
docker_dev_dagster_daemon       |     raise DagsterUserCodeUnreachableError("Could not reach user code server") from e
docker_dev_dagster_daemon       | 
docker_dev_dagster_daemon       | The above exception was caused by the following exception:
docker_dev_dagster_daemon       | grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
docker_dev_dagster_daemon       | 	status = StatusCode.UNAVAILABLE
docker_dev_dagster_daemon       | 	details = "failed to connect to all addresses"
docker_dev_dagster_daemon       | 	debug_error_string = "{"created":"@1642774516.110812575","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3134,"referenced_errors":[{"created":"@1642774516.110809952","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}"
docker_dev_dagster_daemon       | >
docker_dev_dagster_daemon       | 
docker_dev_dagster_daemon       | Stack Trace:
docker_dev_dagster_daemon       |   File "/usr/local/lib/python3.7/site-packages/dagster/grpc/client.py", line 105, in _query
docker_dev_dagster_daemon       |     response = getattr(stub, method)(request_type(**kwargs), timeout=timeout)
docker_dev_dagster_daemon       |   File "/usr/local/lib/python3.7/site-packages/grpc/_channel.py", line 946, in __call__
docker_dev_dagster_daemon       |     return _end_unary_response_blocking(state, call, False, None)
docker_dev_dagster_daemon       |   File "/usr/local/lib/python3.7/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
docker_dev_dagster_daemon       |     raise _InactiveRpcError(state)
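The `StatusCode.UNAVAILABLE` / "failed to connect to all addresses" error means the daemon could not open a connection to the user code server at all. A plain TCP check from inside the daemon container can help separate a networking problem from a dagster problem. This is a rough sketch only: `can_connect` is a hypothetical helper, and the host and port below are placeholders, since the real values come from your own workspace configuration.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False


if __name__ == "__main__":
    # Placeholder host and port; substitute the values from your workspace file.
    print(can_connect("localhost", 4000))
```

A `False` result suggests the server is not running, is still starting up, or is not reachable under that name from this container. A `True` result only shows the port is open, not that the gRPC server behind it is healthy.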

daniel

01/21/2022, 2:33 PM
Hi George, it seems like a bunch of different things have been happening here. Is the original report of a repository not being found still relevant? Are you running your code in a separate server, or including the code in the dagit and daemon pods?

George Pearse

01/21/2022, 2:41 PM
One server for daemon, pipeline and dagit. Postgres is hosted elsewhere. Run with docker-compose. All components pinned at 0.13.0. Current error is:
docker_dev_dagster_dagit        |     f'Attempted to deserialize class "{klass_name}" which is not in the whitelist. '
docker_dev_dagster_dagit        | dagster.serdes.errors.DeserializationError: Attempted to deserialize class "TickData" which is not in the whitelist. This error can occur due to version skew, verify processes are running expected versions.
docker_dev_dagster_dagit        | Descent path: <root:dict>
docker_dev_dagster_dagit        | An error occurred while resolving field InstigationState.ticks
docker_dev_dagster_dagit        | Traceback (most recent call last):
docker_dev_dagster_dagit        |   File "/usr/local/lib/python3.7/site-packages/flask_sockets.py", line 41, in __call__
docker_dev_dagster_dagit        |     environment = environ['wsgi.websocket']
docker_dev_dagster_dagit        | KeyError: 'wsgi.websocket'
docker_dev_dagster_dagit        | 
docker_dev_dagster_dagit        | During handling of the above exception, another exception occurred:
docker_dev_dagster_dagit        | 
docker_dev_dagster_dagit        | Traceback (most recent call last):
docker_dev_dagster_dagit        |   File "/usr/local/lib/python3.7/site-packages/graphql/execution/executor.py", line 452, in resolve_or_error
docker_dev_dagster_dagit        |     return executor.execute(resolve_fn, source, info, **args)
docker_dev_dagster_dagit        |   File "/usr/local/lib/python3.7/site-packages/graphql/execution/executors/sync.py", line 16, in execute
docker_dev_dagster_dagit        |     return fn(*args, **kwargs)
docker_dev_dagster_dagit        |   File "/usr/local/lib/python3.7/site-packages/dagster_graphql/schema/instigation.py", line 333, in resolve_ticks
docker_dev_dagster_dagit        |     self._job_state.job_origin_id, before=before, after=after, limit=limit
docker_dev_dagster_dagit        |   File "/usr/local/lib/python3.7/site-packages/dagster/core/instance/__init__.py", line 1575, in get_job_ticks
docker_dev_dagster_dagit        |     job_origin_id, before=before, after=after, limit=limit
docker_dev_dagster_dagit        |   File "/usr/local/lib/python3.7/site-packages/dagster/core/storage/schedules/sql_schedule_storage.py", line 168, in get_job_ticks
docker_dev_dagster_dagit        |     map(lambda r: JobTick(r[0], deserialize_json_to_dagster_namedtuple(r[1])), rows)
docker_dev_dagster_dagit        |   File "/usr/local/lib/python3.7/site-packages/dagster/core/storage/schedules/sql_schedule_storage.py", line 168, in <lambda>
docker_dev_dagster_dagit        |     map(lambda r: JobTick(r[0], deserialize_json_to_dagster_namedtuple(r[1])), rows)
docker_dev_dagster_dagit        |   File "/usr/local/lib/python3.7/site-packages/dagster/serdes/serdes.py", line 339, in deserialize_json_to_dagster_namedtuple
docker_dev_dagster_dagit        |     check.str_param(json_str, "json_str"), whitelist_map=_WHITELIST_MAP
docker_dev_dagster_dagit        |   File "/usr/local/lib/python3.7/site-packages/dagster/serdes/serdes.py", line 362, in _deserialize_json
docker_dev_dagster_dagit        |     return unpack_inner_value(value, whitelist_map=whitelist_map, descent_path=_root(value))
docker_dev_dagster_dagit        |   File "/usr/local/lib/python3.7/site-packages/dagster/serdes/serdes.py", line 393, in unpack_inner_value
docker_dev_dagster_dagit        |     f'Attempted to deserialize class "{klass_name}" which is not in the whitelist. '
docker_dev_dagster_dagit        | dagster.serdes.errors.DeserializationError: Attempted to deserialize class "TickData" which is not in the whitelist. This error can occur due to version skew, verify processes are running expected versions.
docker_dev_dagster_dagit        | Descent path: <root:dict>

daniel

01/21/2022, 2:41 PM
Did you run
dagster instance migrate
after upgrading to a new dagster version?
(depending on what your previous version was before upgrading, that could be necessary)

George Pearse

01/21/2022, 2:51 PM
How do I run dagster instance migrate when the postgres instance is fully separate?
Traceback (most recent call last):
  File "/home/dagster/.local/lib/python3.9/site-packages/dagster/serdes/config_class.py", line 56, in rehydrate
    module = importlib.import_module(self.module_name)
  File "/usr/lib64/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 972, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'dagster_postgres'
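The traceback above comes from a Python 3.9 interpreter under `/home/dagster/.local`, while the container traces show Python 3.7 under `/usr/local`, so the command appears to have run in a different environment from the one the services use. A small check like the one below shows which interpreter is running and whether the modules it needs can be found there. It is a sketch under that assumption, and `missing_modules` is a hypothetical helper name.

```python
import importlib.util
import sys


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print(sys.executable)  # the interpreter that is actually running this script
    print(missing_modules(["dagster", "dagster_postgres"]))
```

If `dagster_postgres` shows up as missing, the instance configuration refers to a storage module that this environment does not have installed.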

daniel

01/21/2022, 3:13 PM
Can you use
docker exec
to run it within the dagit container?
❤️ 1
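The suggestion above amounts to running `dagster instance migrate` through `docker exec`, so that it uses the container's own installation and instance configuration. If you want to script it, a minimal sketch could look like the following. The helper names are hypothetical, and the container name in the usage note is an assumption taken from the log prefix; confirm the real name with `docker ps`.

```python
import shlex
import subprocess


def migrate_command(container):
    """Build the argv that runs `dagster instance migrate` inside `container`."""
    return ["docker", "exec", container, "dagster", "instance", "migrate"]


def run_migration(container):
    """Run the migration in the container; raises CalledProcessError on a non-zero exit."""
    cmd = migrate_command(container)
    print("Running:", shlex.join(cmd))  # shlex.join requires Python 3.8+ on the host
    return subprocess.run(cmd, check=True)
```

Calling `run_migration("docker_dev_dagster_dagit")` on the docker host would run the migration in that container, assuming the name is correct and the container is up.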

George Pearse

01/21/2022, 4:59 PM
Thanks a lot for your time and help. The real problem was the mismatch in versions (which docker exec + dagster instance migrate helped me to see). I resolved that, docker exec'd back in, ran dagster instance migrate, and I think we're good. Thanks so much for deciphering my panicked messages! I'm the lone data engineer and away for a week from Monday.

daniel

01/21/2022, 4:59 PM
sweet! glad it worked out