Casper Weiss Bang
01/02/2023, 1:00 PM
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.DEADLINE_EXCEEDED
details = "Deadline Exceeded"
debug_error_string = "{"created":"@1672664343.554830497","description":"Deadline Exceeded","file":"src/core/ext/filters/deadline/deadline_filter.cc","file_line":81,"grpc_status":4}"
>
What do I do here?
advance_all_cursors
daniel
01/02/2023, 5:13 PM

Casper Weiss Bang
01/02/2023, 5:31 PM

daniel
01/02/2023, 5:35 PM

Casper Weiss Bang
01/02/2023, 7:01 PM

daniel
01/02/2023, 7:32 PM

Casper Weiss Bang
01/02/2023, 7:35 PM

claire
01/03/2023, 5:20 PM

Casper Weiss Bang
01/04/2023, 10:46 AM

claire
01/11/2023, 5:53 PMdagster instance migrate
. If you have access to your postgres db, you can check which indexes are present. These two indexes were the ones added in 1.0.15 which should speed up this query: https://github.com/dagster-io/dagster/blob/61eeaa75909b1d580022b01d254c3b4f25555bf3/python_modules/dagster/dagster/_core/storage/event_log/schema.py#L101-L122Casper Weiss Bang
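A quick way to do the index check claire describes is to query postgres directly. A minimal sketch, assuming psycopg2 is installed and that a DAGSTER_PG_DSN environment variable (a made-up name, not from this thread) points at the Dagster database; event_logs is the default table name used by Dagster's postgres event log storage:

import os

import psycopg2

# Connection string is an assumption; use whatever credentials your Dagster
# postgres storage is configured with.
dsn = os.environ["DAGSTER_PG_DSN"]  # e.g. "postgresql://user:pass@host:5432/dagster"

with psycopg2.connect(dsn) as conn:
    with conn.cursor() as cur:
        # List every index currently defined on the event log table.
        cur.execute(
            "SELECT indexname, indexdef FROM pg_indexes WHERE tablename = %s",
            ("event_logs",),
        )
        for name, definition in cur.fetchall():
            print(name)
            print("   ", definition)

If the indexes added in 1.0.15 are missing from the output, that points to the migration not having been applied yet.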
Casper Weiss Bang
01/12/2023, 8:09 AM
Updating run storage...
Skipping already applied data migration: run_partitions
Skipping already applied data migration: run_repo_label_tags
Skipping already applied data migration: bulk_action_types
Updating event storage...
Skipping already applied data migration: asset_key_index_columns
Updating schedule storage...
Skipping already applied migration: schedule_jobs_selector_id
seems promising!

dagster._core.errors.DagsterUserCodeUnreachableError: The sensor tick timed out due to taking longer than 60 seconds to execute the sensor function. One way to avoid this error is to break up the sensor work into chunks, using cursors to let subsequent sensor calls pick up where the previous call left off.
File "/usr/local/lib/python3.10/site-packages/dagster/_daemon/sensor.py", line 489, in _process_tick_generator
yield from _evaluate_sensor(
File "/usr/local/lib/python3.10/site-packages/dagster/_daemon/sensor.py", line 552, in _evaluate_sensor
sensor_runtime_data = repo_location.get_external_sensor_execution_data(
File "/usr/local/lib/python3.10/site-packages/dagster/_core/host_representation/repository_location.py", line 823, in get_external_sensor_execution_data
return sync_get_external_sensor_execution_data_grpc(
File "/usr/local/lib/python3.10/site-packages/dagster/_api/snapshot_sensor.py", line 63, in sync_get_external_sensor_execution_data_grpc
api_client.external_sensor_execution(
File "/usr/local/lib/python3.10/site-packages/dagster/_grpc/client.py", line 398, in external_sensor_execution
chunks = list(
File "/usr/local/lib/python3.10/site-packages/dagster/_grpc/client.py", line 186, in _streaming_query
self._raise_grpc_exception(
File "/usr/local/lib/python3.10/site-packages/dagster/_grpc/client.py", line 137, in _raise_grpc_exception
raise DagsterUserCodeUnreachableError(
The above exception was caused by the following exception:
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.DEADLINE_EXCEEDED
details = "Deadline Exceeded"
debug_error_string = "{"created":"@1675673483.967854180","description":"Deadline Exceeded","file":"src/core/ext/filters/deadline/deadline_filter.cc","file_line":81,"grpc_status":4}"
>
File "/usr/local/lib/python3.10/site-packages/dagster/_grpc/client.py", line 182, in _streaming_query
yield from self._get_streaming_response(
File "/usr/local/lib/python3.10/site-packages/dagster/_grpc/client.py", line 171, in _get_streaming_response
yield from getattr(stub, method)(request, metadata=self._metadata, timeout=timeout)
File "/usr/local/lib/python3.10/site-packages/grpc/_channel.py", line 426, in __next__
return self._next()
File "/usr/local/lib/python3.10/site-packages/grpc/_channel.py", line 826, in _next
raise self
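The timeout message at the top of that traceback suggests splitting sensor work into chunks and using a cursor so each tick resumes where the last one stopped. A minimal sketch of that pattern; the job name, partition keys, and chunk size are invented for illustration and are not taken from this thread:

from dagster import RunRequest, SensorEvaluationContext, sensor

# Hypothetical work list and chunk size, purely illustrative.
PARTITION_KEYS = [f"2023-01-{day:02d}" for day in range(1, 32)]
CHUNK_SIZE = 5

@sensor(job_name="my_job")  # "my_job" is an assumed job name
def chunked_sensor(context: SensorEvaluationContext):
    # Resume from where the previous tick stopped.
    start = int(context.cursor) if context.cursor else 0
    chunk = PARTITION_KEYS[start : start + CHUNK_SIZE]
    for key in chunk:
        yield RunRequest(run_key=key)
    # Persist progress so the next tick processes the next chunk.
    context.update_cursor(str(start + len(chunk)))

Each tick then stays well under the 60-second limit because it only handles CHUNK_SIZE items before handing off.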
owen
02/06/2023, 5:57 PM

claire
02/06/2023, 5:58 PM

Casper Weiss Bang
02/06/2023, 5:59 PM
The partitions are defined as MonthlyPartitionsDefinition(start_date="2019-01-01", end_offset=1).
I can also see I 'only' have 5790 run records. (I deleted ~2000 to see if it helped; it didn't seem to, now with only 3762 runs.)
I have three code locations (well, repositories), and one of them occasionally does succeed; the other two always time out.
The database is a GP_Gen5_2 from Azure.
It's troublesome to debug with you guys due to timezones, but I'll try to gather as much intel for you awesome people for tonight. I can add a GitHub issue too, if that is better for you 🙂
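For reference, end_offset=1 in that definition adds one partition beyond the last complete month, so the current in-progress month is also partitioned. A minimal sketch of an asset using the same definition; the asset name is made up for illustration:

from dagster import MonthlyPartitionsDefinition, asset

# One partition per month since 2019-01-01, plus one for the current month.
monthly = MonthlyPartitionsDefinition(start_date="2019-01-01", end_offset=1)

@asset(partitions_def=monthly)
def my_monthly_asset(context):
    context.log.info(f"materializing partition {context.partition_key}")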
owen
02/07/2023, 5:10 PM
import os
from dagster import AssetSelection, DagsterInstance, build_asset_reconciliation_sensor, build_sensor_context
from my_package import my_repo

if __name__ == "__main__":
    # you may need to change this line to get your prod dagster instance
    with DagsterInstance.get() as instance:
        sensor = build_asset_reconciliation_sensor(
            asset_selection=AssetSelection.all(),
            name="asset_reconciliation_sensor",  # a name is required; any string works
        )
        cursor = sensor.evaluate_tick(
            build_sensor_context(
                instance=instance,
                repository_def=my_repo,
            )
        )
this cuts to the chase a bit, allowing you to directly run a tick of the sensor. From there, you can run
time sudo py-spy record -o profile.svg -- python that_file.py
which will generate a flame graph (which would be super helpful in determining what's slowing things down).

Casper Weiss Bang
02/07/2023, 5:20 PM

owen
02/07/2023, 5:24 PM

Casper Weiss Bang
02/14/2023, 7:45 AM
dagster._core.errors.DagsterUserCodeUnreachableError: User code server request timed out due to taking longer than 60 seconds to complete.
This is when running a backfill, and it's a clean server, so it shouldn't have a bunch of historic values, which I assumed was the problem on the dev server (the one I reported the other error on). I have the profile.svg but would prefer not to push it to a public Slack channel. Do you have an email I can ping? :)
owen
02/14/2023, 5:08 PM