Grigoriy Sterin
04/19/2022, 2:25 PMdagster.core.errors.DagsterExecutionStepExecutionError: Error occurred while executing op...
.
And here goes the most interesting part: underlying ECS jobs keep running and Dagster job also keeps running forever (Dagit shows it's status as STARTED
), but nothing actually happens in this job, since it has already failed (but somehow Dagster doesn't know about it).
Has anyone seen something like this before?
Dagster version: 0.14.7
johann
04/19/2022, 4:05 PMGrigoriy Sterin
04/19/2022, 4:30 PMjohann
04/19/2022, 4:34 PMGrigoriy Sterin
04/19/2022, 4:35 PM2022-04-18T09:44:23.810-04:00 ./pipelines/data_ingestion_assets/assets/novogene_data_ingestion_assets.py:28: ExperimentalWarning: "asset" is an experimental decorator. It may break in future versions, even between dot releases. To mute warnings for experimental functionality, invoke warnings.filterwarnings("ignore", category=dagster.ExperimentalWarning) or use one of the other methods described at <https://docs.python.org/3/library/warnings.html#describing-warning-filters>.
2022-04-18T09:44:23.810-04:00 metadata=RUN_METADATA
2022-04-18T09:44:23.810-04:00 ./pipelines/data_ingestion_assets/assets/novogene_data_ingestion_assets.py:45: ExperimentalWarning: "asset" is an experimental decorator. It may break in future versions, even between dot releases. To mute warnings for experimental functionality, invoke warnings.filterwarnings("ignore", category=dagster.ExperimentalWarning) or use one of the other methods described at <https://docs.python.org/3/library/warnings.html#describing-warning-filters>.
2022-04-18T09:44:23.810-04:00 @asset
2022-04-18T09:44:23.810-04:00 ./pipelines/data_ingestion_assets/assets/novogene_data_ingestion_assets.py:55: ExperimentalWarning: "asset" is an experimental decorator. It may break in future versions, even between dot releases. To mute warnings for experimental functionality, invoke warnings.filterwarnings("ignore", category=dagster.ExperimentalWarning) or use one of the other methods described at <https://docs.python.org/3/library/warnings.html#describing-warning-filters>.
2022-04-18T09:44:23.810-04:00 metadata=RUN_METADATA
2022-04-18T09:44:23.810-04:00 2022-04-18 13:44:23 +0000 - dagster - DEBUG - data_ingestion_assets - f502119f-b348-43fd-a2aa-0741f89fefb3 - 1 - RUN_START - Started execution of run for "data_ingestion_assets".
2022-04-18T09:44:23.827-04:00 2022-04-18 13:44:23 +0000 - dagster - DEBUG - data_ingestion_assets - f502119f-b348-43fd-a2aa-0741f89fefb3 - 1 - ENGINE_EVENT - Executing steps using multiprocess executor: parent process (pid: 1)
2022-04-18T09:44:23.910-04:00 2022-04-18 13:44:23 +0000 - dagster - DEBUG - data_ingestion_assets - f502119f-b348-43fd-a2aa-0741f89fefb3 - 1 - new_novogene_data_folders_in_aggregation_bucket - ENGINE_EVENT - Launching subprocess for new_novogene_data_folders_in_aggregation_bucket
2022-04-18T09:44:28.635-04:00 ./pipelines/data_ingestion_assets/assets/novogene_data_ingestion_assets.py:28: ExperimentalWarning: "asset" is an experimental decorator. It may break in future versions, even between dot releases. To mute warnings for experimental functionality, invoke warnings.filterwarnings("ignore", category=dagster.ExperimentalWarning) or use one of the other methods described at <https://docs.python.org/3/library/warnings.html#describing-warning-filters>.
2022-04-18T09:44:28.635-04:00 metadata=RUN_METADATA
2022-04-18T09:44:28.635-04:00 ./pipelines/data_ingestion_assets/assets/novogene_data_ingestion_assets.py:45: ExperimentalWarning: "asset" is an experimental decorator. It may break in future versions, even between dot releases. To mute warnings for experimental functionality, invoke warnings.filterwarnings("ignore", category=dagster.ExperimentalWarning) or use one of the other methods described at <https://docs.python.org/3/library/warnings.html#describing-warning-filters>.
2022-04-18T09:44:28.635-04:00 @asset
2022-04-18T09:44:28.635-04:00 ./pipelines/data_ingestion_assets/assets/novogene_data_ingestion_assets.py:55: ExperimentalWarning: "asset" is an experimental decorator. It may break in future versions, even between dot releases. To mute warnings for experimental functionality, invoke warnings.filterwarnings("ignore", category=dagster.ExperimentalWarning) or use one of the other methods described at <https://docs.python.org/3/library/warnings.html#describing-warning-filters>.
2022-04-18T09:44:28.635-04:00 metadata=RUN_METADATA
2022-04-18T09:44:28.635-04:00 2022-04-18 13:44:28 +0000 - dagster - DEBUG - data_ingestion_assets - f502119f-b348-43fd-a2aa-0741f89fefb3 - 12 - new_novogene_data_folders_in_aggregation_bucket - ENGINE_EVENT - Starting initialization of resources [root_manager, s3_aggregation_bucket_io_manager, sqs_source_asset_io_manager].
2022-04-18T09:44:28.816-04:00 2022-04-18 13:44:28 +0000 - dagster - DEBUG - data_ingestion_assets - f502119f-b348-43fd-a2aa-0741f89fefb3 - 12 - new_novogene_data_folders_in_aggregation_bucket - ENGINE_EVENT - Finished initialization of resources [root_manager, s3_aggregation_bucket_io_manager, sqs_source_asset_io_manager].
2022-04-18T09:44:29.896-04:00 2022-04-18 13:44:29 +0000 - dagster - DEBUG - data_ingestion_assets - f502119f-b348-43fd-a2aa-0741f89fefb3 - 12 - new_novogene_data_folders_in_aggregation_bucket - LOGS_CAPTURED - Started capturing logs for step: new_novogene_data_folders_in_aggregation_bucket.
2022-04-18T09:44:29.896-04:00 2022-04-18 13:44:29 +0000 - dagster - DEBUG - data_ingestion_assets - f502119f-b348-43fd-a2aa-0741f89fefb3 - 12 - new_novogene_data_folders_in_aggregation_bucket - STEP_START - Started execution of step "new_novogene_data_folders_in_aggregation_bucket".
2022-04-18T09:44:29.896-04:00 2022-04-18 13:44:29 +0000 - dagster - DEBUG - data_ingestion_assets - f502119f-b348-43fd-a2aa-0741f89fefb3 - 12 - new_novogene_data_folders_in_aggregation_bucket - LOADED_INPUT - Loaded input "new_folders_in_novogene_landing_bucket" using input manager "root_manager"
2022-04-18T09:44:30.896-04:00 2022-04-18 13:44:29 +0000 - dagster - DEBUG - data_ingestion_assets - f502119f-b348-43fd-a2aa-0741f89fefb3 - 12 - new_novogene_data_folders_in_aggregation_bucket - STEP_INPUT - Got input "new_folders_in_novogene_landing_bucket" of type "NewLandingData". (Type check passed).
2022-04-18T09:44:30.896-04:00 2022-04-18 13:44:29 +0000 - dagster - DEBUG - data_ingestion_assets - f502119f-b348-43fd-a2aa-0741f89fefb3 - 12 - new_novogene_data_folders_in_aggregation_bucket - STEP_OUTPUT - Yielded output "result" of type "NewAggregationData". (Type check passed).
2022-04-18T09:46:55.914-04:00 2022-04-18 13:46:55 +0000 - dagster - INFO - data_ingestion_assets - f502119f-b348-43fd-a2aa-0741f89fefb3 - new_novogene_data_folders_in_aggregation_bucket - Skipping fvl58-novogene-data-landing/PLT314-317 - already exists in aggregation bucket.
2022-04-18T09:46:55.914-04:00 2022-04-18 13:46:55 +0000 - dagster - INFO - data_ingestion_assets - f502119f-b348-43fd-a2aa-0741f89fefb3 - new_novogene_data_folders_in_aggregation_bucket - Writing the list of last moved folders in aggregation bucket to fvl58-novogene-data-landing/data_ingestion_assets-f502119f-b348-43fd-a2aa-0741f89fefb3.lst.
2022-04-18T09:46:57.095-04:00 2022-04-18 13:46:57 +0000 - dagster - DEBUG - data_ingestion_assets - f502119f-b348-43fd-a2aa-0741f89fefb3 - 1 - new_novogene_submission_sets - ENGINE_EVENT - Launching subprocess for new_novogene_submission_sets
2022-04-18T09:47:03.435-04:00 ./pipelines/data_ingestion_assets/assets/novogene_data_ingestion_assets.py:28: ExperimentalWarning: "asset" is an experimental decorator. It may break in future versions, even between dot releases. To mute warnings for experimental functionality, invoke warnings.filterwarnings("ignore", category=dagster.ExperimentalWarning) or use one of the other methods described at <https://docs.python.org/3/library/warnings.html#describing-warning-filters>.
2022-04-18T09:47:03.435-04:00 metadata=RUN_METADATA
2022-04-18T09:47:03.435-04:00 ./pipelines/data_ingestion_assets/assets/novogene_data_ingestion_assets.py:45: ExperimentalWarning: "asset" is an experimental decorator. It may break in future versions, even between dot releases. To mute warnings for experimental functionality, invoke warnings.filterwarnings("ignore", category=dagster.ExperimentalWarning) or use one of the other methods described at <https://docs.python.org/3/library/warnings.html#describing-warning-filters>.
2022-04-18T09:47:03.435-04:00 @asset
2022-04-18T09:47:03.435-04:00 ./pipelines/data_ingestion_assets/assets/novogene_data_ingestion_assets.py:55: ExperimentalWarning: "asset" is an experimental decorator. It may break in future versions, even between dot releases. To mute warnings for experimental functionality, invoke warnings.filterwarnings("ignore", category=dagster.ExperimentalWarning) or use one of the other methods described at <https://docs.python.org/3/library/warnings.html#describing-warning-filters>.
2022-04-18T09:47:03.435-04:00 metadata=RUN_METADATA
2022-04-18T09:47:03.435-04:00 2022-04-18 13:47:03 +0000 - dagster - DEBUG - data_ingestion_assets - f502119f-b348-43fd-a2aa-0741f89fefb3 - 26 - new_novogene_submission_sets - ENGINE_EVENT - Starting initialization of resources [s3_aggregation_bucket_io_manager, s3_novogene_submission_sets_io_manager].
2022-04-18T09:47:03.614-04:00 2022-04-18 13:47:03 +0000 - dagster - DEBUG - data_ingestion_assets - f502119f-b348-43fd-a2aa-0741f89fefb3 - 26 - new_novogene_submission_sets - ENGINE_EVENT - Finished initialization of resources [s3_aggregation_bucket_io_manager, s3_novogene_submission_sets_io_manager].
johann
04/19/2022, 4:36 PMGrigoriy Sterin
04/19/2022, 4:37 PMjohann
04/19/2022, 4:43 PMDagster Bot
04/19/2022, 4:44 PMalex
04/19/2022, 4:47 PMGrigoriy Sterin
04/19/2022, 4:49 PMnovogene_data_ingestion_job = AssetGroup(
assets=[new_novogene_data_folders_in_aggregation_bucket, new_novogene_submission_sets],
source_assets=[new_folders_in_novogene_landing_bucket],
resource_defs={
"s3_aggregation_bucket_io_manager": s3_aggregation_bucket_io_manager,
"sqs_source_asset_io_manager": sqs_source_asset_io_manager,
"s3_novogene_submission_sets_io_manager": s3_novogene_submission_sets_io_manager,
}
).build_job("data_ingestion_assets")
alex
04/19/2022, 4:55 PMGrigoriy Sterin
04/19/2022, 4:55 PMalex
04/21/2022, 6:05 PMGrigoriy Sterin
04/21/2022, 6:15 PMalex
04/21/2022, 6:16 PMGrigoriy Sterin
04/21/2022, 6:21 PMalex
04/21/2022, 6:50 PMGrigoriy Sterin
04/21/2022, 6:51 PMStarted
state rather than Running
. Could it be some race condition or some clause in the Dagster code like if (exception == true and jobStatus == "Running"): terminateJob()
alex
04/21/2022, 6:53 PMpy-spy dump -p <pid>
to see where its stuck?Grigoriy Sterin
04/21/2022, 6:53 PM1127 pts/2 Sl+ 0:13 /usr/bin/python3 /home/gsterin/.local/bin/dagit -f data_ingestion_assets/repo.py
1129 pts/2 Sl+ 0:04 /usr/bin/python3 -m dagster api grpc --lazy-load-user-code --socket /tmp/tmpzam80aki --heartbeat --heartbeat-timeout 45 --fixed-server-id 01eb010f-1058-41eb-b6a3-8a7a0ddb06da --log-level WARNING --use-python-environment-entry-point -f data_ingestion_assets/repo.py -d /mnt/c/projects/dagster-data-ingestion-pipelines-repo/pipelines
2288 pts/2 S+ 0:00 /usr/bin/python3 -c from multiprocessing.resource_tracker import main;main(13)
2289 pts/2 Sl+ 0:01 /usr/bin/python3 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=14, pipe_handle=16) --multiprocessing-fork
2351 pts/2 Sl+ 0:01 /usr/bin/python3 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=14, pipe_handle=17) --multiprocessing-fork
alex
04/21/2022, 6:56 PMExecuting steps using multiprocess executor: parent process (pid: 1)
at the topGrigoriy Sterin
04/21/2022, 6:59 PMProcess 2289: /usr/bin/python3 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=14, pipe_handle=16) --multiprocessing-fork
Python v3.8.10 (/usr/bin/python3.8)
Thread 2289 (idle): "MainThread"
poll (multiprocessing/popen_fork.py:27)
wait (multiprocessing/popen_fork.py:47)
join (multiprocessing/process.py:149)
execute_child_process_command (dagster/core/executor/child_process_executor.py:165)
execute_step_out_of_process (dagster/core/executor/multiprocess.py:323)
execute (dagster/core/executor/multiprocess.py:210)
pipeline_execution_iterator (dagster/core/execution/api.py:785)
__iter__ (dagster/core/execution/api.py:879)
core_execute_run (dagster/grpc/impl.py:91)
_run_in_subprocess (dagster/grpc/impl.py:168)
start_run_in_subprocess (dagster/grpc/impl.py:192)
run (multiprocessing/process.py:108)
_bootstrap (multiprocessing/process.py:315)
_main (multiprocessing/spawn.py:129)
spawn_main (multiprocessing/spawn.py:116)
<module> (<string>:1)
Thread 2294 (idle): "kill-on-event"
wait (multiprocessing/synchronize.py:261)
wait (multiprocessing/synchronize.py:349)
_kill_on_event (dagster/utils/__init__.py:357)
run (threading.py:870)
_bootstrap_inner (threading.py:932)
_bootstrap (threading.py:890)
Thread 2295 (idle): "QueueFeederThread"
wait (threading.py:302)
_feed (multiprocessing/queues.py:227)
run (threading.py:870)
_bootstrap_inner (threading.py:932)
_bootstrap (threading.py:890)
alex
04/21/2022, 7:01 PMExecuting step new_novogene_submission_sets in subprocess
Grigoriy Sterin
04/21/2022, 7:22 PMProcess 2351: /usr/bin/python3 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=14, pipe_handle=17) --multiprocessing-fork
Python v3.8.10 (/usr/bin/python3.8)
Thread 0x7FD090C9F740 (active): "MainThread"
_wait_for_tstate_lock (threading.py:1027)
join (threading.py:1011)
close (python_dynamodb_lock/python_dynamodb_lock.py:688)
__del__ (data_ingestion_assets/io_managers/s3_aggregation_bucket_io_manager.py:36)
Thread 2357 (idle): "kill-on-event"
wait (multiprocessing/synchronize.py:261)
wait (multiprocessing/synchronize.py:349)
_kill_on_event (dagster/utils/__init__.py:357)
run (threading.py:870)
_bootstrap_inner (threading.py:932)
_bootstrap (threading.py:890)
Thread 0x7FD086E21700 (active): "DynamoDBLockClient-HS-0dd9417a179b4812bb474c504ee3e360"
_send_heartbeat_loop (python_dynamodb_lock/python_dynamodb_lock.py:167)
run (threading.py:870)
_bootstrap_inner (threading.py:932)
_bootstrap (threading.py:890)
Thread 0x7FD086620700 (active): "DynamoDBLockClient-HC-0dd9417a179b4812bb474c504ee3e360"
_check_heartbeat_loop (python_dynamodb_lock/python_dynamodb_lock.py:284)
run (threading.py:870)
_bootstrap_inner (threading.py:932)
_bootstrap (threading.py:890)
data_ingestion_assets/io_managers/s3_aggregation_bucket_io_manager.py:36
line, the error doesn't reproduce anymore. Thank you for help with debugging thisalex
04/21/2022, 7:23 PM