Casper Weiss Bang
04/03/2023, 6:46 AM
daniel
04/03/2023, 8:34 PM
Casper Weiss Bang
04/04/2023, 3:55 PM
daniel
04/04/2023, 3:56 PM
daniel
04/04/2023, 3:56 PM
Casper Weiss Bang
04/04/2023, 3:57 PM
Traceback (most recent call last):
  File "/usr/local/bin/dagster", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/site-packages/dagster/_cli/__init__.py", line 46, in main
    cli(auto_envvar_prefix=ENV_PREFIX) # pylint:disable=E1123
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/dagster/_cli/api.py", line 73, in execute_run_command
    return_code = _execute_run_command_body(
  File "/usr/local/lib/python3.10/site-packages/dagster/_cli/api.py", line 150, in _execute_run_command_body
    instance.report_engine_event(
  File "/usr/local/lib/python3.10/site-packages/dagster/_core/instance/__init__.py", line 1877, in report_engine_event
    self.report_dagster_event(dagster_event, run_id=run_id, log_level=log_level)
  File "/usr/local/lib/python3.10/site-packages/dagster/_core/instance/__init__.py", line 1901, in report_dagster_event
    self.handle_new_event(event_record)
  File "/usr/local/lib/python3.10/site-packages/dagster/_core/instance/__init__.py", line 1816, in handle_new_event
    self._event_storage.store_event(event)
  File "/usr/local/lib/python3.10/site-packages/dagster_postgres/event_log/event_log.py", line 175, in store_event
    with self._connect() as conn:
  File "/usr/local/lib/python3.10/contextlib.py", line 135, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.10/site-packages/dagster_postgres/utils.py", line 166, in create_pg_connection
    conn = retry_pg_connection_fn(engine.connect)
  File "/usr/local/lib/python3.10/site-packages/dagster_postgres/utils.py", line 130, in retry_pg_connection_fn
    raise DagsterPostgresException("too many retries for DB connection") from exc
dagster_postgres.utils.DagsterPostgresException: too many retries for DB connection
There we go, from the actual container.
Casper Weiss Bang
04/04/2023, 3:58 PM
port 5432 failed: FATAL: SSL connection is required. Please specify SSL options and retry.
SSL errors normally occur when we have some network issues, e.g. firewall rules, so that might be the reason it broke? But I don't get why it wouldn't eventually get access again, and/or simply mark it as terminated.
Casper Weiss Bang
04/04/2023, 3:59 PM
daniel
04/04/2023, 3:59 PM
Casper Weiss Bang
04/04/2023, 4:00 PM
Casper Weiss Bang
04/04/2023, 4:04 PM
2023-03-31T18:02:19.595391107Z 2023-03-31 18:02:18 +0000 - dagster - DEBUG - status_job - 86d088fc-d550-4560-9f2f-f9767f7851f1 - raw_dev__status__vis_data_quality - life_cycle_state='PENDING'
after which I got the first SSL error at 2023-03-31T18:02:24.595959509Z, meaning it basically only retries for ~40 seconds before terminating. We normally retry for a few minutes. Also, maybe the Docker runner should check whether the specific container has "died".
daniel
04/04/2023, 4:27 PM
Mark Fickett
04/04/2023, 7:41 PM
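A minimal sketch of the "retry for a few minutes" idea raised earlier in the thread, assuming the connection attempt is an ordinary callable that raises when it fails. `retry_with_deadline` and all of its parameters are hypothetical names for illustration. This is not the Dagster or `dagster_postgres` API, and it says nothing about how `retry_pg_connection_fn` from the traceback is actually implemented.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_deadline(
    fn: Callable[[], T],
    *,
    deadline_seconds: float = 180.0,  # assumed budget: "a few minutes"
    initial_delay: float = 1.0,
    max_delay: float = 30.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call fn until it succeeds or the time budget is used up.

    The budget is expressed in wall-clock seconds rather than a fixed
    number of attempts, so a short outage is ridden out regardless of
    how quickly each attempt fails.
    """
    start = clock()
    delay = initial_delay
    attempt = 0
    while True:
        attempt += 1
        try:
            return fn()
        except retry_on as exc:
            elapsed = clock() - start
            remaining = deadline_seconds - elapsed
            if remaining <= 0:
                # Budget exhausted: surface the last error as the cause.
                raise TimeoutError(
                    f"gave up after {attempt} attempts over {elapsed:.0f}s"
                ) from exc
            # Exponential backoff, capped so waits never exceed max_delay
            # or overshoot the remaining budget.
            sleep(min(delay, remaining))
            delay = min(delay * 2, max_delay)
```

Wrapping a connection attempt would look like `retry_with_deadline(connect, retry_on=(ConnectionError,))`, where `connect` is whatever callable opens the database connection. Injecting `sleep` and `clock` is only there so the behaviour can be checked without actually waiting.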