Benoit Perigaud
10/02/2021, 9:53 PM
The dagster-daemon run command is always eating 100% of one of my CPUs.
Here is the error in journalctl (I'm on dagster 0.12.12):
Oct 03 08:49:03 raspberrypi bash[10714]: 2021-10-03 08:49:03 - dagster-daemon - ERROR - Thread for SCHEDULER did not shut down gracefully
Oct 03 08:49:03 raspberrypi bash[10714]: Traceback (most recent call last):
Oct 03 08:49:03 raspberrypi bash[10714]: File "/home/pi/.envs/dagster/bin/dagster-daemon", line 8, in <module>
Oct 03 08:49:03 raspberrypi bash[10714]: sys.exit(main())
Oct 03 08:49:03 raspberrypi bash[10714]: File "/home/pi/.envs/dagster/lib/python3.7/site-packages/dagster/daemon/cli/__init__.py", line 135, in main
Oct 03 08:49:03 raspberrypi bash[10714]: cli(obj={}) # pylint:disable=E1123
Oct 03 08:49:03 raspberrypi bash[10714]: File "/home/pi/.envs/dagster/lib/python3.7/site-packages/click/core.py", line 829, in __call__
Oct 03 08:49:03 raspberrypi bash[10714]: return self.main(*args, **kwargs)
Oct 03 08:49:03 raspberrypi bash[10714]: File "/home/pi/.envs/dagster/lib/python3.7/site-packages/click/core.py", line 782, in main
Oct 03 08:49:03 raspberrypi bash[10714]: rv = self.invoke(ctx)
Oct 03 08:49:03 raspberrypi bash[10714]: File "/home/pi/.envs/dagster/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
Oct 03 08:49:03 raspberrypi bash[10714]: return _process_result(sub_ctx.command.invoke(sub_ctx))
Oct 03 08:49:03 raspberrypi bash[10714]: File "/home/pi/.envs/dagster/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
Oct 03 08:49:03 raspberrypi bash[10714]: return ctx.invoke(self.callback, **ctx.params)
Oct 03 08:49:03 raspberrypi bash[10714]: File "/home/pi/.envs/dagster/lib/python3.7/site-packages/click/core.py", line 610, in invoke
Oct 03 08:49:03 raspberrypi bash[10714]: return callback(*args, **kwargs)
Oct 03 08:49:03 raspberrypi bash[10714]: File "/home/pi/.envs/dagster/lib/python3.7/site-packages/dagster/daemon/cli/__init__.py", line 48, in run_command
Oct 03 08:49:03 raspberrypi bash[10714]: controller.check_daemon_loop()
Oct 03 08:49:03 raspberrypi bash[10714]: File "/home/pi/.envs/dagster/lib/python3.7/site-packages/dagster/daemon/controller.py", line 237, in check_daemon_loop
Oct 03 08:49:03 raspberrypi bash[10714]: self.check_daemon_heartbeats()
Oct 03 08:49:03 raspberrypi bash[10714]: File "/home/pi/.envs/dagster/lib/python3.7/site-packages/dagster/daemon/controller.py", line 212, in check_daemon_heartbeats
Oct 03 08:49:03 raspberrypi bash[10714]: failed_daemons=failed_daemons
Oct 03 08:49:03 raspberrypi bash[10714]: Exception: Stopping dagster-daemon process since the following threads are no longer sending heartbeats: ['SCHEDULER']
Oct 03 08:49:04 raspberrypi systemd[1]: dagster-daemon.service: Main process exited, code=exited, status=1/FAILURE
Oct 03 08:49:04 raspberrypi systemd[1]: dagster-daemon.service: Failed with result 'exit-code'.
Oct 03 08:49:04 raspberrypi systemd[1]: dagster-daemon.service: Service RestartSec=100ms expired, scheduling restart.
Oct 03 08:49:04 raspberrypi systemd[1]: dagster-daemon.service: Scheduled restart job, restart counter is at 6.
Oct 03 08:49:04 raspberrypi systemd[1]: Stopped Daemon for dagster.
Oct 03 08:49:04 raspberrypi systemd[1]: Started Daemon for dagster.
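The exception at the end of the traceback above comes from a heartbeat check: the controller stops the whole process once a daemon thread (here SCHEDULER) stops reporting in. As a rough illustration of that pattern, here is a minimal sketch. It is not Dagster's implementation; HeartbeatMonitor, its methods, and the 60-second tolerance are hypothetical names and values chosen for the example.

```python
import time


class HeartbeatMonitor:
    """Hypothetical watchdog: remembers each worker's last heartbeat time."""

    def __init__(self, tolerance_seconds, clock=time.monotonic):
        self._tolerance = tolerance_seconds
        self._clock = clock          # injectable so the demo can fake time
        self._last_beat = {}

    def beat(self, name):
        """Record that worker `name` is still alive."""
        self._last_beat[name] = self._clock()

    def failed(self):
        """Return the workers whose last heartbeat is older than the tolerance."""
        now = self._clock()
        return sorted(
            name for name, seen in self._last_beat.items()
            if now - seen > self._tolerance
        )

    def check(self):
        """Raise if any worker has gone quiet, mirroring the error in the log."""
        failed = self.failed()
        if failed:
            raise RuntimeError(
                f"threads are no longer sending heartbeats: {failed}"
            )


# Demo with a fake clock: SCHEDULER beats once and then goes silent,
# while SENSOR keeps reporting.
fake_now = [0.0]
monitor = HeartbeatMonitor(tolerance_seconds=60, clock=lambda: fake_now[0])
monitor.beat("SCHEDULER")
monitor.beat("SENSOR")
fake_now[0] = 50.0
monitor.beat("SENSOR")
fake_now[0] = 100.0
print(monitor.failed())
```

A worker thread stuck in a tight loop could produce both symptoms reported here at once: one CPU pinned at 100% and no further heartbeats.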
The health page tells me: "Not running - No recent heartbeat"
daniel
10/02/2021, 10:11 PM
Benoit Perigaud
10/02/2021, 10:14 PM
execution_timezone="Australia/Sydney"
And I have multiple pipelines on a cron schedule (including one that runs every 15 minutes: cron_schedule="*/15 * * * *")
daniel
10/02/2021, 10:18 PM
Benoit Perigaud
10/02/2021, 10:28 PM
daniel
10/02/2021, 10:51 PM
Benoit Perigaud
10/02/2021, 10:54 PM
should_execute filter as well):
sched_pipeline1 = ScheduleDefinition(cron_schedule="*/15 * * * *", execution_timezone="Australia/Sydney", should_execute=hour_filter, job=pipeline1)
sched_pipeline2 = ScheduleDefinition(cron_schedule="0 8,20 * * *", execution_timezone="Australia/Sydney", job=pipeline2)
sched_pipeline3 = ScheduleDefinition(cron_schedule="0 8 * * *", execution_timezone="Australia/Sydney", job=pipeline3)
daniel
10/03/2021, 12:14 AM
Benoit Perigaud
10/03/2021, 12:17 AM
from datetime import datetime

def hour_filter(_context):
    # Run only at midnight or from 06:00 onwards (host local time).
    hour = datetime.now().hour
    return hour >= 6 or hour == 0
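One detail about hour_filter worth keeping in mind: datetime.now() returns the host's local wall-clock time, so the result depends on the machine's timezone setting rather than on execution_timezone. The sketch below applies the same rule to an explicitly supplied time to show the difference. hour_allowed is a hypothetical helper, and the fixed UTC+11 offset stands in for Sydney during daylight saving only for the sake of the example.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets used purely for illustration; a real setup would rely on a
# timezone database rather than a hard-coded offset.
UTC = timezone.utc
AEDT = timezone(timedelta(hours=11), "AEDT")  # Sydney during daylight saving


def hour_allowed(local_time):
    """Same rule as hour_filter, applied to an explicit local time."""
    return local_time.hour >= 6 or local_time.hour == 0


# One instant, read on a host set to UTC and on a host set to Sydney time.
instant = datetime(2021, 10, 3, 3, 30, tzinfo=UTC)
print(hour_allowed(instant))                   # hour is 3 on a UTC host
print(hour_allowed(instant.astimezone(AEDT)))  # hour is 14 in Sydney
```

The same instant is rejected on a UTC host and accepted on a Sydney host, so the filter only matches the schedule's timezone if the machine is configured for it.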
should_execute parameter but it is still failing without it.
daniel
10/03/2021, 12:39 AM
Benoit Perigaud
10/03/2021, 12:43 AM
name to my schedule and make it a different one from the original name. Heartbeats are working and CPU is down to normal levels even with the 15 min schedule on.
I am losing the info about the previous schedules though now.
If I can help with further troubleshooting, let me know. I guess that if I change my schedule name back to the original one the error will reappear.
daniel
10/03/2021, 12:46 AM
Benoit Perigaud
10/03/2021, 12:48 AM
daniel
10/03/2021, 9:17 PM
paul.q
10/03/2021, 10:18 PM
Benoit Perigaud
10/03/2021, 10:21 PM
paul.q
10/03/2021, 10:33 PM
daniel
10/03/2021, 11:11 PM
paul.q
10/04/2021, 3:23 AM
daniel
10/04/2021, 9:52 PM
"Checking for new runs for the following schedules:" - that and the following lines would give a lot of insight into where exactly it's going wrong (for example it might say "No run requests returned for golf_pipeline_schedule, skipping") - that would be instructive
• Could you share what version of pendulum you have installed as part of your dagster install?
2021-10-03 01:34:00 - SchedulerDaemon - INFO - Evaluating schedule `schedule_sydney` at 2021-10-03 01:30:00+1000
2021-10-03 01:34:00 - SchedulerDaemon - INFO - Completed scheduled launch of run 6afeac99-a325-4292-9c5b-7e62cac17cda for schedule_sydney
<30 minutes pass, crossing the DST transition>
2021-10-03 03:04:00 - SchedulerDaemon - INFO - Evaluating schedule `partitionless_schedule_sydney` at 2021-10-03 03:00:00+1100
2021-10-03 03:04:00 - SchedulerDaemon - INFO - Completed scheduled launch of run cb59ae9a-8f25-4e84-8095-d2cca2cc4750 for partitionless_schedule_sydney
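The jump from +1000 to +1100 in the log excerpt above is the Sydney daylight-saving transition on 2021-10-03, when local clocks moved from 02:00 straight to 03:00. A small sketch of what a */15 schedule looks like across that boundary follows. It uses hard-coded fixed offsets instead of a timezone database so it runs anywhere, and sydney_local and quarter_hour_ticks are hypothetical helpers, not part of Dagster or pendulum.

```python
from datetime import datetime, timedelta, timezone

AEST = timezone(timedelta(hours=10), "AEST")  # Sydney standard time
AEDT = timezone(timedelta(hours=11), "AEDT")  # Sydney daylight time

# Daylight saving began at 02:00 AEST on 2021-10-03,
# which is 16:00 UTC on 2021-10-02.
DST_START_UTC = datetime(2021, 10, 2, 16, 0, tzinfo=timezone.utc)


def sydney_local(utc_dt):
    """Convert a UTC instant to Sydney wall-clock time around this transition."""
    tz = AEDT if utc_dt >= DST_START_UTC else AEST
    return utc_dt.astimezone(tz)


def quarter_hour_ticks(start_utc, count):
    """Return the local times of `count` ticks spaced 15 real minutes apart."""
    return [
        sydney_local(start_utc + timedelta(minutes=15 * i))
        for i in range(count)
    ]


# Start at 01:30 local time (15:30 UTC) and step over the transition.
ticks = quarter_hour_ticks(
    datetime(2021, 10, 2, 15, 30, tzinfo=timezone.utc), 4
)
for tick in ticks:
    print(tick.strftime("%Y-%m-%d %H:%M %z"))
```

The local times go 01:30, 01:45, 03:00, 03:15: no wall-clock time between 02:00 and 02:59 exists that night, so 01:30+1000 and 03:00+1100 are only 30 real minutes apart, as in the log.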
But clearly that did not happen in your environment for some reason, very strange.295594 0e55b0dc50d412446db9b66354a83a979b494b2d SUCCESS SCHEDULE 2021-10-03 15:00:00 {"__class__": "JobTickData", "cursor": null, "error": null, "job_name": "daily_upload_schedule", "job_origin_id": "0e55b0dc50d412446db9b66354a83a979b494b2d", "job_type": {"__enum__": "JobType.SCHEDULE"}, "origin_run_ids": [], "run_ids": ["12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", 
"12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", 
"12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0", "12c36386-04c2-4aa0-a001-8b3bc22ccec0"], "run_keys": ["2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", 
"2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30", "2021-09-30"], "skip_reason": null, "status": {"__enum__": "JobTickStatus.SUCCESS"}, "timestamp": 1633273200.0} 2021-10-03 15:00:02.324713 2021-10-03 15:00:02.324713
Benoit Perigaud
10/04/2021, 10:57 PM
daniel
10/05/2021, 2:59 AM
Benoit Perigaud
10/05/2021, 6:49 AM
daniel
10/05/2021, 12:26 PM
Benoit Perigaud
10/05/2021, 9:06 PM
paul.q
10/05/2021, 9:40 PM
Benoit Perigaud
10/08/2021, 10:21 PM
daniel
10/08/2021, 10:21 PM