Hi all, I am not sure what went wrong, but suddenly all of my scheduled Dagster jobs have started running twice at their scheduled times. I haven't made any changes to the job code, the schedules, or the queued run coordinator. Has anyone seen this happen before? It is so strange, so I'm trying to get some support here. Thanks!
Hi Pragna, if you can share your daemon logs from a time period when two runs were launched, we could take a look. Are you possibly running two daemons pointed at the same database?
Hi Daniel,
2023-02-07 20:33:00 +0000 - dagster.daemon.SchedulerDaemon - ERROR - Another SCHEDULER daemon is still sending heartbeats. You likely have multiple daemon processes running at once, which is not supported. Last heartbeat daemon id: 835ff957-f01e-41a8-bdd9-21de44c07f13, Current daemon_id: 047fb0cb-3b48-4de7-b542-5297202e72c5
I see this like you said. However, when I check there is only 1 task available and running. Can you help me with how to track down that extra running container?
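For anyone hitting the same thing: a quick way to see which daemon ids are involved is to scan the daemon logs for the heartbeat error quoted above. This is only a rough sketch. The regex is based on the exact wording of that one message (other Dagster versions may phrase it differently), and `collect_daemon_ids` is a made-up helper name, not part of Dagster.

```python
import re

# Assumption: the pattern mirrors the error text quoted in this thread.
# It is not an official Dagster log format specification.
HEARTBEAT_RE = re.compile(
    r"Another (?P<daemon_type>\w+) daemon is still sending heartbeats\."
    r".*?Last heartbeat daemon id: (?P<last_id>[0-9a-f-]+), "
    r"Current daemon_id: (?P<current_id>[0-9a-f-]+)"
)


def collect_daemon_ids(log_lines):
    """Map each daemon type to the set of daemon ids seen in heartbeat errors."""
    ids_by_type = {}
    for line in log_lines:
        match = HEARTBEAT_RE.search(line)
        if match is None:
            continue  # not a heartbeat-conflict line
        ids = ids_by_type.setdefault(match.group("daemon_type"), set())
        ids.add(match.group("last_id"))
        ids.add(match.group("current_id"))
    return ids_by_type
```

If the same "last heartbeat" id keeps showing up long after a restart, that suggests another daemon process is still alive somewhere and writing to the same database.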
How are you checking that there's only 1 task running?
From my ECS service console.
Hi Daniel, I just checked at the scheduled time, and two task instances are spinning up. Do you have any clue why that would happen and how I can stop it?
I don't have quite enough information about your setup to know where the other daemon task would be coming from, but Dagster only supports running a single daemon task at a time.
I'm gonna add that I am also having this issue. I am running on Dagster 1.1.15. Started getting the same error message Pragna posted above. This began once I added a default configured executor to our repository. Let me know if you need additional information and I'll be glad to provide it.
Adam, how do you have your daemon deployed?
I have it deployed via the terminal. I've ensured that I've closed all terminal instances and that Dagit shows the daemon is down before relaunching a single instance of the daemon. I still get the error.
Does the error appear more than once in the logs?
I could imagine it popping up once on startup if you stop a daemon and then start a new one right away
Yes, it seems to appear every 30 seconds for each daemon type (sensor, queued_run_coordinator, backfill, and scheduler)
Try running this in your terminal: "ps aux | grep dagster-daemon" - that would show if there was a background process still running
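If you would rather script that check, here is a minimal Python sketch that filters `ps aux` output for daemon processes and drops the `grep` row itself. It assumes the usual 11-column `ps aux` layout (USER, PID, ..., START, TIME, COMMAND); `find_daemon_processes` and `list_running_daemons` are illustrative names, not Dagster APIs.

```python
import subprocess


def find_daemon_processes(ps_output, needle="dagster-daemon"):
    """Return (pid, start, command) tuples for ps aux rows mentioning `needle`."""
    matches = []
    for row in ps_output.splitlines()[1:]:  # skip the header row
        # ps aux has 11 columns; the last one is the full command line.
        fields = row.split(None, 10)
        if len(fields) < 11:
            continue
        pid, start, command = fields[1], fields[8], fields[10]
        # Skip any command containing "grep", e.g. the grep process itself.
        if needle in command and "grep" not in command:
            matches.append((int(pid), start, command))
    return matches


def list_running_daemons():
    """Run `ps aux` on this machine and return the matching daemon rows."""
    output = subprocess.run(
        ["ps", "aux"], capture_output=True, text=True, check=True
    ).stdout
    return find_daemon_processes(output)
```

More than one row coming back from `list_running_daemons()` means more than one daemon process on that machine, and the START column shows which one is the older leftover.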
I do see two daemon PIDs. One of them shows it was started on Feb 10th, despite my killing all terminal instances. I can, of course, kill that older one. Any idea why I only began getting these errors once I implemented the configured executor, though?
I can't think of a connection there - very likely not related to the executor specifically
Really strange. I'll kill it and will report back here if a "ghost" daemon appears again. Thanks, Daniel
