Iman Encarnacion

10/31/2020, 3:17 PM
Hi! Does anyone know how to properly fix this problem without deleting existing runs from the dagster db?
Note: You can turn off any of following running schedules, but you cannot turn them back on.

daniel

10/31/2020, 4:05 PM
hi iman - you shouldn't need to delete any runs to deal with this. What you need to do is press the reconcile button in the UI, which should remove those out-of-date schedules (they are usually there because they were set up before your code updated and no longer point to the right place) and give you new, correct schedules to turn on. (We're working on making this part of the system clearer.)

Iman Encarnacion

10/31/2020, 4:11 PM
hi @daniel, we’re not seeing a reconcile button:
we also did a schedule up before this, and the odd thing is, we can actually start those pipelines through the CLI, and it works. could this be a bug?

daniel

10/31/2020, 4:14 PM
What command did you run to start dagit?
And are you confident that those schedules are still in the repository that was loaded by dagit?
and lastly just to double-check - the problem is still there if you stop and restart dagit?

Iman Encarnacion

10/31/2020, 4:17 PM
@daniel we used systemd to start dagit:
"And are you confident that those schedules are still in the repository that was loaded by dagit?"
yep, no changes on the code aside from the schedule
"and lastly just to double-check - the problem is still there if you stop and restart dagit?"
yes, in fact the problem is there even after deleting the runs, schedules, history, and storage folders.

daniel

10/31/2020, 4:19 PM
"no changes on the code aside from the schedule" - what changes did you make to the schedule?

Iman Encarnacion

10/31/2020, 4:20 PM
@daniel we changed the schedule from an invalid schedule:
* 0/6 * * *
to
* * * * *

daniel

10/31/2020, 4:20 PM
Could you try running "dagster schedule wipe", restart dagit, and see if it still isn't picking up any schedules?

Iman Encarnacion

10/31/2020, 4:21 PM
@daniel yes we also tried wipe. to clarify, it’s picking up the schedules, and it runs according to schedule, but the UI is giving this warning

daniel

10/31/2020, 4:22 PM
Hmmm, that's very strange
before you changed the cron string, it was showing up correctly in the UI?

Iman Encarnacion

10/31/2020, 4:24 PM
we’re not so sure about that since it’s been a while, but to summarize so far:
• UI is giving the unloadable warning
• We can start/stop the schedules from CLI, but not from web (because of warning)
• The schedule is working correctly
• Wipe / Restarts / Deleting all state folders and databases doesn’t remove the UI error

daniel

10/31/2020, 4:27 PM
Hmmm, that last bullet is the most surprising part. Would you mind pasting the contents of your dagster.yaml?

Iman Encarnacion

10/31/2020, 4:28 PM
@daniel here it is:
scheduler:
  module: dagster_cron.cron_scheduler
  class: SystemCronScheduler

telemetry:
  enabled: false

daniel

10/31/2020, 4:30 PM
hmmm I am really struggling to come up with a reason that error would be there after wiping and then restarting dagit. Just to triple-check, you're sure that you did that in that order? wipe and then restart?
the error should only appear if there are rows in the schedules database (that gets wiped out by the wipe command)
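
One way to check whether any rows are left after a wipe is to count the rows in each table of the schedules database. The sketch below assumes the instance uses the default SQLite-backed storage (the dagster.yaml in this thread does not configure any other storage); the function name is hypothetical, and the path to the database file has to be supplied by whoever runs it, since its exact location is not given in this thread.

```python
import sqlite3
from pathlib import Path


def table_row_counts(db_path):
    """Return {table_name: row_count} for every user table in a SQLite file.

    db_path is whatever SQLite file you want to inspect; the location of the
    schedules database is an assumption you need to confirm for your instance.
    """
    # Open read-only so that this check cannot modify the database.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
            )
        ]
        # Quote the table name because it is interpolated into the query.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()
```

If every table reports zero rows and the warning is still shown after a restart, that would suggest dagit is reading its state from somewhere else.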

Iman Encarnacion

10/31/2020, 4:39 PM
@daniel yes, that’s correct. we did wipe multiple times, and start/stop also. In fact, we already deleted the schedules folder at some point, which I think also deletes the schedules db entirely

daniel

10/31/2020, 4:41 PM
got it. if your schedules folder is deleted, you're sure that's the schedules folder that your dagit is using, and restarting dagit still shows the error, then I'll need to check with some other folks on the team
when you say 'and start/stop also', just confirming we're talking about the same thing: this is referring to shutting down the dagit process and then restarting it, right? you mentioned systemd so I wasn't sure. for the changes I'm talking about to take effect, we would need to completely stop dagit and then restart the process via the command line (or systemd)

Iman Encarnacion

10/31/2020, 4:46 PM
@daniel start/stop is dagit indeed. for dagit systemd, we did the usual systemctl stop and start — does that work?

daniel

10/31/2020, 4:47 PM
I haven't used systemctl much myself, but I would certainly think so
you could sanity check by making sure that the dagit page doesn't load when it's stopped
i'm just questioning all assumptions here since the wipe really should have worked
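
The sanity check above (the page should not load while dagit is stopped) can also be done without a browser by testing whether anything still accepts TCP connections on dagit's port. This is a minimal sketch using only the standard library; the function name is hypothetical, and the host and port are assumptions to replace with whatever your systemd unit actually passes to dagit.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (e.g. connection refused, timeout)
        # when nothing is listening or the host is unreachable.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example (assumed host and port, adjust to your setup):
#   is_port_open("127.0.0.1", 3000)
```

If this still returns True after the service has been stopped, some dagit process (or something else on that port) is still running, and the restart did not do what was expected.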

Iman Encarnacion

10/31/2020, 4:53 PM
@daniel we can do one final test, want to clarify that executing these should work:
• dagster schedule wipe
• dagit restart
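
The two steps above can be written down as a small script so that the test is run exactly the same way each time. This is only a sketch: the systemd unit name dagit.service and both function names are hypothetical, and it assumes the wipe runs with the same DAGSTER_HOME as the dagit service. The wipe command may ask for confirmation, so the non-dry-run mode is meant to be run interactively.

```python
import subprocess


def wipe_and_restart_commands(unit="dagit.service"):
    """Build the command sequence: wipe schedules, then stop and start dagit.

    The unit name is an assumption; use the name of your own systemd unit.
    """
    return [
        ["dagster", "schedule", "wipe"],
        ["systemctl", "stop", unit],
        ["systemctl", "start", unit],
    ]


def run_all(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(cmd, check=True)


# Dry run by default: only prints what would be executed.
run_all(wipe_and_restart_commands())
```

Calling run_all(wipe_and_restart_commands(), dry_run=False) would actually execute the commands.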

daniel

10/31/2020, 4:56 PM
Yeah, if the error is still there after doing those two steps and nothing else, that’s surprising and very useful to know
Sorry for all the trouble here, I’m sure this will result in some good improvements to the system once we diagnose the problem

Iman Encarnacion

10/31/2020, 4:58 PM
@daniel no worries! thanks for all the help. if ever the error still persists, how can we help you replicate it? maybe just zip the dagster home and send over?

daniel

10/31/2020, 4:58 PM
That would be great, yeah

Iman Encarnacion

10/31/2020, 5:00 PM
@daniel will do, thanks!!