# announcements
t
Hey, we upgraded to `dagster 0.9.1` (and we will update to `0.9.2` very soon), but we've run into an issue with the k8s deployment of dagit at startup, related to the scheduler. We use the `SystemCronScheduler` (for now; we will migrate to the `K8sCronJob` soon), and here is our custom dagit startup command (injected via Helm):
```
command: [
            "/bin/bash",
            "-c",
            {{- if .Values.userDeployments.enabled }}
            "{{ template "dagster.dagit.scheduleUpCommand" $ }}"
            {{- else }}
            "service cron start; \
            /usr/local/bin/dagster schedule up --location 'companies_repository'; \
            /usr/local/bin/dagster schedule start --start-all --location 'companies_repository'; \
            dagit -h 0.0.0.0 -p 80"
            {{- end }}
          ]
```
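When `userDeployments` is disabled, that template renders to a single shell string; expanded for readability, the container effectively runs:
```
# Start cron for SystemCronScheduler, reconcile schedules, start them all, then launch dagit.
service cron start
/usr/local/bin/dagster schedule up --location 'companies_repository'
/usr/local/bin/dagster schedule start --start-all --location 'companies_repository'
dagit -h 0.0.0.0 -p 80
```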
And here are the logs from the container in the k8s ReplicaSet:
```
Starting periodic command scheduler: cron.
Errors Resolved:
Schedule company_update_pipeline_schedule is set to be running, but the scheduler is not running the schedule.
Schedule company_init_pipeline_schedule is set to be running, but the scheduler is not running the schedule.
Traceback (most recent call last):
  File "/usr/local/bin/dagster", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/dagster/cli/__init__.py", line 38, in main
    cli(obj={})  # pylint:disable=E1123
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/dagster/cli/schedule.py", line 255, in schedule_start_command
    return execute_start_command(schedule_name, start_all, kwargs, click.echo)
  File "/usr/local/lib/python3.7/site-packages/dagster/cli/schedule.py", line 268, in execute_start_command
    instance.start_schedule_and_update_storage_state(external_schedule)
  File "/usr/local/lib/python3.7/site-packages/dagster/core/instance/__init__.py", line 983, in start_schedule_and_update_storage_state
    return self._scheduler.start_schedule_and_update_storage_state(self, external_schedule)
  File "/usr/local/lib/python3.7/site-packages/dagster/core/scheduler/scheduler.py", line 271, in start_schedule_and_update_storage_state
    name=external_schedule.name
dagster.core.scheduler.scheduler.DagsterSchedulerError: You have attempted to start schedule company_init_pipeline_schedule, but it is already running
Loading repository...
Serving on http://0.0.0.0:80 in process 107

  Telemetry:

  As an open source project, we collect usage statistics to inform development priorities. For more
  information, read https://docs.dagster.io/install/telemetry.

  We will not see or store solid definitions, pipeline definitions, modes, resources, context, or
  any data that is processed within solids and pipelines.

  To opt-out, add the following to $DAGSTER_HOME/dagster.yaml, creating that file if necessary:

    telemetry:
      enabled: false


  Welcome to Dagster!

  If you have any questions or would like to engage with the Dagster team, please join us on Slack
  (https://bit.ly/39dvSsF).
```
The problem is that this triggers a crash loop of the container, so dagit never actually comes up on k8s. Does anyone have an idea about this? I think I will remove our custom `schedule up` and `schedule start` commands, but do I then have to run them manually?
```
/usr/local/bin/dagster schedule up --location 'companies_repository'; \
/usr/local/bin/dagster schedule start --start-all --location 'companies_repository'; \
```
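For reference, if those two lines were dropped from the startup command, they could still be run by hand against the live container; a rough sketch (the pod name and label are placeholders, not taken from our chart):
```
# Check whether the dagit pod is crash-looping and inspect the previous run's logs.
kubectl get pods -l app=dagit
kubectl logs <dagit-pod-name> --previous

# Run the schedule reconciliation manually inside the running dagit container.
kubectl exec -it <dagit-pod-name> -- /usr/local/bin/dagster schedule up --location 'companies_repository'
```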
Thanks in advance for your help; if you need any more info or debug logs, let me know.
Added some schedule debug info:
```
# dagster schedule debug
Scheduler Configuration
=======================
Scheduler:
     module: dagster_cron.cron_scheduler
     class: SystemCronScheduler
     config:
       {}


Scheduler Info
==============
Running Cron Jobs:
0 1 * * * /opt/dagster/dagster_home/schedules/scripts/ebd2d09d4e4f0b995dfdf904538d81dbadd910b9.sh > /opt/dagster/dagster_home/schedules/logs/ebd2d09d4e4f0b995dfdf904538d81dbadd910b9/scheduler.log 2>&1 # dagster-schedule: ebd2d09d4e4f0b995dfdf904538d81dbadd910b9
0 9 * * * /opt/dagster/dagster_home/schedules/scripts/691fc74250cbff690d83461f61335db90f1ed7f7.sh > /opt/dagster/dagster_home/schedules/logs/691fc74250cbff690d83461f61335db90f1ed7f7/scheduler.log 2>&1 # dagster-schedule: 691fc74250cbff690d83461f61335db90f1ed7f7


Scheduler Storage Info
======================
company_init_pipeline_schedule:
  cron_schedule: 0 1 * * *
  pipeline_origin_id: ebd2d09d4e4f0b995dfdf904538d81dbadd910b9
  python_path: /usr/local/bin/python
  repository_origin_id: 22c214dc9a43303ac1e73076f59f8a054b929929
  repository_pointer: -m cnty_pipeline.repository -a companies_repository
  schedule_origin_id: ebd2d09d4e4f0b995dfdf904538d81dbadd910b9
  status: RUNNING

company_update_pipeline_schedule:
  cron_schedule: 0 9 * * *
  pipeline_origin_id: 691fc74250cbff690d83461f61335db90f1ed7f7
  python_path: /usr/local/bin/python
  repository_origin_id: 22c214dc9a43303ac1e73076f59f8a054b929929
  repository_pointer: -m cnty_pipeline.repository -a companies_repository
  schedule_origin_id: 691fc74250cbff690d83461f61335db90f1ed7f7
  status: RUNNING
```
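The stored `RUNNING` status above is what dagit compares against on startup. A hedged sketch of CLI commands for inspecting or resetting that state (subcommands and flags assumed from the 0.9-era `dagster schedule` CLI):
```
# List schedules and their stored status for the location.
dagster schedule list --location 'companies_repository'

# Stop a single schedule whose stored state is stuck at RUNNING.
dagster schedule stop company_init_pipeline_schedule --location 'companies_repository'

# Last resort: wipe all schedule state and remove the associated cron jobs.
dagster schedule wipe
```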
It seems I'm missing something about how I load scheduled pipelines. Here is my `workspace.yaml` (it hasn't changed since dagster 0.8):
```
load_from:
  - python_package:
      package_name: cnty_pipeline.repository
      attribute: example_repository
  - python_package:
      package_name: cnty_pipeline.repository
      attribute: activities_repository
  - python_package:
      package_name: cnty_pipeline.repository
      attribute: companies_repository
```
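As a sanity check that the target the schedules were created against still loads (the `repository_pointer` in the debug output is `-m cnty_pipeline.repository -a companies_repository`), something like the following should list the pipelines; the `-m`/`-a` target flags are assumed from the 0.9-era CLI:
```
# Confirm the repository loads from the same module/attribute the schedules reference.
dagster pipeline list -m cnty_pipeline.repository -a companies_repository
```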
j
@sashank
c
Hmm, so we throw the error that you're seeing when a user tries to start a schedule that is already running, so removing the `schedule start` command should fix it.
I think this could be caused by the schedule state being set to running in the DB, dagit being terminated without the schedule state being updated to stopped in the DB, and then dagit being redeployed, which would then think the schedule is still running and also be unable to start the schedule.
I think using the local file system for storage should resolve this. @sashank feel free to chime in here.
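A quick way to see the two halves of that state: the cron entries live only inside the dagit container, while the running/stopped status lives in the instance storage and survives redeploys. A minimal check from a shell inside the container (plain cron tooling, nothing Dagster-specific assumed):
```
# List the cron entries SystemCronScheduler installed; these vanish when the
# container is recreated, unlike the schedule status stored in the database.
crontab -l | grep 'dagster-schedule'
```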
s
The state that stores whether a schedule should be running or not is persistent across deploys. If your schedule was previously set to be running, `dagster schedule up` will make sure to put schedules that are set to be running back on the `Scheduler`. If you look at your debug output, you can first see the output of `dagster schedule up`, which is putting the running schedules back on the `Scheduler`. But then you're running `dagster schedule start --start-all`, which causes an error because the schedules are already started.
All you need to do is take out the `dagster schedule start --start-all`.
🙏 1
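For completeness, a sketch of the non-userDeployments startup command with only that line removed (everything else as in the original snippet):
```
# `dagster schedule up` already reconciles schedules marked as running,
# so the explicit `schedule start --start-all` is dropped.
service cron start
/usr/local/bin/dagster schedule up --location 'companies_repository'
dagit -h 0.0.0.0 -p 80
```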
t
👌 thanks a lot, very clear!
Here we go 👌
🎉 1