Titouan
08/14/2020, 9:06 AM
We are on dagster 0.9.1 (and we will update very soon to 0.9.2), but we encounter an issue with the k8s deployment of dagit at startup with the Scheduler. We use the SystemCronScheduler (so far, but we will migrate to the K8sCronJob soon), and here is our custom dagit startup command (injected in helm):
command: [
"/bin/bash",
"-c",
{{- if .Values.userDeployments.enabled }}
"{{ template "dagster.dagit.scheduleUpCommand" $ }}"
{{- else }}
"service cron start; \
/usr/local/bin/dagster schedule up --location 'companies_repository'; \
/usr/local/bin/dagster schedule start --start-all --location 'companies_repository'; \
dagit -h 0.0.0.0 -p 80"
{{- end }}
]
And here are the logs from the container on the k8s replicaset:
Starting periodic command scheduler: cron.
Errors Resolved:
Schedule company_update_pipeline_schedule is set to be running, but the scheduler is not running the schedule.
Schedule company_init_pipeline_schedule is set to be running, but the scheduler is not running the schedule.
Traceback (most recent call last):
File "/usr/local/bin/dagster", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.7/site-packages/dagster/cli/__init__.py", line 38, in main
cli(obj={}) # pylint:disable=E1123
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/dagster/cli/schedule.py", line 255, in schedule_start_command
return execute_start_command(schedule_name, start_all, kwargs, click.echo)
File "/usr/local/lib/python3.7/site-packages/dagster/cli/schedule.py", line 268, in execute_start_command
instance.start_schedule_and_update_storage_state(external_schedule)
File "/usr/local/lib/python3.7/site-packages/dagster/core/instance/__init__.py", line 983, in start_schedule_and_update_storage_state
return self._scheduler.start_schedule_and_update_storage_state(self, external_schedule)
File "/usr/local/lib/python3.7/site-packages/dagster/core/scheduler/scheduler.py", line 271, in start_schedule_and_update_storage_state
name=external_schedule.name
dagster.core.scheduler.scheduler.DagsterSchedulerError: You have attempted to start schedule company_init_pipeline_schedule, but it is already running
Loading repository...
Serving on <http://0.0.0.0:80> in process 107
Telemetry:
As an open source project, we collect usage statistics to inform development priorities. For more
information, read <https://docs.dagster.io/install/telemetry>.
We will not see or store solid definitions, pipeline definitions, modes, resources, context, or
any data that is processed within solids and pipelines.
To opt-out, add the following to $DAGSTER_HOME/dagster.yaml, creating that file if necessary:
telemetry:
enabled: false
Welcome to Dagster!
If you have any questions or would like to engage with the Dagster team, please join us on Slack
(<https://bit.ly/39dvSsF>).
The problem is that this triggers a crash loop of the container, and dagit is never really up on k8s.
Does anyone have an idea about this? I think I will remove our custom schedule up and schedule start commands, but would I then have to run these commands manually?
/usr/local/bin/dagster schedule up --location 'companies_repository'; \
/usr/local/bin/dagster schedule start --start-all --location 'companies_repository'; \
Thanks in advance for your help. If you need any info or debug logs, let me know.
# dagster schedule debug
Scheduler Configuration
=======================
Scheduler:
module: dagster_cron.cron_scheduler
class: SystemCronScheduler
config:
{}
Scheduler Info
==============
Running Cron Jobs:
0 1 * * * /opt/dagster/dagster_home/schedules/scripts/ebd2d09d4e4f0b995dfdf904538d81dbadd910b9.sh > /opt/dagster/dagster_home/schedules/logs/ebd2d09d4e4f0b995dfdf904538d81dbadd910b9/scheduler.log 2>&1 # dagster-schedule: ebd2d09d4e4f0b995dfdf904538d81dbadd910b9
0 9 * * * /opt/dagster/dagster_home/schedules/scripts/691fc74250cbff690d83461f61335db90f1ed7f7.sh > /opt/dagster/dagster_home/schedules/logs/691fc74250cbff690d83461f61335db90f1ed7f7/scheduler.log 2>&1 # dagster-schedule: 691fc74250cbff690d83461f61335db90f1ed7f7
Scheduler Storage Info
======================
company_init_pipeline_schedule:
cron_schedule: 0 1 * * *
pipeline_origin_id: ebd2d09d4e4f0b995dfdf904538d81dbadd910b9
python_path: /usr/local/bin/python
repository_origin_id: 22c214dc9a43303ac1e73076f59f8a054b929929
repository_pointer: -m cnty_pipeline.repository -a companies_repository
schedule_origin_id: ebd2d09d4e4f0b995dfdf904538d81dbadd910b9
status: RUNNING
company_update_pipeline_schedule:
cron_schedule: 0 9 * * *
pipeline_origin_id: 691fc74250cbff690d83461f61335db90f1ed7f7
python_path: /usr/local/bin/python
repository_origin_id: 22c214dc9a43303ac1e73076f59f8a054b929929
repository_pointer: -m cnty_pipeline.repository -a companies_repository
schedule_origin_id: 691fc74250cbff690d83461f61335db90f1ed7f7
status: RUNNING
workspace.yaml (has not changed since dagster 0.8):
load_from:
  - python_package:
      package_name: cnty_pipeline.repository
      attribute: example_repository
  - python_package:
      package_name: cnty_pipeline.repository
      attribute: activities_repository
  - python_package:
      package_name: cnty_pipeline.repository
      attribute: companies_repository
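For anyone reading the `dagster schedule debug` output above, here is a minimal sketch of a helper that lists each schedule with the status recorded in schedule storage. The function name `list_schedule_status` is hypothetical, and the parsing assumes the "Scheduler Storage Info" layout shown in this thread (a `name:` line followed by `key: value` lines); other dagster versions may print it differently. The sample input is copied from the output above so the snippet runs without dagster installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<schedule name> <status>" for each schedule
# found in the "Scheduler Storage Info" section of `dagster schedule debug`.
list_schedule_status() {
  awk '
    # Only look at lines after the storage section header.
    /Scheduler Storage Info/ { in_storage = 1; next }
    # A line holding a single "name:" token starts a new schedule entry.
    in_storage && NF == 1 && $1 ~ /:$/ {
      name = substr($1, 1, length($1) - 1)
      next
    }
    # The status line belongs to the most recent schedule name.
    in_storage && $1 == "status:" { print name, $2 }
  '
}

# Sample input taken from the debug output in this thread (abridged).
list_schedule_status <<'EOF'
Scheduler Storage Info
======================
company_init_pipeline_schedule:
cron_schedule: 0 1 * * *
status: RUNNING
company_update_pipeline_schedule:
cron_schedule: 0 9 * * *
status: RUNNING
EOF
```

On a real deployment the same function could be fed directly, e.g. `dagster schedule debug | list_schedule_status`, to check which schedules storage already considers running before deciding whether any start command is needed.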
johann
08/14/2020, 1:57 PM
cat
08/14/2020, 3:07 PM
Removing the schedule start command should fix it.
sashank
08/14/2020, 4:07 PM
dagster schedule up will make sure to put schedules set to be running back on the Scheduler.
If you look at your debug output, you can first see the output of dagster schedule up, which is putting running schedules back on the Scheduler.
But then you’re running dagster schedule start --start-all, which is causing an error because the schedules are already started.
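Following the explanation above, a startup sequence without the `schedule start` step might look like the sketch below. It only reuses the commands already present in the original helm snippet. The `run` wrapper, the `startup` function, and the `DRY_RUN` and `LOCATION` variables are illustrative assumptions, not part of dagster or the helm chart; by default the script only prints what it would run, so it can be checked outside the container.

```shell
#!/usr/bin/env bash
# Sketch of a dagit startup sequence that relies on `dagster schedule up` alone.
# LOCATION and DRY_RUN are hypothetical knobs added for illustration.
LOCATION="${LOCATION:-companies_repository}"

run() {
  # Always print the command; execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

startup() {
  run service cron start
  # Per the thread, `schedule up` puts schedules that storage marks as
  # running back on the scheduler after the container restarts.
  run /usr/local/bin/dagster schedule up --location "$LOCATION"
  # No `dagster schedule start --start-all` here: in the logs above it
  # raised DagsterSchedulerError because the schedules were already running.
  run dagit -h 0.0.0.0 -p 80
}

startup
```

Setting `DRY_RUN=0` inside the container would execute the three commands in order. A schedule that has never been started would still need a one-time start (from dagit or the CLI), since this sketch only restores schedules that are already marked as running.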
Titouan
08/17/2020, 7:02 AM