# announcements
t
Hi - I can't get the scheduler to work, and I don't know why. I'm running Dagit inside a container (on a Windows host). The container has cron installed. The pipelines execute fine if I trigger them in the Dagit UI, but the schedule just doesn't activate. More details 👉
Output of `dagster schedule debug`:
```
# dagster schedule debug
Scheduler Configuration
=======================
Scheduler:
     module: dagster_cron.cron_scheduler
     class: SystemCronScheduler
     config:
       {}


Scheduler Info
==============
Running Cron Jobs:
30 7 * * * /opt/dagster/dagster_home/schedules/scripts/c2d87ddff6867c6180a2bf164218586ffbb64c0b.sh > /opt/dagster/dagster_home/schedules/logs/c2d87ddff6867c6180a2bf164218586ffbb64c0b/scheduler.log 2>&1 # dagster-schedule: c2d87ddff6867c6180a2bf164218586ffbb64c0b
5 11 * * * /opt/dagster/dagster_home/schedules/scripts/2d84fba9c7a01dae736f3ca8ec65c4625b2d5608.sh > /opt/dagster/dagster_home/schedules/logs/2d84fba9c7a01dae736f3ca8ec65c4625b2d5608/scheduler.log 2>&1 # dagster-schedule: 2d84fba9c7a01dae736f3ca8ec65c4625b2d5608


Scheduler Storage Info
======================
my_first_pipeline:
  cron_schedule: 30 7 * * *
  pipeline_origin_id: c2d87ddff6867c6180a2bf164218586ffbb64c0b
  python_path: /usr/local/bin/python
  repository_origin_id: 8651c1dcac6632fd200ddfc60b20f7e7bee30fc6
  repository_pointer: -f /opt/dagster/app/pipelines.py -a analytics -d /opt/dagster/app
  schedule_origin_id: c2d87ddff6867c6180a2bf164218586ffbb64c0b
  status: RUNNING

my_second_pipeline:
  cron_schedule: 5 11 * * *
  pipeline_origin_id: 2d84fba9c7a01dae736f3ca8ec65c4625b2d5608
  python_path: /usr/local/bin/python
  repository_origin_id: 8651c1dcac6632fd200ddfc60b20f7e7bee30fc6
  repository_pointer: -f /opt/dagster/app/pipelines.py -a analytics -d /opt/dagster/app
  schedule_origin_id: 2d84fba9c7a01dae736f3ca8ec65c4625b2d5608
  status: RUNNING
```
s
Try running `dagster schedule logs <schedule_name>`
t
Thanks for the quick reply! It points me to a file that appears to be empty:
```
# dagster schedule logs my_second_schedule
/opt/dagster/dagster_home/schedules/logs/2d84fba9c7a01dae736f3ca8ec65c4625b2d5608/scheduler.log
# cat /opt/dagster/dagster_home/schedules/logs/2d84fba9c7a01dae736f3ca8ec65c4625b2d5608/scheduler.log
#
```
s
Hm, if there were any errors executing the schedule, they would show up there
Some things to check:
- Make sure cron is actually running on your machine. You can quickly test that by adding a custom cron job that writes to a file every minute (see the sketch below)
- If you see that works, try running that cron job command yourself and see if you get anything:
`/opt/dagster/dagster_home/schedules/scripts/c2d87ddff6867c6180a2bf164218586ffbb64c0b.sh`
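For example, a minimal heartbeat entry you could add with `crontab -e` to confirm cron is alive (the log path is arbitrary, purely illustrative):
```
# runs every minute; if /tmp/cron_heartbeat.log never grows,
# cron itself isn't running inside the container
* * * * * date >> /tmp/cron_heartbeat.log 2>&1
```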
t
ohhhh, I just figured it out. Not a cron issue
s
Nice! What was it?
t
when viewing the schedule config in the Dagit UI, I'm getting:
```
Operation name: FetchScheduleYaml

Message: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "failed to connect to all addresses"
	debug_error_string = "{"created":"@1605387068.582981700","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":4165,"referenced_errors":[{"created":"@1605387068.582973300","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":397,"grpc_status":14}]}"
>
```
I'm trying to read from a yaml file containing secrets, so I set the yaml file's path as an ENV var, but the schedule needs that ENV var set separately, if I understand correctly
s
Yes, there’s an env vars argument on the schedule definition that you need to set
But you’re getting this error in Dagit? This looks like a gRPC issue that’s not related to env vars
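For reference, a minimal sketch of what that can look like, assuming the 0.9-era `ScheduleDefinition` API with its `environment_vars` argument; the schedule name, cron string, and pipeline name are taken from the debug output above, and the run config placeholder stands in for however you build yours:
```python
import os

from dagster import ScheduleDefinition

my_second_schedule = ScheduleDefinition(
    name="my_second_schedule",
    cron_schedule="5 11 * * *",
    pipeline_name="my_second_pipeline",
    run_config={},  # placeholder: e.g. the merged-YAML config from repo.py
    # forwarded to the schedule's execution environment, which otherwise
    # starts with a nearly empty set of env vars under cron
    environment_vars={
        "DBT_CLOUD_CONFIG_FILE": os.environ["DBT_CLOUD_CONFIG_FILE"],
        "REDSHIFT_CONFIG_FILE": os.environ["REDSHIFT_CONFIG_FILE"],
    },
)
```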
t
yeah, if on the /schedules page I click the down arrow in the Execution Params field, I get the error in Dagit. I think it's because I'm setting `run_config` by reading it in from YAML files:
```python
# imports assumed for this snippet; in 0.9-era Dagster these helpers
# live under dagster.utils / dagster.utils.yaml_utils
from dagster.utils import file_relative_path
from dagster.utils.yaml_utils import merge_yamls

run_config = merge_yamls([
    file_relative_path(__file__, "config/urt_prod.yaml"),
    DBT_CLOUD_CONFIG_FILE,
    REDSHIFT_CONFIG_FILE,
])
```
where those CONFIG_FILE paths are set elsewhere in `repo.py` as:
```python
import os

# paths to the secret config files that docker-compose copies in
DBT_CLOUD_CONFIG_FILE = os.environ['DBT_CLOUD_CONFIG_FILE']
REDSHIFT_CONFIG_FILE = os.environ['REDSHIFT_CONFIG_FILE']
```
which I'm doing because I'm using docker-compose to copy those config files in as secrets, and I thought it would be handy to store the paths as ENV vars at the same time
This works great for presets, but I guess not for schedules, since they execute in a different environment?
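One way to make that failure mode obvious: when the compose-provided variables are absent (as under cron), the bare `os.environ[...]` lookups raise `KeyError` while the repository file is being loaded. A defensive sketch, with a hypothetical `required_env` helper:
```python
import os

def required_env(name):
    # hypothetical helper: fail with a clear message when a variable that
    # docker-compose normally injects is missing from this environment
    try:
        return os.environ[name]
    except KeyError:
        raise RuntimeError(
            f"{name} is not set; schedules may run outside "
            "the docker-compose environment"
        )

DBT_CLOUD_CONFIG_FILE = required_env("DBT_CLOUD_CONFIG_FILE")
REDSHIFT_CONFIG_FILE = required_env("REDSHIFT_CONFIG_FILE")
```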
s
Hm, when you’re previewing the schedule config in Dagit, it’s evaluated in the same environment as your presets
Can you show us your schedule definition?
Btw we have a new scheduler coming out in around 3 weeks that greatly improves both the operational experience and this interaction with env variables
t
Thanks again for your help, @sashank. Ok if I DM you?
s
Yup, I can make a DM thread 🙂