Hey guys, good morning! I wanted to ask for help with Dagster's scheduler because I am really confused. Say I have an example like the one in the getting started guide:
from dagster import job, op

@op
def get_name():
    return "dagster"

@op
def hello(name: str):
    print(f"Hello, {name}!")

@job
def hello_dagster():
    hello(get_name())
If I want to run hello_dagster() every minute, what should I do? I've tried:
from dagster import ScheduleDefinition
basic_schedule = ScheduleDefinition(job=hello_dagster, cron_schedule="* * * * *")
and then created a repo and ran dagster-daemon. Would love to learn how to do this properly. Thank you!
yep, that looks and sounds right to me! the schedule is a separate entity from the job, so you have to build it separately.
So... do I need to run both dagster and dagster-daemon, or how does it work?
ah yes. if you just run the dagit webserver -- e.g.
dagit -f my_file.py
-- you'll see warning flags on the "schedules" and "sensors" parts of the UI. so you also have to run the dagster-daemon. It's kind of a pain because you also have to set up a file for local development. I would recommend getting this `docker compose` example running locally, then editing the file to play around with the scheduling stuff.
but basically that dagster-daemon is going to run the process that checks whether anything needs to be scheduled every X seconds or so.
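To make the "checks every X seconds" idea concrete, here is a toy sketch in plain Python. It is not Dagster's actual implementation: `ToySchedule`, `tick`, and `run_forever` are made-up names, and it uses fixed intervals instead of cron strings to stay short.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ToySchedule:
    name: str
    interval_seconds: float
    launch: Callable[[], None]  # what to run when the schedule is due


def tick(schedules: List[ToySchedule], last_run: Dict[str, float], now: float) -> List[str]:
    """Launch every schedule whose interval has elapsed; return the launched names."""
    launched = []
    for schedule in schedules:
        previous = last_run.get(schedule.name)
        # A schedule that has never run is treated as due right away.
        if previous is None or now - previous >= schedule.interval_seconds:
            schedule.launch()
            last_run[schedule.name] = now
            launched.append(schedule.name)
    return launched


def run_forever(schedules: List[ToySchedule], poll_seconds: float = 5.0) -> None:
    """The long-running part: wake up, check what is due, go back to sleep."""
    last_run: Dict[str, float] = {}
    while True:
        tick(schedules, last_run, now=time.time())
        time.sleep(poll_seconds)
```

The point of the sketch is that nothing fires unless some long-running process keeps calling `tick`, which is why the schedule does nothing when only the webserver is up.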
I see... Thank you, Stephen! So dagster-daemon should always be running in parallel with the Dagster UI (dagit) or the dagster CLI?
yes, in your main deployment you'll need a postgres DB, dagit, the daemon, and your user code deployment. you can see these services in the compose file here.
what is postgres being used for? Storing the runs and logs or something?
Interesting. Thank you for explaining. What if I have a file that runs a real-time process, such as a data feed, and then I need Dagster to run another process on a schedule? How should I go about that?
it sounds like your real-time data process might be a sensor, which is also run by the daemon. the way i think of it -- might not be 100% right -- is this:
1. write code
2. dagit reads code and parses objects from repos -- jobs, ops, schedules, sensors
3. daemon takes any parsed schedules and sensors and starts evaluating them every N seconds
