Here is my code incase that i am following ```import csv fro dagster #announcements

Hamza Khurshid Butt

02/23/2021, 1:17 PM

Here is the code I am following, in case it helps:

import csv
from datetime import datetime, time

from dagster import daily_schedule, pipeline, repository, solid
from dagster.utils import file_relative_path


@solid
def hello_cereal(context, date):
    dataset_path = file_relative_path(__file__, "cereal.csv")
    context.log.info(dataset_path)
    with open(dataset_path, "r") as fd:
        cereals = [row for row in csv.DictReader(fd)]

    context.log.info(
        "Today is {date}. Found {n_cereals} cereals".format(
            date=date, n_cereals=len(cereals)
        )
    )


@daily_schedule(
    pipeline_name="hello_cereal_pipeline",
    start_date=datetime(2021, 2, 24),
    execution_time=time(6, 45)
)
def good_morning_schedule(date):
    return {
        "solids": {
            "hello_cereal": {
                "inputs": {"date": {"value": date.strftime("%Y-%m-%d")}}
            }
        }
    }


@pipeline
def hello_cereal_pipeline():
    hello_cereal()


@repository
def hello_cereal_repository():
    return [hello_cereal_pipeline, good_morning_schedule]

daniel

02/23/2021, 1:26 PM

Hi Hamza - it looks like your start date is set for tomorrow. Try setting it to something earlier and I bet your schedule will run. Note that the default behavior is for the schedule corresponding to a given day to execute on the next day (i.e. if you set your start date to 02/24, the first run will be on 02/25, using 02/24 as the partition for your schedule). So to have it run tomorrow (02/24), set the start date to 02/23.

Hamza Khurshid Butt

02/23/2021, 1:51 PM

@daniel yup, it worked now 👍🙂 But can you explain this behaviour a bit more? Today's date is 2/23: when I gave a start date of 2/22 it worked, but with a start date of 2/23 it does not.

daniel

02/23/2021, 2:31 PM

By default, the partition used for the run is one partition earlier than the partition that includes the current time. This captures a common ETL use case: a daily schedule fills in the previous day's partition, and a monthly schedule fills in last month's partition. So the schedule run that happens on 2/23 fills in 2/22, which means a start date of 2/23 has no partition to run until 2/24. (In the most recent release, we added the ability to customize this using the 'partition_days_offset' parameter on the schedule. You can set this to 0 if you want it to run the 2/23 partition on 2/23.)
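
To make the date arithmetic concrete, here is a rough sketch in plain Python. It only models the behaviour described above and is not Dagster's actual implementation: the helper names `partition_for_run` and `first_run_day` are made up for illustration, and the `partition_days_offset` argument just mirrors the schedule parameter of the same name, assuming a default of 1.

```python
from datetime import date, timedelta


def partition_for_run(run_day: date, partition_days_offset: int = 1) -> date:
    """Partition date that a daily schedule run on run_day would fill in."""
    return run_day - timedelta(days=partition_days_offset)


def first_run_day(start_date: date, partition_days_offset: int = 1) -> date:
    """First day a run happens: the day whose partition equals start_date."""
    return start_date + timedelta(days=partition_days_offset)


# A run on 2/23 fills in the 2/22 partition by default.
print(partition_for_run(date(2021, 2, 23)))  # 2021-02-22

# With start_date 2/24 the first run is on 2/25; with 2/23 it is on 2/24.
print(first_run_day(date(2021, 2, 24)))  # 2021-02-25
print(first_run_day(date(2021, 2, 23)))  # 2021-02-24

# With an offset of 0, the run on 2/23 fills in the 2/23 partition.
print(partition_for_run(date(2021, 2, 23), partition_days_offset=0))  # 2021-02-23
```

Under this model, a start date of 2/22 is the latest one that still produces a run on 2/23, which matches what was observed in this thread.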

Hamza Khurshid Butt

02/24/2021, 7:09 AM

Perfect, Thanks for the explanation
