#ask-community

Ben Kimpel

08/02/2023, 6:04 PM
hello! hoping someone can help out... our team has found scheduling partitioned assets somewhat confusing, but here's where we've ended up. We process financial data Monday through Friday; the data is generally available the same day, several hours after markets close, so we run at 4:30pm NY time. We've got the weekday partition with an end_offset of 1 to let us run that day's partition during the day itself. It feels a bit like we're off the beaten path because we've had to make our own partition def for M-F and we can't seem to get `build_schedule_from_partitioned_asset_job` to run at a particular time. We don't want to shift our definition of a partition start/end time to match when a file drops.
from dagster import asset, define_asset_job, schedule, TimeWindowPartitionsDefinition

weekday_partition = TimeWindowPartitionsDefinition(
    start="2022-01-01",
    fmt="%Y-%m-%d",
    timezone="America/New_York",
    cron_schedule="0 0 * * 1-5",
    end_offset=1,
)

@asset(partitions_def=weekday_partition)
def testing(context):
    return [1, 2, 3]

testing_job = define_asset_job("testing_job", partitions_def=weekday_partition, selection=[testing])

@schedule(
    "30 16 * * 1-5",
    execution_timezone="America/New_York",
    job=testing_job,
)
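def testing_schedule(context):
    # Build the key in the same YYYY-MM-DD format as the partition definition's fmt.
    # strftime works whether the scheduled time is a pendulum or a plain datetime.
    partition_key = context.scheduled_execution_time.strftime("%Y-%m-%d")
    return testing_job.run_request_for_partition(
        partition_key=partition_key,
        run_key=partition_key,
        current_time=context.scheduled_execution_time,
    )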
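(Editor's sketch) The date logic this schedule depends on can be illustrated without Dagster. The function name `weekday_partition_key` is hypothetical and not part of any library; it only shows that a 4:30pm New York tick on a weekday maps to that same day's key. Because that day's window (midnight to the next weekday midnight) has not closed yet at 4:30pm, the key would not exist without `end_offset=1`.

```python
from datetime import datetime
from typing import Optional
from zoneinfo import ZoneInfo

NEW_YORK = ZoneInfo("America/New_York")


def weekday_partition_key(scheduled_time: datetime) -> Optional[str]:
    """Return the YYYY-MM-DD key for a Mon-Fri partition, or None on weekends.

    Hypothetical helper for illustration only; it mirrors the key format
    ("%Y-%m-%d") used by the partition definition above.
    """
    local = scheduled_time.astimezone(NEW_YORK)
    if local.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return None
    return local.strftime("%Y-%m-%d")


# A Wednesday 4:30pm New York tick maps to that same day's partition key.
tick = datetime(2023, 8, 2, 16, 30, tzinfo=NEW_YORK)
print(weekday_partition_key(tick))
```

Converting to New York time before formatting matters if the tick arrives in another timezone, since a late-evening UTC timestamp can fall on a different calendar day.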

chris

08/02/2023, 6:51 PM
I’m having trouble interpreting what the actual question is here 😅 - looks like you got things working with the `@schedule` decorator - is there a problem with that approach for you?

Ben Kimpel

08/02/2023, 6:54 PM
there's a few things, yes. first, was just curious about best practices, because this feels incorrect. for example, we can't use `build_schedule_from_partitioned_asset_job` as noted. second, we have to repeat our partition definitions if we have to do it this way because the start_date is different. we have about 30 or so assets with different start dates. we could make a helper, but it still feels like we're fighting the system here for some reason. third, we're unsure why we need to pass a partition to define_asset_job at all since the asset is already partitioned and all partitions must match for it to run anyway?
there are two cron definitions as well
i hope it's at least clear that it's a little bit counterintuitive.
if it's obvious and i'm missing something i was just hoping someone could point me in the right direction
(@chris Sorry for not having a clearly stated question in initial post, premature send. The question was supposed to be... are we on the right track? Is this how Elementl would recommend writing this schedule given these constraints?)

chris

08/02/2023, 10:14 PM
Okay, for each of the points:
1. Why didn’t `build_schedule_from_partitioned_asset_job` work for you? You should, at least theoretically, be able to use the `hour_of_day` and `minute_of_hour` params in order to run the schedule at 4:30 every day.
2. In the case of having many different partitions definitions with different start dates, the best way is indeed to have a helper. I agree this is a bit clunky.
3. You also shouldn’t need to pass the partitions definition to `define_asset_job` - it’s an optional argument. Are you hitting an error if it’s not provided?
Sorry you’re running into these rough edges