#ask-community

Nicholas Pezolano

07/10/2023, 4:16 PM
Is there a way to get the previous scheduled partition key in an asset? If I use the code below in a scheduled asset, it returns a partition key even for a day on which the schedule's should_execute returned False:
@asset(partitions_def=partitions_def)
def my_asset(context):
    current_partition_window = context.partition_time_window
    # Window immediately before the current one, regardless of whether the schedule ran for it
    prev_window = partitions_def.get_prev_partition_window(current_partition_window.start)
    # Format the window start as a partition key string
    prev_partition_key = prev_window.start.strftime(partitions_def.fmt)
    return prev_partition_key

my_job = define_asset_job("my_job", selection=[my_asset], partitions_def=partitions_def)
e.g. if my partition is:
partitions_def = DailyPartitionsDefinition(start_date='2023-07-03', timezone='America/New_York', fmt='%Y-%m-%d', end_offset=1)
and my scheduled job is:
def skip_july_4th(context) -> bool:
    # Return False (skip the run) when the scheduled time falls on July 4th
    dt = context.scheduled_execution_time
    return '2023-07-04' not in str(dt)

@schedule(job=my_job, should_execute=skip_july_4th, cron_schedule="40 22 * * 1-5", execution_timezone='America/New_York')
def nyse_schedule(context):
    partition_key = partitions_def.get_partition_key_for_timestamp(
        context.scheduled_execution_time.timestamp()
    )
    request = my_job.run_request_for_partition(partition_key=partition_key)
    yield request
The example above returns a previous partition key of '2023-07-04' on '2023-07-05', even though the should_execute function in the schedule returns False for that date. How can I get the previous scheduled partition key in the asset?

chris

07/10/2023, 4:56 PM
This is an interesting case. At first, I was going to suggest using get_latest_materialization_event, but that doesn’t necessarily work if, for example, you try to backfill. I think you need to retrieve the asset materialization event with the highest partition, and I’m not sure if there’s an easy way to do that. Surfacing this to a discussion
@Dagster Bot discussion retrieving the latest time-based partition that was actually materialized for an asset

Dagster Bot

07/10/2023, 4:57 PM
Question in the thread has been surfaced to GitHub Discussions for future discoverability: https://github.com/dagster-io/dagster/discussions/15194

Nicholas Pezolano

07/10/2023, 5:04 PM
I think the GitHub discussion got edited to something different from what I wanted to do

chris

07/10/2023, 5:04 PM
that was me editing it; I tried to make it more generic / useful to others, but I think it basically boils down to what you are looking for

Nicholas Pezolano

07/10/2023, 5:05 PM
ah ok
Are custom partitions still unsupported? I know they were in the past but that could be a solution
In my case it's not cron I want to follow for the partition but a pre-defined calendar https://pandas-market-calendars.readthedocs.io/en/latest/index.html#quick-start

chris

07/10/2023, 5:25 PM
Custom partitions are supported through static and dynamic partitions definitions. Also, I think this question is independent of cron vs. a pre-defined calendar. It boils down to: based on some sorting of all of my partitions, give me the highest-sorted partition that has actually been materialized.
does that jive with what you’re thinking?
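
For illustration, the "highest-sorted partition that has actually been materialized" idea can be sketched in plain Python. This is only a sketch under assumptions: the function name is hypothetical, and the set of materialized keys is assumed to be fetched separately (I believe DagsterInstance.get_materialized_partitions returns such a set, but check that against your Dagster version). Keys are assumed to use the '%Y-%m-%d' format from the partitions definition above.

```python
from datetime import datetime
from typing import Iterable, Optional

FMT = "%Y-%m-%d"  # assumed to match the fmt of the partitions definition


def latest_materialized_before(
    current_key: str, materialized_keys: Iterable[str]
) -> Optional[str]:
    """Return the highest materialized key that sorts strictly before current_key.

    Hypothetical helper: materialized_keys is whatever set of partition keys
    has actually been materialized for the asset. Returns None if there is no
    earlier materialized partition.
    """
    current = datetime.strptime(current_key, FMT)
    earlier = [k for k in materialized_keys if datetime.strptime(k, FMT) < current]
    # Sort by parsed date rather than by string so other formats would also work
    return max(earlier, key=lambda k: datetime.strptime(k, FMT), default=None)


# July 4th was skipped by the schedule, so it never shows up as materialized
materialized = {"2023-07-03", "2023-07-05", "2023-07-06"}
prev_key = latest_materialized_before("2023-07-05", materialized)
```

Because this looks at what was materialized rather than at materialization order, a backfill of an older partition does not change the answer.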

Nicholas Pezolano

07/10/2023, 5:26 PM
Yeah I think they're both solutions to my core problem at least, I wouldn't need the above essentially if I could define a partition that skipped holidays for example.
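
A partition set that skips holidays could be approximated by generating the keys up front and handing them to a static partitions definition. The sketch below only builds the key list; the function name and the hand-written holiday set are assumptions for illustration (a real calendar might come from a library such as pandas-market-calendars), and passing the result to StaticPartitionsDefinition is left out so the snippet runs without Dagster.

```python
from datetime import date, timedelta
from typing import List, Set


def trading_day_keys(start: date, end: date, holidays: Set[date]) -> List[str]:
    """Build partition keys for every weekday in [start, end] that is not a holiday.

    Hypothetical helper: the holiday set is supplied by the caller.
    """
    keys = []
    day = start
    while day <= end:
        # weekday() is 0-4 for Monday-Friday
        if day.weekday() < 5 and day not in holidays:
            keys.append(day.isoformat())
        day += timedelta(days=1)
    return keys


# Week of July 3rd, 2023, with the July 4th market holiday removed
keys = trading_day_keys(date(2023, 7, 3), date(2023, 7, 9), {date(2023, 7, 4)})
```

With keys like these, the previous partition is simply the preceding element of the list, so no holiday ever appears as a "previous" key.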

chris

07/10/2023, 5:46 PM
ah I see what you mean. I think that when scheduling it might get weird if you had a partitioning that skipped holidays, because when your scheduler still fires, the system might get confused why there isn’t a partition for the current day (if it is a holiday for example). I think that the strategy you’re using of should_execute might be more robust in that regard
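
If the should_execute approach is kept, one way to find the previous scheduled partition inside the asset is to walk backwards day by day and apply the same rule the schedule uses. This is a sketch, not a Dagster API: the helper names and the HOLIDAYS set are hypothetical, and it assumes each run's partition key is the date the schedule fired on, as in the schedule above.

```python
from datetime import date, timedelta
from typing import Callable, Optional

HOLIDAYS = {date(2023, 7, 4)}  # hypothetical: days on which should_execute returns False


def schedule_fires_on(day: date) -> bool:
    """Mirror the schedule: weekdays only (cron day-of-week 1-5), minus skipped holidays."""
    return day.weekday() < 5 and day not in HOLIDAYS


def previous_scheduled_key(
    current_key: str,
    first_key: str,
    fires: Callable[[date], bool] = schedule_fires_on,
) -> Optional[str]:
    """Walk back from the day before current_key to the most recent day the schedule ran.

    Returns None if no scheduled day exists between first_key and current_key.
    """
    day = date.fromisoformat(current_key) - timedelta(days=1)
    first = date.fromisoformat(first_key)
    while day >= first:
        if fires(day):
            return day.isoformat()
        day -= timedelta(days=1)
    return None


# On 2023-07-05 the previous scheduled partition is 2023-07-03, not the skipped 2023-07-04
prev_key = previous_scheduled_key("2023-07-05", first_key="2023-07-03")
```

Sharing one predicate between the schedule and the asset keeps the two from drifting apart, though unlike a lookup of materialized partitions it does not notice runs that were scheduled but failed.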