#ask-community

Giovanni Paolo

07/14/2023, 5:42 PM
re this: https://dagster.slack.com/archives/C01U954MEER/p1686073627696329 context: source data for the partition is only ready 3 days after the partition is "closed". I shifted the schedule to materialize it 3 days later, but during that time the partition shows up as missing in the UI (even though materializing it would result in an error). Now I have an unpartitioned asset that depends on it, but LastPartitionMapping points to the missing partition for the first 3 days, until it is materialized.
I tried working around it with an intermediate asset that eagerly auto-materializes from this asset, so that my unpartitioned asset can depend on the intermediate one, but this feels way too clunky
here's what it currently looks like
does anyone have a better idea? e.g.
• a way to express that although the weekly partition includes that 7-day range, data is only ready on last_day+3
• something akin to LastMaterializedPartitionMapping
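The two ideas above can be sketched in plain Python, without any Dagster API. This is only an illustration of the selection logic: `latest_ready_partition` and `last_materialized_partition` are hypothetical helper names, and the sketch assumes a weekly partition starting on `s` covers `[s, s + 7 days)` and that its data becomes ready `delay_days` after that window closes.

```python
from datetime import date, timedelta
from typing import Iterable, Optional, Set

WEEK = timedelta(days=7)


def latest_ready_partition(
    partition_starts: Iterable[date], today: date, delay_days: int = 3
) -> Optional[date]:
    """Newest weekly partition whose source data should already be available.

    Assumption: data is ready `delay_days` after the window end (start + 7 days).
    """
    delay = timedelta(days=delay_days)
    ready = [s for s in partition_starts if s + WEEK + delay <= today]
    return max(ready, default=None)


def last_materialized_partition(
    partition_starts: Iterable[date], materialized: Set[date]
) -> Optional[date]:
    """Newest partition that has actually been materialized, ignoring missing ones."""
    done = [s for s in partition_starts if s in materialized]
    return max(done, default=None)


starts = [date(2023, 6, 25), date(2023, 7, 2), date(2023, 7, 9)]
ready_key = latest_ready_partition(starts, today=date(2023, 7, 11))
materialized_key = last_materialized_partition(starts, {date(2023, 6, 25)})
```

Either function returns the key a downstream unpartitioned asset would want to read, instead of the newest key that merely exists.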

owen

07/14/2023, 6:17 PM
Hm this is definitely an interesting use case, and it does potentially point to a missing concept in the Dagster framework, where currently we assume that any partition is ready to be materialized as soon as it pops into existence. Another workaround (which isn't necessarily any better) would be to just set up the weekly_asset to use a
TimeWindowPartitionsDefinition(
    start=...,
    schedule_type=ScheduleType.WEEKLY,
    # for weekly schedules, day_offset is the day of the week (0 = Sunday)
    day_offset=3,
    fmt="%Y-%m-%d",
)
This would let you have your weekly partitions be shifted by 3 days, so they'd only pop into existence once the data actually exists. The downside of this is that your partitions will be mislabeled (as they'll represent the time the data became available rather than the data they actually contain).
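To see what the shift does to partition keys, here is a small plain-Python sketch. It is not Dagster code: `existing_weekly_partitions` is a hypothetical helper, and it assumes a weekly partition only exists once its 7-day window has fully elapsed. The Sunday-based and Wednesday-based start dates are illustrative.

```python
from datetime import date, timedelta
from typing import List

WEEK = timedelta(days=7)


def existing_weekly_partitions(first_start: date, today: date) -> List[date]:
    """Start dates of weekly partitions whose window has closed by `today`.

    Assumption: a partition exists once start + 7 days <= today.
    """
    keys = []
    start = first_start
    while start + WEEK <= today:
        keys.append(start)
        start += WEEK
    return keys


# Unshifted weeks starting on Sunday 2023-06-25, viewed on 2023-07-12.
sunday_keys = existing_weekly_partitions(date(2023, 6, 25), date(2023, 7, 12))
# Weeks shifted to start on Wednesday 2023-06-28, viewed on the same day.
wednesday_keys = existing_weekly_partitions(date(2023, 6, 28), date(2023, 7, 12))
```

With the shift, the newest key on July 12 is 2023-07-05, which appears three days after the unshifted 2023-07-02 week closed but is labeled with a different date range.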

Giovanni Paolo

07/14/2023, 6:44 PM
hmmm I don't think this works as I expected. I just tried it, shifted my date back to july 12th, and the 2023-07-05 partition is still there, which would include this range:
TimeWindow(start=DateTime(2023, 7, 5, 0, 0, 0, tzinfo=Timezone('UTC')), end=DateTime(2023, 7, 12, 0, 0, 0, tzinfo=Timezone('UTC')))
but on the 12th i only have data up to the 9th or 10th
I feel like another option following this line of thought is to have each weekly asset query the range (start_date-3, end_date-3). again, having some discrepancies between partition key and actual time window
but doesn't feel great either 😞
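The (start_date-3, end_date-3) idea can be sketched as a small helper that maps a partition's time window to the range actually queried from the source. `source_query_range` and `lag_days` are hypothetical names for illustration; inside an asset the window would come from the partition's time window rather than being hard-coded.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def source_query_range(
    window_start: datetime, window_end: datetime, lag_days: int = 3
) -> Tuple[datetime, datetime]:
    """Shift a partition's time window back by `lag_days` to get the source range."""
    lag = timedelta(days=lag_days)
    return window_start - lag, window_end - lag


# The window reported above for the 2023-07-05 partition.
start = datetime(2023, 7, 5, tzinfo=timezone.utc)
end = datetime(2023, 7, 12, tzinfo=timezone.utc)
query_start, query_end = source_query_range(start, end)
```

For the 2023-07-05 partition this queries July 2 through July 9, which is data that is available on the 12th, at the cost of the key no longer matching the range it contains.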