Fernando Cagua
02/28/2024, 2:24 AM
Scout
02/28/2024, 2:24 AM
`HourlyPartitionsDefinition` for the downstream asset and set up the partition dependencies so that each 6-hourly partition depends on the corresponding daily partition of the upstream asset. Here's a step-by-step approach based on the documentation:
1. Define the `HourlyPartitionsDefinition` for the downstream asset, specifying the start date and time from which the hourly partitions should begin.
2. Define the `DailyPartitionsDefinition` for the upstream asset, specifying the start date from which the daily partitions should begin.
3. Set up the partition dependencies. According to the documentation, when both the upstream and downstream assets are time window-partitioned, each partition in the downstream asset depends on all partitions in the upstream asset that intersect its time window. In this case, since the downstream asset is partitioned hourly and depends on a daily-partitioned upstream asset, all downstream partitions that fall within a given day (24 hourly partitions, or four if the partitions are 6-hourly) would depend on the corresponding daily partition of the upstream asset.
Here's an example of how you might define these assets in code:
```python
from datetime import datetime

from dagster import asset, HourlyPartitionsDefinition, DailyPartitionsDefinition

start_date = datetime(2022, 1, 1)

# Daily partitions for the upstream asset, hourly partitions for the downstream one.
upstream_partitions = DailyPartitionsDefinition(start_date=start_date.strftime("%Y-%m-%d"))
downstream_partitions = HourlyPartitionsDefinition(start_date=start_date.strftime("%Y-%m-%d-%H:%M"))


@asset(partitions_def=upstream_partitions)
def upstream_asset():
    # Logic for upstream asset
    ...


@asset(partitions_def=downstream_partitions)
def downstream_asset(upstream_asset):
    # Logic for downstream asset that depends on the upstream asset
    ...
```
In this example, the `downstream_asset` function takes `upstream_asset` as an input, indicating a dependency. The partitioning logic that determines how the hourly partitions of `downstream_asset` depend on the daily partitions of `upstream_asset` is managed by Dagster's default partition dependency rules, as described in the documentation.
If you need to customize the dependency behavior beyond the default rules, you can use a `PartitionMapping` to define a custom mapping between partitions of the upstream and downstream assets. However, based on the documentation, the default behavior should handle the described scenario where each day's worth of hourly partitions depends on the corresponding daily partition of the upstream asset.
Fernando Cagua
02/28/2024, 3:32 AM
```python
grib_wind_partitions = dagster.TimeWindowPartitionsDefinition(
    start=constants.STARBOARD_DATA_START_DATE,
    fmt="%Y%m%d%H%M%S",
    cron_schedule="0 */6 * * *",  # Six hourly partitions
    end_offset=int(15 * 24 / 6),  # 15 days
)
```
Now I'm getting the error `ValueError: time data '2024-03-13 12:00:00' does not match format '%Y%m%d%H%M%S'`. How can I solve it?
Scout
02/28/2024, 3:32 AM
`TimeWindowPartitionsDefinition` and the actual datetime string that's being parsed. The format `"%Y%m%d%H%M%S"` expects a datetime string without any separators, but the error message shows a datetime string with separators (`'2024-03-13 12:00:00'`).
To resolve this issue, you need to ensure that the format string you provide to the `fmt` parameter of `TimeWindowPartitionsDefinition` matches the format of the datetime strings you are working with. If your datetime strings are in the format `'2024-03-13 12:00:00'`, you should change the `fmt` parameter to match this format. Here's how you can adjust the `fmt` parameter:
```python
grib_wind_partitions = dagster.TimeWindowPartitionsDefinition(
    start=constants.STARBOARD_DATA_START_DATE,
    fmt="%Y-%m-%d %H:%M:%S",  # Adjusted format to match the datetime string
    cron_schedule="0 */6 * * *",  # Six hourly partitions
    end_offset=int(15 * 24 / 6),  # 15 days
)
```
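The mismatch itself can be reproduced with the standard library alone. The sketch below is only an illustration and assumes the failure comes from a `strptime`-style parse, which is what the wording of the error suggests. `COMPACT_FMT`, `SEPARATED_FMT`, and `parses` are names made up for this example; they are not Dagster APIs.

```python
from datetime import datetime

COMPACT_FMT = "%Y%m%d%H%M%S"         # the fmt from the original definition
SEPARATED_FMT = "%Y-%m-%d %H:%M:%S"  # the shape of the string in the error


def parses(value: str, fmt: str) -> bool:
    """Return True if `value` can be parsed with `fmt`, False otherwise."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


value = "2024-03-13 12:00:00"
print(parses(value, COMPACT_FMT))    # the compact format has no room for "-", " " or ":"
print(parses(value, SEPARATED_FMT))  # format and string agree
```

Whichever format is chosen, the point is that the string and the format have to agree character for character, separators included.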
Make sure that `constants.STARBOARD_DATA_START_DATE`, if it is a string, also matches the format `"%Y-%m-%d %H:%M:%S"`. If it's a `datetime` object, you can convert it to a string with the same format before passing it to the `start` parameter.
If you are relying on the `fmt` parameter to format the partition keys that Dagster generates, then you should keep it as `"%Y%m%d%H%M%S"` and ensure that any datetime strings you use elsewhere in your code are formatted to match this pattern. This would mean removing the separators from the datetime string before it's parsed or formatted.
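If the compact keys are kept, values that arrive in the separated form have to be converted before they are used as keys. Here is a minimal sketch of that conversion, again using only the standard library. `to_partition_key` is a hypothetical helper, and the inputs below are placeholders rather than the real value of `constants.STARBOARD_DATA_START_DATE`.

```python
from datetime import datetime

KEY_FMT = "%Y%m%d%H%M%S"  # compact format used for the partition keys


def to_partition_key(value, source_fmt="%Y-%m-%d %H:%M:%S"):
    """Convert a datetime, or a string in `source_fmt`, to a compact key string."""
    if isinstance(value, datetime):
        return value.strftime(KEY_FMT)
    # Parse the separated string first, then re-emit it without separators.
    return datetime.strptime(value, source_fmt).strftime(KEY_FMT)


print(to_partition_key("2024-03-13 12:00:00"))       # 20240313120000
print(to_partition_key(datetime(2024, 3, 13, 12)))   # 20240313120000
```

Parsing and re-formatting, rather than stripping characters by hand, also rejects malformed inputs early with a `ValueError`.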