Lech
11/28/2022, 1:24 PM
partitions_def = StaticPartitionsDefinition(key_list)
partitioned scheduler:
from dagster import DefaultScheduleStatus, SkipReason, schedule

def get_schedule_def(partitions_def, cron_schedule, job, execution_timezone):
    @schedule(name=f'{job.name}_scheduler',
              cron_schedule=cron_schedule,
              job=job,
              default_status=DefaultScheduleStatus.STOPPED,
              execution_timezone=execution_timezone)
    def schedule_def():
        partition_keys = partitions_def.get_partition_keys()
        if len(partition_keys) == 0:
            # schedule_def is a generator, so the SkipReason is yielded rather than returned
            yield SkipReason("The job's PartitionsDefinition has no partitions")
            return
        # One run request per partition key
        for key in partition_keys:
            yield job.run_request_for_partition(partition_key=key, run_key=key)
    return schedule_def
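A note on mixing `return` and `yield` in a function shaped like schedule_def: because the body contains `yield`, Python treats the whole function as a generator, and a value passed to `return` is attached to StopIteration instead of being produced as an item. The sketch below is stdlib-only and uses plain strings as stand-ins for SkipReason and run requests; the function names are hypothetical, and it makes no claim about how Dagster itself consumes the generator.

```python
def returns_skip(keys):
    """Generator that *returns* its skip message when there are no keys."""
    if not keys:
        # In a generator, this value is not yielded; iteration just ends.
        return "skip: no partitions"
    for key in keys:
        yield f"run:{key}"


def yields_skip(keys):
    """Generator that *yields* its skip message, then stops."""
    if not keys:
        yield "skip: no partitions"
        return
    for key in keys:
        yield f"run:{key}"


returned_items = list(returns_skip([]))
yielded_items = list(yields_skip([]))
run_items = list(yields_skip(["a", "b"]))

print(returned_items)
print(yielded_items)
print(run_items)
```

Iterating the first variant with no keys produces nothing at all, while the second produces the skip message as an item.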
partitioned job creation:
job = asset.to_job(name=f"Job_{job_name}", resource_defs=resource_defs,
                   config=config, partitions_def=partitions_def,
                   executor_def=in_process_executor)
and the job variable plus get_schedule_def go into the repository function
owen
11/28/2022, 5:43 PM
get_schedule_def? Also, how is key_list being updated?
One thing I noticed is that you've set run_key=key inside the call to job.run_request_for_partition, which means that once the schedule launches a run for a given partition key, it will never launch a run for that partition key again. If this is not the desired behavior, you can set run_key=None.
Lech
11/28/2022, 7:57 PM
owen
11/28/2022, 8:02 PM
yield job.run_request_for_partition(partition_key=key, run_key=None)
(right now, you have run_key=key). Run keys are unique identifiers for runs, and you can't create multiple runs from the same schedule or sensor that share the same run key. That feature doesn't seem useful for what you're trying to accomplish, so you can skip that logic by setting run_key to None (but keeping the partition_key).
Lech
11/28/2022, 8:11 PM
run_key=key with run_key=None, but in the "Schedules" tab I don't see more jobs, and when I open the partitioned job I see "Partition Set - None"
build_schedule_from_partitioned_job (but it works only with TimeWindowPartitionsDefinition)
owen
11/28/2022, 8:17 PM
@repository
def my_repo():
    return [
        get_schedule_def(partitions_def, "0 0 * * *", my_job, "some-timezone"),
        ...
    ]
then I'd only expect to see a single schedule on that tab, regardless of which partitions that schedule is going to kick off
Lech
11/28/2022, 8:33 PM
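For reference, the run-key behaviour discussed in this thread can be pictured with a toy model. This is a stdlib-only sketch under simplifying assumptions, not Dagster's implementation: ToyScheduler, tick, and the (partition_key, run_key) tuples are hypothetical names invented for the illustration. It assumes only what was stated above, namely that a request whose run key has already been used is not launched again, and that run_key=None opts out of that check.

```python
class ToyScheduler:
    """Toy model of run-key de-duplication (hypothetical, not Dagster code)."""

    def __init__(self):
        self.seen_run_keys = set()  # run keys that already produced a run
        self.launched = []          # partition keys of launched runs, in order

    def tick(self, requests):
        """Process one schedule tick: an iterable of (partition_key, run_key)."""
        for partition_key, run_key in requests:
            if run_key is not None:
                if run_key in self.seen_run_keys:
                    continue  # same run key seen before: skip this request
                self.seen_run_keys.add(run_key)
            self.launched.append(partition_key)


keys = ["a", "b"]

# run_key=key: only the first tick launches anything.
with_keys = ToyScheduler()
for _ in range(3):
    with_keys.tick([(k, k) for k in keys])

# run_key=None: every tick launches one run per partition key.
without_keys = ToyScheduler()
for _ in range(3):
    without_keys.tick([(k, None) for k in keys])

print(with_keys.launched)
print(without_keys.launched)
```

In this model, three ticks with run_key=key launch two runs in total, while three ticks with run_key=None launch six, which matches the difference described in the thread.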