Mhd Mousa Hamad
07/19/2023, 7:19 AM
ScheduleDefinition expects a JobDefinition and not a callable, so how can we create a lazy-loaded schedule based on a lazy-loaded job?
Unfortunately, the documentation is not extensive enough and the sample code provided doesn't run. Apart from the syntax errors, there is a bigger issue in the code that we couldn't fix without understanding how lazy loading is meant to be used: the expensive_schedule is based on a job which is not added explicitly to the repository, so Dagster complains when it starts.
from dagster import Field, Int, ScheduleDefinition, job, op, repository

@op(config_schema={'n': Field(Int)})
def return_n(context):
    return context.op_config['n']

######################################################################
# A lazy-loaded repository
######################################################################
def make_expensive_job():
    @job
    def expensive_job():
        for i in range(10000):
            return_n.alias(f'return_n_{i}')()
    return expensive_job

def make_expensive_schedule():
    @job
    def other_expensive_job():
        for i in range(11000):
            return_n.alias(f'my_return_n_{i}')()
    # The schedule name is set explicitly so that it matches its repository key.
    # other_expensive_job is not registered under 'jobs' below, which is what
    # Dagster complains about when it starts.
    return ScheduleDefinition(
        name='expensive_schedule', cron_schedule="0 0 * * *", job=other_expensive_job
    )

@repository
def lazy_loaded_repository():
    return {
        'jobs': {'expensive_job': make_expensive_job},
        'schedules': {'expensive_schedule': make_expensive_schedule}
    }
We have fixed the config_schema definition for return_n and also set the name of the expensive_schedule to match its key.
Could you please help me get this sample code running or explain to me how to create a lazy-loaded schedule based on a callable returning a JobDefinition?
Thank you!
Mhd Mousa Hamad
07/19/2023, 7:22 AM
Odette Harary
07/21/2023, 2:49 PM
definitions and RunRequest
alex
07/21/2023, 3:04 PM
ScheduleDefinition accepts a job_name argument, so you can use that to refer to the name of a job that is lazily constructed.
Mhd Mousa Hamad
07/25/2023, 8:09 AM
definitions still does support lazy loading: https://github.com/dagster-io/dagster/issues/12476
Mhd Mousa Hamad
07/25/2023, 8:16 AM
ScheduleDefinition but I think we got lost with other details. It would be very nice to have support for this in the relevant utility functions. E.g., we used build_schedule_from_partitioned_job before and it is very convenient. I know that in this case the job is not available and you need the run_config, but probably even this can be deferred until a later point in time, when the job is fully defined.