Hi, we are trying to use lazy loading for a repository...
# ask-community
m
Hi, we are trying to use lazy loading for a repository and have managed to get the jobs lazy loaded, but now we are facing issues with the schedules. If `ScheduleDefinition` expects a `JobDefinition` and not a callable, how can we create a lazy-loaded schedule based on a lazy-loaded job? Unfortunately, the documentation is not extensive enough, and the sample code provided doesn't run. Apart from the syntax errors, there is a bigger issue in the code which we couldn't fix without understanding how lazy loading is meant to be used: the `expensive_schedule` is based on a job which is not explicitly added to the repository, and thus Dagster complains when it starts.
from dagster import Field, Int, ScheduleDefinition, job, op, repository

@op(config_schema={'n': Field(Int)})
def return_n(context):
    return context.op_config['n']

######################################################################
# A lazy-loaded repository
######################################################################

def make_expensive_job():
    @job
    def expensive_job():
        for i in range(10000):
            return_n.alias(f'return_n_{i}')()

    return expensive_job

def make_expensive_schedule():
    @job
    def other_expensive_job():
        for i in range(11000):
            return_n.alias(f'my_return_n_{i}')()

    return ScheduleDefinition(name="expensive_schedule", cron_schedule="0 0 * * *", job=other_expensive_job)

@repository
def lazy_loaded_repository():
    return {
        'jobs': {'expensive_job': make_expensive_job},
        'schedules': {'expensive_schedule': make_expensive_schedule}
    }
We have fixed the `config_schema` definition for `return_n` and also set the name of the `expensive_schedule` to match its key. Could you please help me get this sample code running, or explain how to create a lazy-loaded schedule based on a callable returning a `JobDefinition`? Thank you!
I would also highly appreciate an update to the documentation. I am happy to do it myself once I understand how this is meant to be used.
o
Hey - have you tried using `definitions` and `RunRequest`?
a
`ScheduleDefinition` accepts a `job_name` argument, so you can use that to refer to the name of a job that is lazily constructed.
👍 1
m
@Odette Harary thank you for your reply, but as far as I know `definitions` still does not support lazy loading: https://github.com/dagster-io/dagster/issues/12476
@alex thank you for your answer. This did work. It is unfortunate that I did not notice this in the `ScheduleDefinition`, but I think we got lost in other details. It would be very nice to have support for this in the relevant utility functions. E.g., we used `build_schedule_from_partitioned_job` before and it is very convenient. I know that in this case the job is not available and you need the `run_config`, but probably even this could be deferred to a later point in time, when the job is fully defined.