# ask-community
m
Hi, I see that there is now the possibility to launch backfills that materialize a partition range in a single job. This is incredibly valuable to avoid destroying the parallelism when using a large Spark cluster. Is there a way to do the same with a dynamic partitions definition? It would be really useful to materialize a given partition set within the same job. Is there an open issue I can follow for this kind of feature request? Is it something you are already working on?
c
Hi Marco. This is possible when the selected dynamic partitions are a contiguous subset of the partitions def. You can click the "single run" button in Dagit to launch a range as a single run:
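For context, a ranged single run only preserves parallelism if the asset body consumes the whole range at once. A minimal sketch of what that might look like, assuming a Dagster version where `context.partition_key_range` is available (the `load_events_between` helper is hypothetical):

```python
from dagster import DailyPartitionsDefinition, asset


def load_events_between(start_key: str, end_key: str) -> None:
    """Hypothetical stand-in for one bulk load (e.g. a single Spark read)."""
    print(f"loading events from {start_key} through {end_key}")


@asset(partitions_def=DailyPartitionsDefinition(start_date="2023-01-01"))
def events(context) -> None:
    # In a ranged single run, start and end span the whole selected range;
    # in an ordinary partitioned run they are the same key.
    key_range = context.partition_key_range
    load_events_between(key_range.start, key_range.end)
```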
m
Thanks @claire, that is really interesting. Can you point me to the details in the docs? I also have a further question: is it possible to launch a job that materializes such a partition range programmatically, possibly within a schedule? Is there an example in the docs?
c
This discussion has more details, and I commented an example of how you can programmatically kick off a run across a partition range in a schedule: https://github.com/dagster-io/dagster/discussions/11653#discussioncomment-5493977
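The shape of that example is roughly as follows. This is a minimal sketch rather than the exact code from the linked discussion; it assumes a daily-partitioned asset job and the `dagster/asset_partition_range_start` / `dagster/asset_partition_range_end` run tags that Dagster uses to mark a ranged single run:

```python
from dagster import (
    DailyPartitionsDefinition,
    RunRequest,
    asset,
    define_asset_job,
    schedule,
)

daily = DailyPartitionsDefinition(start_date="2023-01-01")


@asset(partitions_def=daily)
def events(context) -> None:
    ...


events_job = define_asset_job("events_job", selection="events", partitions_def=daily)


@schedule(job=events_job, cron_schedule="0 2 * * *")
def ranged_backfill_schedule(context):
    keys = daily.get_partition_keys(current_time=context.scheduled_execution_time)
    # A single RunRequest whose tags mark the start and end of the range,
    # so the whole range is materialized in one run instead of one run per key.
    return RunRequest(
        tags={
            "dagster/asset_partition_range_start": keys[0],
            "dagster/asset_partition_range_end": keys[-1],
        },
    )
```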
h
@claire just on that example (struggling with coming up with the right words here): how is a partition range defined? It's obvious for numbers and dates/times. Is it a sorted copy of the partitions? Say my partitions are ['cat', 'dog', 'mouse', 'zebra'], could I do dog to zebra (i.e. dog, mouse, zebra)?
After having a poke through the code, the likely issue will be my extensive use of dynamic partitions and the fact that they are sorted by ID rather than value 😕
c
Ah, so a valid partition range is any contiguous subset of the partition keys returned when `partitions_def.get_partition_keys()` is called. For static partitions defs, that is just any contiguous subset of the list of partition keys provided; for dynamic partitions defs, the order is the order of creation.
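To make that concrete, here is a minimal sketch of the creation-order behavior described above (an ephemeral instance and the `animals` partitions def are used purely for illustration):

```python
from dagster import DagsterInstance, DynamicPartitionsDefinition

animals = DynamicPartitionsDefinition(name="animals")

instance = DagsterInstance.ephemeral()
instance.add_dynamic_partitions("animals", ["cat", "dog", "mouse"])
instance.add_dynamic_partitions("animals", ["zebra"])

# Keys come back in creation order, not sorted by value, so a valid
# range is a contiguous slice of this list.
keys = animals.get_partition_keys(dynamic_partitions_store=instance)
print(keys)  # ["cat", "dog", "mouse", "zebra"]
# "dog" through "zebra" covers dog, mouse, zebra here only because
# those keys happen to have been added in that order.
```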