# ask-community
m
Hi, I see that there is now the possibility to launch backfills that materialize a partition range in a single job. This is incredibly valuable to avoid destroying the parallelism when using a large Spark cluster. Is there a way to do the same with a dynamic partitions definition? It would be really useful to materialize a given partition set within the same job. Is there an open issue I can follow for this kind of feature request? Is it something you are already working on?
c
Hi Marco. This is possible when the selected dynamic partitions are a contiguous subset of the partitions def. You can click the "single run" button in Dagit to launch a range as a single run:
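For context, a ranged single run only preserves parallelism if the asset body consumes the whole range at once. A minimal sketch of what that might look like, assuming a Dagster version where `context.partition_key_range` is available (the `load_events_between` helper is hypothetical):

```python
from dagster import DailyPartitionsDefinition, asset


def load_events_between(start_key: str, end_key: str) -> None:
    """Hypothetical stand-in for one bulk load (e.g. a single Spark read)."""
    print(f"loading events from {start_key} through {end_key}")


@asset(partitions_def=DailyPartitionsDefinition(start_date="2023-01-01"))
def events(context) -> None:
    # In a ranged single run, start and end span the whole selected range;
    # in an ordinary partitioned run they are the same key.
    key_range = context.partition_key_range
    load_events_between(key_range.start, key_range.end)
```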
m
Thanks @claire, that is really interesting. Can you point me to the details in the docs? I also have a further question: is it possible to launch a job that materializes such a partition range programmatically, possibly within a schedule? Is there an example in the docs?
c
This discussion has more details, and I commented an example of how you can programmatically kick off a run across a partition range in a schedule: https://github.com/dagster-io/dagster/discussions/11653#discussioncomment-5493977
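The shape of that example is roughly as follows. This is a minimal sketch rather than the exact code from the linked discussion; it assumes a daily-partitioned asset job and the `dagster/asset_partition_range_start` / `dagster/asset_partition_range_end` run tags that Dagster uses to mark a ranged single run:

```python
from dagster import (
    DailyPartitionsDefinition,
    RunRequest,
    asset,
    define_asset_job,
    schedule,
)

daily = DailyPartitionsDefinition(start_date="2023-01-01")


@asset(partitions_def=daily)
def events(context) -> None:
    ...


events_job = define_asset_job("events_job", selection="events", partitions_def=daily)


@schedule(job=events_job, cron_schedule="0 2 * * *")
def ranged_backfill_schedule(context):
    keys = daily.get_partition_keys(current_time=context.scheduled_execution_time)
    # A single RunRequest whose tags mark the start and end of the range,
    # so the whole range is materialized in one run instead of one run per key.
    return RunRequest(
        tags={
            "dagster/asset_partition_range_start": keys[0],
            "dagster/asset_partition_range_end": keys[-1],
        },
    )
```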
h
@claire just on that example (struggling with coming up with the right words here): how is a partition range defined? It's obvious for numbers and dates/times. Is it a sorted copy of the partitions? Say my partitions are ['cat', 'dog', 'mouse', 'zebra'], could I do dog to zebra (i.e. dog, mouse, zebra)?
After having a poke through the code, the likely issue will be my extensive use of dynamic partitions and the fact that they are sorted by ID rather than value 😕
c
Ah, so a valid partition range is any contiguous subset of the partition keys returned when `partitions_def.get_partition_keys()` is called. For static partitions defs, that is just any contiguous subset of the list of partition keys provided; for dynamic partitions defs, the order is the order of creation.
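To make that concrete, here is a minimal sketch of the creation-order behavior described above (an ephemeral instance and the `animals` partitions def are used purely for illustration):

```python
from dagster import DagsterInstance, DynamicPartitionsDefinition

animals = DynamicPartitionsDefinition(name="animals")

instance = DagsterInstance.ephemeral()
instance.add_dynamic_partitions("animals", ["cat", "dog", "mouse"])
instance.add_dynamic_partitions("animals", ["zebra"])

# Keys come back in creation order, not sorted by value, so a valid
# range is a contiguous slice of this list.
keys = animals.get_partition_keys(dynamic_partitions_store=instance)
print(keys)  # ["cat", "dog", "mouse", "zebra"]
# "dog" through "zebra" covers dog, mouse, zebra here only because
# those keys happen to have been added in that order.
```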