kyle

10/29/2022, 4:20 PM
I want to create a partitioned job where the partition key is the primary key of a table in a database. That way I can launch the job for each PK and run backfills. I suppose this should use a StaticPartitionsDefinition, but I don't understand how to set those in advance as the table grows.

Rainer Pichler

10/31/2022, 8:44 AM
Hello kyle! As I understand it, this would only be possible via a dynamic_partitioned_config. Even then, the function you supply has to return the concrete partition values.

kyle

11/01/2022, 12:06 AM
So you are suggesting I set up the dynamic_partitioned_config partition_fn to query my database for the set of primary keys?
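A minimal sketch of what such a partition function could look like. It assumes a hypothetical `items` table with an integer `id` primary key and uses a throwaway SQLite file as a stand-in for the real database. In Dagster, the returned function would be passed as `partition_fn` to `dynamic_partitioned_config`, which calls it with the current time. That wiring is left out so the snippet runs without Dagster installed.

```python
import os
import sqlite3
import tempfile
from datetime import datetime
from typing import Callable, List, Optional


def make_partition_fn(db_path: str) -> Callable[[Optional[datetime]], List[str]]:
    """Build a partition function that reads primary keys from the database at db_path."""

    def partition_fn(current_time: Optional[datetime] = None) -> List[str]:
        # current_time is accepted to match the expected call shape but is unused here.
        conn = sqlite3.connect(db_path)
        try:
            # `items` and `id` are hypothetical names; substitute the real table and key.
            rows = conn.execute("SELECT id FROM items ORDER BY id").fetchall()
        finally:
            conn.close()
        # Partition keys are strings, so each primary key is converted.
        return [str(row[0]) for row in rows]

    return partition_fn


# Demo against a throwaway database with three rows.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO items (name) VALUES (?)", [("a",), ("b",), ("c",)])
conn.commit()
conn.close()

partition_fn = make_partition_fn(db_path)
print(partition_fn(None))  # ['1', '2', '3']
```

Because the query runs on every call, rows added to the table show up as new partitions the next time the partitions are loaded, at the cost of one query per load.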
I got that working, thanks for your input! I have about 100k keys, though, so loading the partitions is a little slow. What do you think of having the partition function load the keys from an asset, and updating that asset a few times a day as a separate scheduled job?
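The caching idea can be sketched roughly as follows. This is an illustration only: it uses a plain JSON file in place of a Dagster asset, and `refresh_key_cache`, `cached_partition_fn`, and the cache path are hypothetical names. The refresh step is what a separate scheduled job would run, while the partition function only reads the cached file and never touches the database.

```python
import json
import os
import tempfile
from datetime import datetime
from typing import Callable, List, Optional


def refresh_key_cache(fetch_keys: Callable[[], List[str]], cache_path: str) -> int:
    """Run the slow key query once and store the result; meant for a scheduled job."""
    keys = fetch_keys()
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(keys, f)
    # Swap the file in one step so readers never see a half-written cache.
    os.replace(tmp_path, cache_path)
    return len(keys)


def cached_partition_fn(cache_path: str) -> Callable[[Optional[datetime]], List[str]]:
    """Build a partition function that reads keys from the cache file only."""

    def partition_fn(current_time: Optional[datetime] = None) -> List[str]:
        if not os.path.exists(cache_path):
            return []  # no refresh has run yet
        with open(cache_path) as f:
            return json.load(f)

    return partition_fn


# Demo: a stand-in for the slow database query that returns five keys.
cache_path = os.path.join(tempfile.mkdtemp(), "partition_keys.json")
refresh_key_cache(lambda: [str(i) for i in range(5)], cache_path)
load_keys = cached_partition_fn(cache_path)
print(load_keys(None))  # ['0', '1', '2', '3', '4']
```

The trade-off is staleness: a primary key inserted after the last refresh has no partition until the next refresh runs.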

Rainer Pichler

11/02/2022, 9:50 AM
Could work. Though in my use case, I was able to derive the partition keys programmatically. Note that I'm a Dagster user and not a developer.