Howdy, Dagster friends! I’m trying to wire up a `...
# ask-community
t
Howdy, Dagster friends! I’m trying to wire up a `dynamic_partitioned_config` and am curious if there’s a generally accepted way of passing richer information to the partition config generation function than a `partition_key` can contain on its own. The old `Partition` class was a nice wrapper that allowed a key or display name to be paired with rich information that could be used to configure a partitioned run, but it seems like the new paradigm makes the key and the information one and the same. (Oh, it looks like `Partition` is still in use internally but only the name is returned; any particular reason not to open that back up?)
Oh, looks like I could build my own `PartitionedConfig`, since `DynamicPartitionsDefinition` seems to allow for a `partition_fn` that returns `Partition`s instead of strings… Giving that a go!
p
Yep! That should work! Curious, because I’m actively looking at the partitions/backfill API. What types of objects are you using for your partitions? Why is it so much better to use custom objects rather than string keys?
(Asking because these affect the APIs that we use between dagster framework code and user code - the custom objects typically cannot pass across a process boundary and so are more constrained)
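To illustrate the process-boundary point in general terms, here is a minimal sketch using only the standard library. It is not how Dagster itself serializes anything; the `TenantPartition` class and the `acme-prod` key are made up for the example. A plain string key survives a JSON round trip unchanged, while an arbitrary custom object is rejected unless extra serialization logic is written for it.

```python
import json
from dataclasses import dataclass


@dataclass
class TenantPartition:
    """Hypothetical custom partition object, for illustration only."""
    tenant: str
    env: str


# A plain string key can be serialized and restored as-is.
key = "acme-prod"
round_tripped = json.loads(json.dumps(key))

# A custom object is not JSON-serializable without extra work.
try:
    json.dumps(TenantPartition("acme", "prod"))
    object_serializable = True
except TypeError:
    object_serializable = False
```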
t
My data is partitioned across tenants, not dates, and configuring a job for a given tenant is not as simple as passing the tenant name along; I need at least 4 pieces of information to properly configure a run. I thought about shoving them all into the partition key (and parsing it on the other end), but then the partition key space seems like it would explode over time, and I’d generally like to keep runs for a tenant grouped in the same partition (just the tenant name) to make it easy to request backfills, etc.
The objects are basically just dictionaries. (Back to the original question you asked, haha.)
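For comparison, a rough sketch of the string-key alternative: keep the partition key as just the tenant name and look the richer dictionary up when the run config is built. This is plain Python, not a Dagster API; `TENANT_CONFIGS`, its fields, the tenant names, and the `load_tenant` op name are all hypothetical placeholders.

```python
from typing import Any, Dict, List

# Hypothetical per-tenant settings; in practice these would be loaded from
# wherever the tenant configuration actually lives.
TENANT_CONFIGS: Dict[str, Dict[str, Any]] = {
    "acme": {"env": "prod", "region": "us-east-1", "schema": "acme_prod"},
    "globex": {"env": "staging", "region": "eu-west-1", "schema": "globex_stg"},
}


def partition_keys() -> List[str]:
    """Partition keys are just the tenant names, so the key space stays small."""
    return sorted(TENANT_CONFIGS)


def run_config_for_key(partition_key: str) -> Dict[str, Any]:
    """Resolve the rich config from the string key at config-build time."""
    tenant_config = TENANT_CONFIGS[partition_key]
    # "load_tenant" is a made-up op name used only to give the dict a shape.
    return {
        "ops": {
            "load_tenant": {"config": {"tenant": partition_key, **tenant_config}}
        }
    }
```

The trade-off is that the lookup has to be available wherever the config function runs, whereas a `Partition` value carries the data along with the key.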
Building my own wasn’t too bad! Here’s the basic gist in case anyone else wants a pattern to follow:
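from dagster import DynamicPartitionsDefinition, Partition, PartitionedConfig

# build_job_configs() and make_run_config() are my own helpers (not shown).
def partition_fn(_current_time):
    # One Partition per job config: the rich config is the value,
    # and the "<tenant>-<env>" string is the partition name.
    job_configs = build_job_configs()
    return [
        Partition(job_config, f"{job_config.tenant}-{job_config.env}")
        for job_config in job_configs
    ]

def run_conf_fn(partition):
    # Receives the whole Partition, so the rich value is available here.
    job_config = partition.value
    return make_run_config(job_config)

# The return below sits inside an enclosing function (not shown).
return PartitionedConfig(
    partitions_def=DynamicPartitionsDefinition(partition_fn),
    run_config_for_partition_fn=run_conf_fn,
    decorated_fn=run_conf_fn,
)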
I know the internal APIs aren’t really meant for external use, but it’s not entirely clear to me why `PartitionedConfig` needs both a `run_config_for_partition_fn` and a `decorated_fn` when the same function works for both.
Also: the ability to choose a partitioned config from a dropdown in the Launchpad is DOPE.
p
Oh I see. Yeah, this might be related to multi-dimensional partitioning (https://github.com/dagster-io/dagster/issues/4591), which is a long-standing issue that we’ve had in the back of our minds for a while now.