#ask-community

Oren Lederman

05/22/2023, 9:34 PM
Hi, is it possible to create multi-dimensionally partitioned jobs, or is that only officially supported for assets?

claire

05/22/2023, 9:41 PM
Hey Oren. Yes, creating multi-dimensionally partitioned jobs is supported.

Oren Lederman

05/22/2023, 9:43 PM
Thanks Claire! Are there examples of how to use it? For static/dynamic/time-based partitions there are decorators that make it clearer.

claire

05/22/2023, 9:48 PM
Here's a simple example if this helps:
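from dagster import (
    DailyPartitionsDefinition,
    MultiPartitionsDefinition,
    StaticPartitionsDefinition,
    job,
    op,
)


@op
def my_op():
    return 1


# One partition per combination of an "abc" key and a daily "time" key.
@job(
    partitions_def=MultiPartitionsDefinition(
        {
            "abc": StaticPartitionsDefinition(["a", "b", "c"]),
            "time": DailyPartitionsDefinition("2023-01-01"),
        }
    )
)
def my_job():
    my_op()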

Oren Lederman

05/22/2023, 9:54 PM
Ummm. How do you use the partition keys inside the ops? The other examples for partitioned jobs use PartitionedConfig to map the partition keys to op params.
Looks like I can do something like this:
multi_part_def = MultiPartitionsDefinition(
    {
        "abc": StaticPartitionsDefinition(["a", "b", "c"]),
        "time": DailyPartitionsDefinition("2023-01-01"),
    }
)

multi_part_config = PartitionedConfig(
    partitions_def=multi_part_def,
    run_config_for_partition_fn=lambda partition: {
        "ops": {
            "process_data_for_date": {
                "config": {
                    "time": partition.value.keys_by_dimension["time"],
                    "abc": partition.value.keys_by_dimension["abc"],
                }
            }
        }
    },
)

@job(config=multi_part_config)
def do_stuff_partitioned():
    process_data_for_date()

claire

05/22/2023, 10:43 PM
You should be able to read context.partition_key to fetch the currently executing partition key:
@op
def my_op(context):
    context.partition_key.keys_by_dimension
    ...

Oren Lederman

05/22/2023, 10:50 PM
Thanks Claire, I thought that might be the other option (reading it from the context as opposed to passing it via the configuration). It’s easier, but it means that the ops need to be aware that the job is defined as a partitioned job, which I prefer to avoid at the moment. I found how to create a PartitionedConfig though, and that should be good enough for what I currently need.
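
The config-based approach described here can be sketched without any Dagster imports: a plain function turns the per-dimension keys into run config, so the op only ever sees ordinary config values. This is a rough illustration, not Dagster's API. The function name `run_config_for_dimensions` is hypothetical; the op name `process_data_for_date` and the dimension names `abc` and `time` are taken from the snippet earlier in the thread.

```python
from typing import Any, Dict, Mapping

# Op name borrowed from the thread's snippet; the op itself is not shown there.
OP_NAME = "process_data_for_date"


def run_config_for_dimensions(keys_by_dimension: Mapping[str, str]) -> Dict[str, Any]:
    """Build the run config for one partition from its per-dimension keys.

    `keys_by_dimension` is assumed to look like {"abc": "a", "time": "2023-01-01"},
    i.e. the same mapping the thread reads from `keys_by_dimension`.
    """
    return {
        "ops": {
            OP_NAME: {
                "config": {
                    "time": keys_by_dimension["time"],
                    "abc": keys_by_dimension["abc"],
                }
            }
        }
    }


# Example: config generated for the ("a", 2023-01-01) partition.
example = run_config_for_dimensions({"abc": "a", "time": "2023-01-01"})
print(example["ops"][OP_NAME]["config"])
```

In a setup like the one in the thread, the callback given to PartitionedConfig would pass the partition's `keys_by_dimension` mapping to this function. Keeping the mapping in a standalone function also makes it easy to unit test without launching a run.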