# dagster-feedback
s
I'm having a lot of trouble with partitioned config. On paper it seems perfect for my use case: I generate the same type of asset for a number of configurations, where the configurations can be enumerated with a string. But I keep getting lost using the feature. So I started with something like this:
CONFIG_KEYS = ['a', 'b', ...] 

@op
def load_config(config_key: str):
    # use a resource to load the config by key
    return {}

@asset(partitions_def=StaticPartitionsDefinition(CONFIG_KEYS))
def do_something(context):
    config = load_config(context.partition_key)
    # run the rest of the graph
    pass
I wasn't entirely happy with this. I would like to leverage the Dagster Config abstraction. PartitionedConfig sounded like just the ticket, but I realize it's actually much more verbose for the same end result:
@static_partitioned_config(partition_keys=CONFIG_KEYS)
def my_config(partition_key: str):
    # use a resource to load the config by key
    config = {}
    return { 
        "ops": { 
            "load_config": { 
                "config": config
            }
        }
    }

@op
def load_config(context, config: my_config):
    return config

@job
def do_something(context):
    config = load_config()
    # run the rest of the graph
    pass
Why do I have to map to ops in my_config? Worse, I can't actually figure out how to provide the schema for my config with this setup. It seems very vaguely covered in the documentation. I must be totally misunderstanding this feature.
s
Hi Sean, I agree PartitionedConfig is not very well-documented. I think the main point of confusion here is between run config and op/asset config. Run config is the full set of configuration bindings for a run. It includes config for logging, execution, and more. It also includes the config for each op/asset under the top-level “ops” key. PartitionedConfig is for generating run config from a partition key. That is why when you use PartitionedConfig, you need to specify “ops”.
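To make the distinction concrete, here is a rough sketch of the shapes involved, written in plain Python without importing Dagster. The helper name `run_config_for_partition` and the `some_param` field are hypothetical and chosen only for illustration; the nesting under the top-level "ops" key is the part described above.

```python
from typing import Any, Dict


def run_config_for_partition(partition_key: str) -> Dict[str, Any]:
    """Build a full run config dict for one partition (hypothetical helper)."""
    # Op/asset config: the small piece that a single op or asset reads.
    op_config = {"some_param": f"hello {partition_key}"}
    # Run config: the whole document for a run. Per-op config is nested
    # under the top-level "ops" key, keyed by the op/asset name.
    return {"ops": {"do_something": {"config": op_config}}}


run_config = run_config_for_partition("a")
print(run_config["ops"]["do_something"]["config"])
```

The function returns the outer (run config) shape, while an op or asset only ever sees the inner `config` mapping.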
Worse, I can’t actually figure out how to provide the schema for my config with this setup.
Here’s a minimal example of how to use PartitionedConfig to configure a simple job with a single asset:
from dagster import (
    Config,
    Definitions,
    StaticPartitionsDefinition,
    asset,
    define_asset_job,
    static_partitioned_config,
)

PARTITION_KEYS = ["a", "b"]


@static_partitioned_config(partition_keys=PARTITION_KEYS)
def my_config(partition_key: str):
    return {"ops": {"do_something": {"config": {"some_param": f"hello {partition_key}"}}}}


class DoSomethingConfig(Config):
    some_param: str


@asset(partitions_def=StaticPartitionsDefinition(PARTITION_KEYS))
def do_something(context, config: DoSomethingConfig):
    return config


my_job = define_asset_job("my_job", [do_something], config=my_config)

defs = Definitions(assets=[do_something], jobs=[my_job])
s
Thanks for the response @sean. I feel like this isn't something I'd actually ever use. Seems like such a niche abstraction to centralize a mapping of all op configs to partition key. Personally I feel like the individual map of an op config to an op by partition key is useful and far more intuitive. Really appreciate the support though, cleared up my understanding.
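For what it's worth, the per-op feel can probably be approximated on top of the centralized function with a small helper that merges per-op functions into one run config. This is only a sketch in plain Python: `combine_op_configs`, `load_config_for`, and the `config_key` field are hypothetical names and not part of Dagster. The resulting function would presumably still need to be wrapped with `static_partitioned_config`, as in the example above.

```python
from typing import Any, Callable, Dict

# A per-op config function maps a partition key to that op's config dict.
OpConfigFn = Callable[[str], Dict[str, Any]]


def combine_op_configs(
    op_config_fns: Dict[str, OpConfigFn],
) -> Callable[[str], Dict[str, Any]]:
    """Merge per-op config functions into one run-config function (hypothetical)."""

    def run_config_fn(partition_key: str) -> Dict[str, Any]:
        # Nest each op's config under the top-level "ops" key by op name.
        return {
            "ops": {
                op_name: {"config": fn(partition_key)}
                for op_name, fn in op_config_fns.items()
            }
        }

    return run_config_fn


def load_config_for(partition_key: str) -> Dict[str, Any]:
    # Hypothetical per-op mapping: config for the "load_config" op only.
    return {"config_key": partition_key}


build_run_config = combine_op_configs({"load_config": load_config_for})
print(build_run_config("b"))
```

Each op keeps its own small mapping from partition key to config, and the "ops" nesting is written once in the helper.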
s
That’s fair. Dagster’s config system has been getting overhauled recently and could probably benefit from some love here. The “centralized” architecture of PartitionedConfig is probably due to history, as there was a time when all executions were initiated through a job, which required specifying the entire run config at once. The introduction of assets changed that and made more granular executions (like materialization of one asset) more of a first-class use case. cc @sandy, has the idea of an asset/op-scoped PartitionedConfig been considered?
s
Now I'm reading up on IOManagers, got to "Providing per-input config to an input manager", and I'm having déjà vu. I think the high-level feedback is that it feels like there are 8 ways to do everything, and I'm left feeling overwhelmed with options and not knowing which option matches my particular need.