are asset factories an anti-pattern or best practi...
# dagster-feedback
r
Are asset factories an anti-pattern or best practice? I frequently find the need to build N assets with the same core logic but different parameters. E.g. let's say I have 10 configuration objects `complex_config_x` and a function `config_to_asset(cfg)`: how do I turn this into 10 assets in the most dagsteronic way? Factories? Configured assets? Graph assets with configured ops? Partitions?
n
We're using a lot of factories for all the Dagster objects (assets, ops, sensors). The pattern is also mentioned in the docs, and I think it's a valid one. It does add a layer of complexity to the code, but staying DRY is more important.
d
Same here - using factories to generate N assets from a static config list
r
Yes, I've been using them extensively already. I was just wondering, with all the work going into configuration, whether there was a way to make it more explicit that the assets use the same underlying definition. Then again, I also find that I often need to dynamically build dependencies and the like, so there might be no way around the factory.
s
I think factories are generally considered a good practice. One potential drawback I could think of is that they don't explicitly allow the UI to represent the fact that two different assets share the same underlying template - is that important to you?
l
I also have a wealth of assets that are configured as yml entries and generated by factories. It works really well to hide the configuration interface from users, just giving them a button to push when they need a materialization. I've kind of hijacked/repurposed the `compute_kind` label to denote what kind of template it's following. It's a nice visible label and it doesn't force users to stick to a certain key prefix or asset group.
r
I think it's useful to be able to identify similar assets in the UI, similar to what Leo is suggesting with the badges (edge annotations like "clean", "copy", etc. could also be nice at some point). My main reason for raising the topic was that the factory pattern passes configuration in a somewhat oblique fashion. At the moment, I need to build a sequence of assets that are produced serially and need to be aware of prior history, so I have a factory like the following:
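from dagster import AssetIn, MetadataValue, asset

config = PydanticModel()  # placeholder for the Pydantic config object

def chain_factory(config, last_chain_asset):
    # Note: every call produces an asset named "chain_node_asset";
    # pass name=... to @asset when building more than one link.
    @asset(
        ins={"last_chain_link": AssetIn(key=last_chain_asset.key)},
        metadata={"config": MetadataValue.json(config.dict())},
    )
    def chain_node_asset(last_chain_link):
        # config passed as closure
        return asset_from_config(config, last_chain_link)

    # Return the generated asset so the caller can chain from it.
    return chain_node_asset

next_asset = chain_factory(config, first_asset)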
but that means we completely bypass Dagster's own configuration system and only keep an informal record in the metadata. Could this be done otherwise?