#dagster-feedback

William

08/11/2022, 2:53 PM
I think software-defined assets are a great idea, but in a lot of cases my job run generates different data outputs with different run configs, so we cannot declare all of them at definition time. For now I can only use context.log_event(AssetMaterialization) as a means of annotation. How could I achieve a declarative job run using jobs and ops? By declarative I mean something like assets or CMake targets: making Dagster aware of what's generated & available, then only running those necessary.

Stephen Bailey

08/11/2022, 3:31 PM
One way you can do this is with asset factories. Basically, create a function that accepts configuration and generates separate assets based on the provided configuration. This allows you to reuse the same logic but make many different assets.
from dagster import asset


def create_asset(custom_name, custom_value):
    # Factory: build one asset definition per (name, value) pair.
    @asset(name=custom_name)
    def _generated_asset(context):
        context.add_output_metadata({"value": custom_value})
        return custom_value

    return _generated_asset


# Generate one asset for each configuration entry.
asset_list = []
for k, v in [("foo", 1), ("bar", 2)]:
    a = create_asset(k, v)
    asset_list.append(a)
We use this pattern in conjunction with YAML files to generate templated pipelines for our ML use cases.

sandy

08/11/2022, 4:00 PM
"By declarative I mean something like assets or CMake targets: making Dagster aware of what's generated & available, then only running those necessary."
Mind expanding on this a little more? Is the idea that, once an asset has been dynamically generated once, you'd like to be able to see it in the asset graph and launch re-materializations of it? This (not yet implemented) might be what you're looking for: https://github.com/dagster-io/dagster/issues/7943

Samuel Stütz

08/11/2022, 4:36 PM
I have used the asset factory pattern, and it is cool to have a list of configs render to a number of assets. The unique resource definition “keys/defs” mess with the code a fair bit, though: while I can duplicate a number of assets under different prefixes, I would also need to give each (group) different resource configs (auth credentials). Per-op config I can handle, and I also found that one can write ops and then generate the assets from those via AssetsDefinition.from_op.