
Sean Lopp

09/28/2022, 2:53 PM
I am looking for guidance on multi-asset usage. I have the following asset definitions (simplified) which work well:
@asset
def a():
  same_logic("a")

@asset
def b():
  same_logic("b")

@asset
def c():
  same_logic("c")

@asset(
   ins = {"a": AssetIn(), "b": AssetIn(), "c": AssetIn()},
)
def all(a, b, c):
   process(a,b,c)
I'm wondering if there is a more DRY approach to generating the assets. This doesn't seem to be exactly the use case for multi-assets, since each of a, b, c is generated by an independent call. I've considered programmatically yielding assets instead of using the decorator, but I wanted to check if I was missing anything obvious before heading down that path. I am also not sure of the best way to update
all
when new ins are added. Thanks for the advice!

claire

09/28/2022, 6:52 PM
Hi Sean. One thing you can consider to remove some duplication with assets
a, b, c...
is to define an asset factory, which might be what you mentioned about programmatically yielding assets:
def asset_factory(asset_keys: List[str]):
    assets = []
    for key in asset_keys:

        @asset(name=key)
        def my_asset():
            same_logic(key)

        assets.append(my_asset)
    
    return assets

Sean Lopp

09/29/2022, 6:56 PM
This worked great, thanks @claire! For anyone who revisits the asset factory pattern: if a downstream asset needs to depend on all the assets from the asset factory, this can be done as:
assets = asset_factory(asset_keys)

@asset(ins = {key: AssetIn() for key in asset_keys})
def final_asset(**assets):
   stuff

Hendrik

10/02/2022, 6:20 PM
Unfortunately the for-loop solution doesn't work for me. Every asset created within the loop ends up using the last key in the list for same_logic(key). E.g. the output of materializing asset "2" is "4". Any thoughts on this?
from typing import List

from dagster import asset, repository


def asset_factory(asset_keys: List[str]):
    assets = []
    for key in asset_keys:

        @asset(name=key)
        def my_asset():
            print(key)

        assets.append(my_asset)

    return assets


my_assets = asset_factory(["1","2","3","4"])

@repository
def my_asset_repo():
    return my_assets

Sean Lopp

10/02/2022, 10:25 PM
Yea, the value of
key
inside of the function is evaluated after the loop, when the asset function is called. The name is set correctly though. I was able to use the following to get this to work:
for key in asset_keys:

        @asset(
            required_resource_keys={"snocountry_api"},
            name=key,
            io_manager_key="gcs_io_manager",
        )
        def my_asset(context) -> pd.DataFrame:
            api = context.resources.snocountry_api
            resort_id = get_resort_id_from_name(context.op_def.name)
            report = api.get_resort(resort_id)
            return pd.DataFrame(report)

        assets.append(my_asset)
    
    return assets

resort_assets = asset_factory(asset_keys, resorts)
Because
context.op_def.name
is evaluated when the function is called (not when it is defined), it correctly resolves to the value of
key

Daniel Gafni

10/19/2022, 8:17 PM
I'm having the same issue.
from typing import List

from dagster import AssetsDefinition, OpExecutionContext, asset


def generate_assets(keys: List[str]) -> List[AssetsDefinition]:
    assets = []

    for key in keys:

        @asset(
            name=f"asset_{key}",
        )
        def my_asset(context: OpExecutionContext) -> str:
            context.log.info(f"Processing key {key}")
            return key

        assets.append(my_asset)

    return assets

assets = generate_assets(keys=["a", "b"])
The
assets
get correct names (everything that goes into the
asset
decorator is correct), but the
compute_fn
is wrong: every asset ends up using the key from the last loop iteration. In this case, all the assets return
"b"
. That's a Python thing... the following normal Python code behaves the same:
funcs = []
for key in ["a", "b"]:
    def func():
        return key
    funcs.append(func)
    
funcs[0]()
Out[4]: 'b'
funcs[1]()
Out[5]: 'b'
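The late binding shown above can be avoided in plain Python by fixing the value at definition time. This is a minimal sketch with no Dagster involved, and make_func is a hypothetical helper name chosen for illustration:

```python
funcs_default = []
funcs_helper = []


def make_func(key):
    # Hypothetical helper: each call creates a new scope, so the inner
    # function closes over this call's own `key`, not the loop variable.
    def func():
        return key

    return func


for key in ["a", "b"]:
    # Default argument values are evaluated once, when the function is defined.
    def func(key=key):
        return key

    funcs_default.append(func)
    funcs_helper.append(make_func(key))

print([f() for f in funcs_default])  # ['a', 'b']
print([f() for f in funcs_helper])   # ['a', 'b']
```

The helper-function form is probably the one that carries over to assets, since the parameters of an asset function are normally interpreted as upstream inputs.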
Using context.op_def.name to pass data seems more like a hack than a good solution. Can anybody recommend a better way of doing this? Also, it doesn't seem like the asset config is the correct place to pass this data either. I want different assets to be materialized with different functions consistently. @sandy maybe you can help?

sandy

10/19/2022, 9:44 PM
if you were to wrap the
@asset
in a separate function that's defined outside
generate_assets
and that accepts a key, would that fix the problem?

Daniel Gafni

10/19/2022, 11:30 PM
Yep! Thanks