Sean Lopp
09/28/2022, 2:53 PM
@asset
def a():
    same_logic("a")

@asset
def b():
    same_logic("b")

@asset
def c():
    same_logic("c")

@asset(
    ins={"a": AssetIn(), "b": AssetIn(), "c": AssetIn()},
)
def all(a, b, c):
    process(a, b, c)
I'm wondering if there is a more DRY approach to generating the assets? This doesn't seem to be exactly the use case for multi-assets, since each of a, b, c is generated by an independent call. I've considered programmatically yielding assets instead of using the decorator, but I wanted to check if I was missing anything obvious before heading down that path. I am also not sure of the best way to update all when new ins are added. Thanks for the advice!
claire
09/28/2022, 6:52 PM
a, b, c... is to define an asset factory, which might be what you mentioned about programmatically yielding assets:
def asset_factory(asset_keys: List[str]):
    assets = []
    for key in asset_keys:
        @asset(name=key)
        def my_asset():
            same_logic(key)
        assets.append(my_asset)
    return assets
Sean Lopp
09/29/2022, 6:56 PM
assets = asset_factory(asset_keys)

@asset(ins={key: AssetIn() for key in asset_keys})
def final_asset(**assets):
    stuff
Hendrik
10/02/2022, 6:20 PM
from typing import List
from dagster import asset, repository

def asset_factory(asset_keys: List[str]):
    assets = []
    for key in asset_keys:
        @asset(name=key)
        def my_asset():
            print(key)
        assets.append(my_asset)
    return assets

my_assets = asset_factory(["1", "2", "3", "4"])

@repository
def my_asset_repo():
    return my_assets
Sean Lopp
10/02/2022, 10:25 PM
key inside of the function is evaluated after the loop, when the asset function is called. The name is set correctly though. I was able to use the following to get this to work:
    for key in asset_keys:
        @asset(
            required_resource_keys={"snocountry_api"},
            name=key,
            io_manager_key="gcs_io_manager",
        )
        def my_asset(context) -> pd.DataFrame:
            api = context.resources.snocountry_api
            resort_id = get_resort_id_from_name(context.op_def.name)
            report = api.get_resort(resort_id)
            return pd.DataFrame(report)
        assets.append(my_asset)
    return assets

resort_assets = asset_factory(asset_keys, resorts)
Because context.op_def.name is evaluated when the function is called (not when it is created), it correctly resolves to the value of key
Daniel Gafni
10/19/2022, 8:17 PM
def generate_assets(keys: List[str]) -> List[AssetsDefinition]:
    assets = []
    for key in keys:
        @asset(
            name=f"asset_{key}",
        )
        def my_asset(context: OpExecutionContext) -> str:
            context.log.info(f"Processing key {key}")
            return key
        assets.append(my_asset)
    return assets

assets = generate_assets(keys=["a", "b"])
The assets get correct names (everything that goes into the asset decorator is correct), but the compute_fn is wrong - it's set to the last iteration of the for loop. In this case, all the assets return "b".
That's a Python thing... the following normal Python code behaves the same:
funcs = []
for key in ["a", "b"]:
    def func():
        return key
    funcs.append(func)

funcs[0]()
Out[4]: 'b'
funcs[1]()
Out[5]: 'b'
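A minimal sketch of the usual plain-Python workaround, not taken from the thread: create each function inside a helper so that it closes over its own key. The names make_func and funcs_fixed are illustrative only.

```python
def make_func(key):
    # Each call to make_func creates a new scope, so `key` is bound
    # per function and is not shared with the loop variable.
    def func():
        return key
    return func


funcs_fixed = [make_func(key) for key in ["a", "b"]]

print(funcs_fixed[0]())  # a
print(funcs_fixed[1]())  # b
```

Binding the value through a default argument (def func(key=key)) is another common idiom in plain Python, but under a decorator that interprets function parameters as inputs, the helper-function form is probably the safer choice.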
sandy
10/19/2022, 9:44 PM
@asset in a separate function that's defined outside generate_assets and that accepts a key, would that fix the problem?
Daniel Gafni
10/19/2022, 11:30 PM