Rene Czepluch
09/02/2023, 10:36 AM
https://hacker-news.firebaseio.com/v0/topstories.json
This is saved via:
import json
import os
import requests
from dagster import asset

@asset  # add the asset decorator to tell Dagster this is an asset
def topstory_ids() -> None:
    newstories_url = "https://hacker-news.firebaseio.com/v0/topstories.json"
    top_new_story_ids = requests.get(newstories_url).json()[:100]
    os.makedirs("data", exist_ok=True)
    with open("data/topstory_ids.json", "w") as f:
        json.dump(top_new_story_ids, f)
But what if we had another source, like https://hacker-news.firebaseio.com/v1/topstories.json? It's a made-up example, but I have tried:
newstories = [
    "https://hacker-news.firebaseio.com/v0/topstories.json",
    "https://hacker-news.firebaseio.com/v1/topstories.json",
]
for newstory in newstories:
    i = 0
    @asset(name=f"func_{i}", group_name="same_group")
    def topstory_ids() -> None:
        newstories_url = newstory
        top_new_story_ids = requests.get(newstories_url).json()[:100]
        os.makedirs("data", exist_ok=True)
        with open("data/topstory_ids.json", "w") as f:
            json.dump(top_new_story_ids, f)
    i += 1
This just results in a single asset.
DB
09/03/2023, 11:11 AM
from dagster import asset, AssetsDefinition, Definitions
def generate_asset(name: str) -> AssetsDefinition:
    @asset(name=name)
    def _asset() -> None:
        ...
    return _asset

asset_names = ["foo", "bar"]
assets = [generate_asset(x) for x in asset_names]
defs = Definitions(assets=assets)
Edit: it does seem like the helper functions (like load_assets_from_package_module) can deal with loops, but you still have to create a list of assets. In your example you overwrite the same name, topstory_ids, with different assets, so only one can be found. You could try:
...
assets = []
for newstory in newstories:
    ...
    def topstory_ids():
        ...
    ...
    assets.append(topstory_ids)
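One caveat with the loop version, shown here as a plain-Python sketch with no Dagster involved. The names sources, fetch and make_fetch are made up for illustration. A function defined inside a for loop reads the loop variable when it is called, not when it is defined, so every function collected in the loop ends up seeing the last value. Asset functions are ordinary Python functions, so the same would presumably apply to newstory in the loop above. The factory approach (as in generate_asset) avoids this, because each call binds its own argument.

```python
from typing import Callable, List

# Hypothetical source names, standing in for the URLs in the thread.
sources = ["v0", "v1"]

# Pattern 1: define the function directly in the loop body.
# Each function looks up `source` when it is called, not when it is defined.
looped: List[Callable[[], str]] = []
for source in sources:
    def fetch() -> str:
        return source  # late binding: reads the loop variable at call time
    looped.append(fetch)


# Pattern 2: a factory function.
# The argument is bound once per call, so each function keeps its own value.
def make_fetch(src: str) -> Callable[[], str]:
    def fetch() -> str:
        return src
    return fetch


built = [make_fetch(s) for s in sources]

looped_results = [f() for f in looped]
built_results = [f() for f in built]
print(looped_results)  # ['v1', 'v1']
print(built_results)   # ['v0', 'v1']
```

So if you go with the loop, it is probably safer to pass the URL into a factory function and give each asset its own name and output file, rather than reading newstory inside the function body.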
Rene Czepluch
09/04/2023, 7:53 AM