What's the recommended pattern to build graph-backed `multi_assets`? | dagster #ask-community

Vinnie

08/10/2022, 10:08 AM

What’s the recommended pattern to build graph-backed `multi_assets`? Say I receive an email every so often and depending on the contents/attachments, it should rematerialize a subset of assets in a `@multi_asset`. This would mean kicking off a dynamic amount of computations, one for each attachment, preferably in parallel. Thinking in jobs/graphs, the pattern would be fairly straightforward: yielding `DynamicOutput`s from an op and calling `.map()`. I would love to split the computations into different `ops` connected through a `graph`, but am struggling to map the outputs from a `DynamicOut` to assets. Is it even possible to back `multi_assets` with a graph when using `DynamicOut`s? The following code snippet is the interim solution I landed on.

@multi_asset(
    required_resource_keys={"foo", "bar"},
    outs={"baz": Out(is_required=False), "qux": Out(is_required=False)},
)
def my_cool_assets(context):
    res = context.resources.foo.fetch()
    for elem in res:
        processed_elem = context.resources.bar.process(elem)
        yield Output(value=processed_elem["data"], output_name=processed_elem["type"])

🤖 1

Vinnie

08/10/2022, 10:16 AM

Also interested to know if there’s a way to pass a `key_prefix` to `@multi_asset`; it doesn’t seem to be in the docs.

Aaron

08/10/2022, 6:05 PM

Based on the docs, while I think you can generate multiple assets with a graph-backed asset, I don’t think it’s possible to selectively materialize only a subset of them. (I’m new to Dagster, so I may be wrong here.)

claire

08/10/2022, 7:39 PM

Hm... currently it's not possible to dynamically generate assets, because there's no way to map a dynamic output to a certain output of a graph-backed asset. We also don't currently support subsetting graph-backed assets, but that's something we're hoping to enable soon. For now, the solution you posted that yields optional outputs is probably the way to go. We do have an example in our docs that does something like this: https://docs.dagster.io/concepts/assets/multi-assets#subsetting-multi-assets The only difference is that the example passes `can_subset=True` into the `@multi_asset` decorator, which enables functionality such as selecting just one asset of your multi-asset to materialize in Dagit.

Vinnie

08/11/2022, 5:17 AM

Thanks @claire, I did have the subset flag active, forgot to paste. Hoping the support gets extended in the future 🙂 How about the asset keys? I’d need to prefix the keys a specific way for the rest of my IO logic to work.

Averell

08/11/2022, 2:10 PM

@Vinnie I tried something like this for the prefix, and it worked.

@multi_asset(
    partitions_def=daily_partitions,
    can_subset=True,
    required_resource_keys={"my_client"},
    outs={
        "A": AssetOut(
            key_prefix=["raw"],
        ),
        "B": AssetOut(
            key_prefix=["raw"],
        ),
    }
)

Averell

08/11/2022, 2:12 PM

@claire I still got this error `DagsterStepOutputNotFoundError: Core compute for op "my_assets" did not return an output for non-optional output "B"` when I already set `can_subset=True`. Could you help have a look? Thanks

Vinnie

08/11/2022, 2:13 PM

Thanks, that’s helpful. For some reason I couldn’t find `AssetOut` in the docs. The error seems to be from dagster thinking all assets are required. Should be able to fix it by adding `is_required=False`. From what I can see, the parameter is allowed: https://github.com/dagster-io/dagster/blob/master/python_modules/dagster/dagster/_core/definitions/asset_out.py#L63

🌈 1

Averell

08/11/2022, 2:23 PM

Thanks @Vinnie. That also works 🙂

Vinnie

08/11/2022, 2:23 PM

team work makes the dream work!

🌈 2

claire

08/11/2022, 4:35 PM

Looks like `key_prefix` is missing as a parameter on `@multi_asset`. I can file an issue to add this, and file another issue to document `AssetOut`.

claire

08/11/2022, 4:35 PM

@Dagster Bot issue add key_prefix to multi_asset

Dagster Bot

08/11/2022, 4:35 PM

Created issue at: https://github.com/dagster-io/dagster/issues/9344

claire

08/11/2022, 4:36 PM

@Dagster Bot docs missing documentation for AssetOut

Dagster Bot

08/11/2022, 4:36 PM

Created issue at: https://github.com/dagster-io/dagster/issues/9345
