What’s the recommended pattern to build graph-back...
# ask-community
v
What’s the recommended pattern to build graph-backed `multi_assets`? Say I receive an email every so often and depending on the contents/attachments, it should rematerialize a subset of assets in a `@multi_asset`. This would mean kicking off a dynamic amount of computations, one for each attachment, preferably in parallel. Thinking in jobs/graphs, the pattern would be fairly straightforward: yielding `DynamicOutput`s from an op and calling `.map()`. I would love to split the computations into different `ops` connected through a `graph`, but am struggling to map the outputs from a `DynamicOut` to assets. Is it even possible to back `multi_assets` with a graph when using `DynamicOut`s? The following code snippet is the interim solution I landed on.
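from dagster import Out, Output, multi_asset


@multi_asset(
    required_resource_keys={"foo", "bar"},
    # Both outputs are optional, so a run may yield any subset of them.
    outs={"baz": Out(is_required=False), "qux": Out(is_required=False)},
)
def my_cool_assets(context):
    res = context.resources.foo.fetch()
    for elem in res:
        processed_elem = context.resources.bar.process(elem)
        # Route each element to the output named by its type ("baz" or "qux").
        yield Output(value=processed_elem["data"], output_name=processed_elem["type"])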
Also interested to know if there’s a way to pass a `key_prefix` to `@multi_asset`, doesn’t seem to be in the docs.
a
Based on the docs, while I think you can generate multiple assets with a graph-backed asset, I don’t think it’s possible to selectively materialize only a subset of them. (I’m new to Dagster so I may be wrong here.)
c
Hm... currently it's not possible to dynamically generate assets, because there's no way to map a dynamic output to a certain output of a graph-backed asset. We also don't currently support subsetting graph-backed assets, but that's something we're hoping to enable soon. For now, the solution you posted that yields optional outputs is probably the way to go. We do have an example in our docs that does something like this: https://docs.dagster.io/concepts/assets/multi-assets#subsetting-multi-assets The only difference here is that the example passes `can_subset=True` into the `@multi_asset` decorator, which enables functionality such as selecting just one asset of your multi-asset to materialize in Dagit.
v
Thanks @claire, I did have the subset flag active, forgot to paste. Hoping the support gets extended in the future 🙂 How about the asset keys? I’d need to prefix the keys a specific way for the rest of my IO logic to work.
a
@Vinnie I tried something like this for the prefix, and it worked.
@multi_asset(
    partitions_def=daily_partitions,
    can_subset=True,
    required_resource_keys={"my_client"},
    outs={
        "A": AssetOut(
            key_prefix=["raw"],
        ),
        "B": AssetOut(
            key_prefix=["raw"],
        ),
    }
)
@claire I still got this error even though I already set `can_subset=True`:
DagsterStepOutputNotFoundError: Core compute for op "my_assets" did not return an output for non-optional output "B"
Could you take a look? Thanks
v
Thanks, that’s helpful. For some reason I couldn’t find `AssetOut` in the docs. The error seems to come from Dagster treating all outputs as required. You should be able to fix it by adding `is_required=False` to each `AssetOut`. From what I can see, the parameter is allowed: https://github.com/dagster-io/dagster/blob/master/python_modules/dagster/dagster/_core/definitions/asset_out.py#L63
a
Thanks @Vinnie. That also works 🙂
v
team work makes the dream work!
c
Looks like `key_prefix` is missing as a parameter on `@multi_asset`. I can file an issue to add this, and file another issue to document `AssetOut`.
@Dagster Bot issue add key_prefix to multi_asset
d
c
@Dagster Bot docs missing documentation for AssetOut
d