
Chris Histe

11/18/2022, 1:14 PM
Hello there, I tried finding best practices regarding `AssetKey` but couldn't find anything in the docs. It would be great to have a dedicated page for that. In the meantime, what are the recommendations on `AssetKey`? Here are a few questions that I can't really answer myself. Thanks in advance!
• Should we always use the default one created by `@asset`?
• What about for `@multi_asset`? I think we must explicitly define the `AssetKey` in that case.
• Should we use different keys if the asset was materialised within a job vs directly?
• Is there a recommended path, e.g. `name_of_repository/name_of_job/name_of_asset/name_of_partitions`?
• When to use `key` vs `key_prefix`?
• What's the default path, especially when running with partitioned jobs?
• Is there a way to programmatically get the current `AssetKey` within a decorated function? It doesn't seem like the context has a property for that. I'd like to use it for my `AssetObservation`.

claire

11/18/2022, 11:39 PM
Hi Chris, these are all really good questions and would definitely be valuable to document in a guide (cc @erin). I can try to take a stab at these questions in chunks:
• The most straightforward and ergonomic way to define an asset key is the default definition using `@asset`. There are situations where you'd want to set additional fields, which mostly serve to differentiate between many assets. For example, in a large organization you might add a key prefix containing your team name to avoid key collisions.
• The major use case for `@multi_asset`s is generating assets whose computation is hard to separate by key. One example is integrations, where you might run a single command in dbt that outputs two assets.
• I recommend grouping all of the assets in a folder together, regardless of the job they belong to. It's easier to define relationships between assets this way, and you can use helper methods like `load_assets_from_package_modules`. Within this folder, you can create sub-folders to separate assets by business domain.
• As mentioned earlier, I think key prefixes (`key_prefix`) are the most helpful for avoiding collisions. In Dagit, assets are also placed in a sidebar following a folder structure organized by key prefix. The intent here is to be able to separate assets by domain.
• Not sure what you mean by the default path for partitioned jobs. I think the same practices apply for regular assets and partitioned assets.
• It is possible to get an asset key from the context, via `context.asset_key_for_output`.

Chris Histe

11/22/2022, 1:16 PM
This is really helpful, thanks @claire. Am I understanding correctly that we should not set the `AssetKey` directly but only the `key_prefix`?

claire

11/22/2022, 4:46 PM
It all compiles down to the same thing! So passing a `key_prefix` versus passing an `AssetKey` is the same.

Chris Histe

11/22/2022, 5:12 PM
Ok, thanks 🙂