sandy
10/14/2021, 4:23 PM
@asset and build_assets_job - that enable constructing Dagster jobs that put assets at the forefront.
A software-defined asset is a description, in code, of how to compute a particular asset - e.g. a table, ML model, or report. It inverts the typical relationship between data products and data pipelines. Instead of defining a graph of ops and recording which assets those ops materialize, you define a set of assets, each of which knows how to compute its contents from upstream assets.
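To make the inversion concrete, here is a minimal plain-Python sketch of the idea. It is not Dagster's API or implementation: the `asset` decorator, the `ASSETS` registry, and `materialize_all` below are hypothetical stand-ins, and the assumption that upstream assets are named by function parameters is made only for illustration.

```python
import inspect
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical registry: asset name -> (compute function, upstream asset names).
ASSETS = {}


def asset(fn):
    """Toy stand-in for an asset decorator; NOT Dagster's implementation.

    Assumption for this sketch: each parameter name is the name of an
    upstream asset, so the dependency graph falls out of the signatures.
    """
    upstream = list(inspect.signature(fn).parameters)
    ASSETS[fn.__name__] = (fn, upstream)
    return fn


@asset
def raw_users():
    # A source asset with no upstream dependencies.
    return [{"name": "ada", "active": True}, {"name": "bob", "active": False}]


@asset
def active_users(raw_users):
    # Computed from the contents of the upstream asset raw_users.
    return [u for u in raw_users if u["active"]]


@asset
def active_user_count(active_users):
    return len(active_users)


def materialize_all():
    """Compute every registered asset, upstream assets first."""
    graph = {name: set(deps) for name, (_, deps) in ASSETS.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        fn, deps = ASSETS[name]
        results[name] = fn(*(results[d] for d in deps))
    return results


results = materialize_all()
```

No wiring code appears anywhere: the execution order is derived from the asset definitions themselves, which is the point of the approach.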
Taking a software-defined asset approach has a few main benefits:
• Write less code - each asset knows about the assets it depends on, so you don't need to use @job / @pipeline to wire up dependencies between your ops.
• Track cross-job dependencies via asset lineage - Dagit shows you the parents and children of assets, even if they live in different jobs. This helps locate the sources of problems and the consequences of changing or removing an asset.
• Learn when you need to take action on an asset - In a unified view, Dagster compares the assets you've defined in code to the assets you've materialized in storage. You can catch when an asset is stale and needs to be regenerated.
• dbt-native orchestration - software-defined assets match the mental model of tools like dbt, making it sensible for Dagster to ingest and orchestrate graphs of dbt models. A dbt model is essentially a software-defined asset.
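The lineage and "take action" points can be sketched in a few lines of plain Python. Everything here is hypothetical: the asset names, the `DEFINED` and `MATERIALIZED` structures, and the helper functions are illustrative only and say nothing about how Dagster or Dagit store or compare this information. The sketch also only detects assets that are missing from storage or no longer defined in code; it does not model staleness over time.

```python
# Hypothetical asset definitions: asset name -> names of its upstream assets.
DEFINED = {
    "raw_events": [],
    "cleaned_events": ["raw_events"],
    "daily_report": ["cleaned_events"],
}

# Hypothetical record of which assets currently exist in storage.
MATERIALIZED = {"raw_events", "cleaned_events", "old_report"}


def parents(name):
    """Assets that the given asset is computed from."""
    return sorted(DEFINED[name])


def children(name):
    """Assets that would be affected by changing or removing the given asset."""
    return sorted(a for a, upstream in DEFINED.items() if name in upstream)


def never_materialized():
    """Assets defined in code that have no counterpart in storage."""
    return sorted(set(DEFINED) - MATERIALIZED)


def no_longer_defined():
    """Assets found in storage that no definition in code accounts for."""
    return sorted(MATERIALIZED - set(DEFINED))
```

Because the dependencies live on the assets rather than on a particular job, the same lookups work no matter which job ends up computing each asset.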
For a fuller description of the new APIs, including code examples, take a look at the GitHub discussion. For a fuller example, take a look at this asset-based version of the Hacker News demo jobs.
We're very excited about this direction, but it is still early days. We're looking for early design partners to work with us to make this a reality. If you want to take advantage of this approach and help shape the future of the Dagster product, please reach out to either me or Nick directly.
Jason
10/14/2021, 5:01 PM
schrockn
10/14/2021, 5:40 PM
Jason
10/14/2021, 6:21 PM
Donny Winston
10/16/2021, 12:57 AM
make. A classic pattern. Seems like it could be very useful. I’m glad 0.13.0 is coming soon. I’ve been liking the graph/op/job model.
David Loewenstern
10/21/2021, 5:15 PM
geoHeil
10/25/2021, 2:49 PM
sandy
10/25/2021, 3:32 PM