# announcements
Dagster 0.12.12 introduced a pair of experimental APIs for building Dagster jobs via “software-defined assets”.

Dagster is advancing in some big ways. Next week, we’ll release 0.13.0, which lays down a new foundation in the form of the graph, job, and op APIs. But this is only the start of our roadmap. On top of this foundation, we’ve been working on some big new capabilities.

One of these is software-defined assets. It revolves around two APIs that enable constructing Dagster jobs that put assets at the forefront. A software-defined asset is a description, in code, of how to compute a particular asset - e.g. a table, ML model, or report. It inverts the typical relationship between data products and data pipelines. Instead of defining a graph of ops and recording which assets those ops materialize, you define a set of assets, each of which knows how to compute its contents from upstream assets.

Taking a software-defined asset approach has a few main benefits:
• Write less code - each asset knows about the assets it depends on, so you don't need to wire up dependencies between your ops by hand.
• Track cross-job dependencies via asset lineage - Dagit shows you the parents and children of assets, even if they live in different jobs. This helps locate the sources of problems and the consequences of changing or removing an asset.
• Learn when you need to take action on an asset - In a unified view, Dagster compares the assets you've defined in code to the assets you've materialized in storage. You can catch when an asset is stale and needs to be regenerated.
• dbt-native orchestration - software-defined assets match the mental model of tools like dbt, making it sensible for Dagster to ingest and orchestrate graphs of dbt models. A dbt model is essentially a software-defined asset.

For a fuller description of the new APIs, including code examples, take a look at the GitHub discussion. For a fuller example, take a look at this asset-based version of the Hacker News demo jobs.

We're very excited about this direction, but it is still early days. We're looking for early design partners to work with us to make this a reality. If you want to take advantage of this approach and help shape the future of the Dagster product, please reach out to either me or Nick directly.
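To make the "each asset knows its upstream assets" idea concrete, here is a minimal plain-Python sketch. It does not use Dagster's experimental APIs: the `asset` decorator, the `ASSETS` registry, the helper functions, and the example asset names are all hypothetical and only illustrate the concept. It assumes that a function's parameter names identify the upstream assets it depends on, so the execution order and lineage can be derived from the function signatures rather than wired up separately.

```python
import inspect
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical registry (NOT Dagster's API): asset name -> compute function.
ASSETS = {}


def asset(fn):
    """Register fn as an asset; its parameter names name its upstream assets."""
    ASSETS[fn.__name__] = fn
    return fn


def upstream_of(name):
    """Parents of an asset, read directly from its function signature."""
    return list(inspect.signature(ASSETS[name]).parameters)


def downstream_of(name):
    """Children of an asset: every asset that lists it as a parameter."""
    return sorted(n for n in ASSETS if name in upstream_of(n))


def materialize_all():
    """Compute every registered asset in dependency order."""
    graph = {name: set(upstream_of(name)) for name in ASSETS}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        inputs = [results[dep] for dep in upstream_of(name)]
        results[name] = ASSETS[name](*inputs)
    return results


# Illustrative assets with made-up data.
@asset
def raw_orders():
    return [{"id": 1, "amount": 30}, {"id": 2, "amount": 12}]


@asset
def large_orders(raw_orders):
    return [o for o in raw_orders if o["amount"] > 20]


@asset
def order_report(large_orders):
    return f"{len(large_orders)} large order(s)"


if __name__ == "__main__":
    print(materialize_all()["order_report"])
    print(downstream_of("raw_orders"))
```

No separate wiring step appears anywhere in the sketch: adding a new asset that takes `order_report` as a parameter would be enough to place it in the graph, and `downstream_of` gives a rough picture of the kind of lineage lookup described above.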
Out of curiosity - what "stage" of current Dagster usage would you recommend for a data team to start with or try the transition over to the graphs and software-defined assets mindset? E.g.
• If you just installed Dagster - go to graphs and assets
• If you have 100+ prod pipelines - wait until things are "stable" (randomly picking v0.14.0)
• Doesn't matter because everything can co-mingle
Hey @Jason great question. For jobs, ops, and graphs (JOG), new users should just start using them. We are pushing out 0.13.0 next week, at which point all of our tutorials and docs will be JOG-focused and they will be the new blessed, supported APIs.
Software-defined assets (i.e. @asset) are experimental and subject to change, and we would recommend talking to us before using them in a serious way.
Awesome, thanks for the clarity on that!
“Software-defined assets” reminds me of
. A classic pattern. Seems like it could be very useful. I’m glad 0.13.0 is coming soon. I’ve been liking the graph/op/job model.
Reminds me a lot of dbt.
https://github.com/dagster-io/dagster/discussions/5024 mentions that partitioned assets are not yet supported. Do you already have an estimate when you will be able to support them?
hey @geoHeil - we don't yet have an ETA, partly because we're still figuring out what the requirements look like. I'm going to DM you to ask about how you'd ideally use partitioned assets.