Spencer Nelson

05/19/2023, 4:34 PM
I think Dagster needs better tools for the long-term lifecycle of a project. Let me explain with my scenario. I’m converting a bunch of old assets to a new underlying data model. Kind of a big old refactor. Instead of certain assets being dataframes, they’ll be a richer type, called a `Table`. Details aren’t important - what matters is that dataframes and tables are not compatible. So I have these old assets with keys like
, which are pandas dataframes. But I want to convert to `Table`s. Here are the options I see: 1. Reuse the
asset key, but return a non-dataframe type. Which is confusing, and will break dependents, which can possibly be managed with asset versioning in some way? But the name confusion would be unfortunate. 2. Write new
asset, change dependents to use it, and then delete
But this would destroy all history and orphan the materialized assets. Historical runs will be… broken? I don’t know what will happen to the dagit UI for them. 3. Write the new asset, but then keep
as relics of a bygone era. But they’ll clutter the UI and the codebase forever. Is there a way to mark assets as “archived” or “deprecated” or “just kept around in the attic?” Gradual migrations like this are really important. I think Dagster could provide tools to manage this, and they could be fantastically better than anything else out there, since Dagster knows so much about my computation graph. I don’t have a concrete suggestion but think this is an important area for new features.

Joel Olazagasti

05/19/2023, 4:45 PM
There is an open issue for asset key migrations, which would alleviate this somewhat

Andras Somi

05/19/2023, 4:46 PM
Not sure this is a perfect answer to your need, but for solution #2 you could write a one-off job that yields an "empty" AssetMaterialization event for every missing partition/asset. This would kinda fake the materialization history of the new asset in the UI and the Dagster db (but wouldn't actually calculate/materialize that data).

Spencer Nelson

05/19/2023, 4:51 PM
Yes, asset key migrations help for sure. I think it would still be a bit funky. One thing I sort of, maybe, kind of want would be a historical record of the iomanager used to materialize old asset versions, just so they’re recoverable.