Design questions with respect to data characterizations is t dagster #announcements


John Helewa

08/11/2020, 7:49 PM

Design question with respect to data characterizations: is that a role that can be filled with AssetMaterialization and the associated metadata entries, or should I put those characterizations in my database schema, outside of the Dagster framework? Use case: within my Dagster pipelines, the information that is produced or retrieved by the solids needs to be characterized with metadata about its purpose and, hopefully, its state. For instance, I need to label the information as "Decisional" if it's to be used for making a decision. I need to label the information as type "Decision" if it was the result of a decision. I need to label the data as "Awaiting_Authorization" or "Authorized" if it is in those states. The reason behind this is that I can calculate additional information from those labels. For instance, I can answer how long a decision or authorization took by looking at the timestamps of related data, with some help from the dependency graph. I guess you can tell I have some human-in-the-loop threads here.

sandy

08/11/2020, 9:29 PM

Hey John - one thing to consider with asset materializations is that they're immutable. I.e. if you need to track state that changes over time, asset materializations would be a tough fit, unless you're yielding a new materialization with each state change

John Helewa

08/11/2020, 11:48 PM

@sandy Yeah, I was concerned about their being immutable. I just learned about them recently. In some cases I'm tracking tasks that could be long running (hours) and doing periodic updates. In situations like that we want the latest metrics of the task, including that its state is ongoing. Probably best to support that outside of materializations. It was just the metadata aspect of them that caught my interest. Thanks.
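
For reference, a minimal sketch of the append-only approach discussed above: one immutable record per state change, with the time spent awaiting authorization derived from the timestamps. This is plain Python, not Dagster's API. `StateRecord`, `time_in_state`, and the `purchase_request` key are hypothetical names made up for illustration. In a pipeline, each record would roughly correspond to a newly yielded materialization carrying the labels as metadata.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class StateRecord:
    """Hypothetical stand-in for one immutable materialization and its metadata."""
    asset_key: str
    kind: str             # label from the thread, e.g. "Decisional" or "Decision"
    state: str            # e.g. "Awaiting_Authorization" or "Authorized"
    recorded_at: datetime


def time_in_state(
    log: List[StateRecord], asset_key: str, from_state: str, to_state: str
) -> Optional[timedelta]:
    """Time from the first record in from_state to the first later record in to_state."""
    start = None
    for rec in sorted(log, key=lambda r: r.recorded_at):
        if rec.asset_key != asset_key:
            continue
        if start is None and rec.state == from_state:
            start = rec.recorded_at
        elif start is not None and rec.state == to_state:
            return rec.recorded_at - start
    return None  # the transition has not happened (yet)


# State changes are appended as new records; existing records are never mutated.
log = [
    StateRecord("purchase_request", "Decision", "Awaiting_Authorization",
                datetime(2020, 8, 11, 9, 0)),
    StateRecord("purchase_request", "Decision", "Authorized",
                datetime(2020, 8, 11, 11, 30)),
]

elapsed = time_in_state(log, "purchase_request", "Awaiting_Authorization", "Authorized")
print(elapsed)  # 2:30:00
```

For long-running tasks with frequent metric updates, this log grows with every update, which is the trade-off that makes a mutable row in an external database the simpler fit for "latest state" queries.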
