# dagster-feedback
p
Are you all thinking about what role Dagster should play in defining/managing streaming data workflows? I would think it’s not uncommon for people to have streaming workflows mixed in with batch ones across their stack - maybe some data ingestion is streaming and those assets are consumed by assets Dagster knows about and materializes. Or alternatively you could have a materialized view downstream of assets refreshed by Dagster. The former at least could be modeled as an observable source asset but that feels incomplete in the sense that Dagster knows nothing about the definition of the asset. I like the idea of the orchestration tool as a control plane and would like that to extend to streaming data workflows as well. Maybe Dagster isn’t responsible for keeping those assets updated but should still know about their definition, freshness, lineage, etc in the same way.
👀 2
d
Hey @Prratek Ramchandani, we are actively thinking about this. I think you pointed out a couple of pathways that we’d consider: 1. representing its lineage in our graphs, 2. adding metadata about status. We’re trying to firm up our ideas and are starting to talk to people who have previously expressed interest in the topic. Would love to follow up with you separately to have a live conversation if you’ve got time in the next week or so. I’ll DM you
p
nice! i'd be happy to chat
q
@Dagster Jarred These 2 points would already be great! Do you have any update or GitHub discussion to follow?
d
hey @Quentin Gaborit @Prratek Ramchandani we put this one on the back burner temporarily to focus on helping teams monitor and evaluate Dagster resource consumption (which jobs are running longest, using the most credits, retrying the most, consuming Snowflake credits, etc).
p
that's exciting!