Does Dagster have plugin support?
# ask-community
m
Does Dagster have plugin support? Hi all! I'm working on setting up some new pipelines and figuring out what tools to use. One pattern we have now is that the software team maintains the core ETL, and the data science team contributes a few final stages to the pipeline to compute new metrics. The DS team's contributions tend to change more rapidly, may not be as robust, and the DS team may want to re-run their stages with new algorithms more often than the base pipeline runs. So it would be nice to version and release the pipeline stages from the two teams separately. Is there a way with Dagster to pull some portions of the pipeline from a separate repo or anything like that? (We could also manage this outside of the orchestration framework, like having CD combine portions of the pipeline from different sources, or having two dependent pipelines with one triggering the next.) Thanks in advance for advice!
r
Not a member of the Dagster team, but I think what you'd be looking for is repositories and workspaces
You can have various repositories of Dagster pipelines/schedules/sensors, and then pull them all into a workspace. The repositories do not have to live in the same location, i.e. they can be in different Git repos and published as separate Python packages.
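To make that concrete, here is a rough sketch of what the workspace file could look like when each team ships its own installed package. The package names `core_etl` and `ds_metrics` are made up for illustration, and the helper `render_workspace` is hypothetical (not part of Dagster); it only writes out the `load_from` / `python_package` layout of a `workspace.yaml`, so check the workspace docs for your Dagster version before relying on it.

```python
# Minimal sketch: render a workspace.yaml that loads one Dagster
# repository per installed Python package.
# Assumption: "core_etl" and "ds_metrics" are hypothetical package names,
# each exposing its own Dagster repository.


def render_workspace(packages):
    """Return workspace.yaml text with one python_package entry per package."""
    lines = ["load_from:"]
    for pkg in packages:
        lines.append(f"  - python_package: {pkg}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the file contents; redirect to workspace.yaml if it looks right.
    print(render_workspace(["core_etl", "ds_metrics"]), end="")
```

Each team can then release its package on its own schedule, and the workspace just lists whichever packages are installed in the environment.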
d
Hi Mark - the two solutions you mention in parentheses both make sense to me, depending on the details. A lot of the time you would see this kind of thing happen in CD, with the various stages built together into one image. It's pretty valuable to be able to associate a given pipeline run with a single image for versioning and debugging purposes. Multiple pipelines that are deployed independently could also make sense if there's a clear boundary point between the different stages.
m
Thanks raaid, workspaces look like a great tool for this. And thanks daniel, good point about versioning everything together. If Dagster stores the version info from a repo (I'm imagining pip installing a repo and maybe locking to a version in the workspace.yaml) that could solve the problem, too.
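One way the version idea could be sketched, independent of whatever Dagster itself records: look up the installed version of each team's package at run time and attach the result to the run as tags or log output. This uses only the standard library's `importlib.metadata`; the function name `package_version_tags`, the `version/` tag prefix, and the package names are assumptions for illustration, not Dagster features.

```python
# Minimal sketch: collect installed package versions so a pipeline run
# can be traced back to the exact releases it used.
# Assumption: the package names passed in are hypothetical examples.
from importlib.metadata import PackageNotFoundError, version


def package_version_tags(packages):
    """Map each package name to its installed version, as string tags."""
    tags = {}
    for name in packages:
        try:
            tags[f"version/{name}"] = version(name)
        except PackageNotFoundError:
            # Package is not installed in this environment.
            tags[f"version/{name}"] = "not-installed"
    return tags


if __name__ == "__main__":
    print(package_version_tags(["core_etl", "ds_metrics"]))
```

The pinning itself would more naturally live in a requirements or lock file for the environment, with the workspace only naming which packages to load.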