Hello dagster team! We’ve been researching frameworks that support automating data pipelines. Our warehouse is Snowflake, and our current pipelines are written in SchemaChange, which doesn’t scale well from a coordination perspective once you start adding multiple developers.
Our plan is to start migrating to dbt core, and among the frameworks we looked at, dagster’s integration of dbt models as software-defined assets was very impressive. We also want a managed solution, so we’re looking at dagster cloud. I signed up for the 30-day trial, and because I couldn’t find a published set of IPs for serverless (we whitelist access to our Snowflake instances), I opted for a Hybrid configuration. It was pretty painless to set up a Python virtual environment on my Mac and kick off a local agent.
For the next step, we have other folks in the company who know their way around k8s, and we have well-established EKS clusters that live within whitelisted VPCs, so I assume we could deploy our dagster agents to them. However, this brings up an issue that I’m not quite clear about, and I want to be sure it’s supported.
We have 5 distinct long-running environments: dev, qa, stage, prod-us, and prod-eu. Each environment would be running some version of the same code, deployed from the master branch. We need separate environments because other services around this data may have different release lifecycles.
From what I’ve read, I believe dagster cloud can support this but requires an Enterprise license. I’ve been trying to understand how this would work, as there would be daily/hourly incremental dbt jobs that need to run in each of these environments. It’s a little more challenging because I have zero practical experience with dagster, hence this very long post. Is what we need doable? Also, if we wanted to start off with serverless, would that be doable as well, assuming there is a range of static IPs that we could whitelist? I’m aware from an earlier post that there currently isn’t support in the EU and that it’s being considered.
One last question for now: if it is possible to establish distinct long-running environments, does dagster cloud support role-based access, so that users can be given access to only certain environments?