# random
m
Deploying Dagster with Helm
Hi all, I’ve been playing with Dagit locally and I’m beginning to feel productive with it. When it comes time to deploy my code I want to be sure it can handle the work given to it. I talked with a coworker about the best way to do this and he recommended K8s instead of ECS because it’s easier to stick with one or the other and K8s might end up being easier to provision+scale. I’ve never used Kubernetes before but I understand the complexity it comes with. We’re a small team here, it would only really be myself hacking away at EKS. What would you all recommend for me to watch/read/code to help understand Kube+AWS? I’m pretty good with infra in general now and I can get by with ECS, but I want to dig deeper. I want to know all I need to know before diving into this: https://docs.dagster.io/deployment/guides/kubernetes/deploying-with-helm
s
I like Kubernetes quite a bit, although I'm not sure it would be any easier. Have you looked into doing Dagster Cloud serverless for a bit? Going to brain dump what I recommend:
1. The hardest part IMO is standing up the infra (including networking). If you can get someone in infra to stand up the cluster for you, that will simplify your life.
2. For permissions, the pods that are spun up by the Dagster agent need access to the container registry, S3 resources, etc. The way we do this is that we have an AWS IRSA role that we can add permissions to. A Kubernetes service account can then assume this role, and we just add permissions to the role when we need a new bucket, etc.
3. For Helm, I just have the helm-prod-values.yml and helm-stage-values.yml files in a git repo and manually run them when I need to update the agent.
4. I would recommend against auto-scaling at first. This will prevent surprise errors where jobs are evicted due to scaling the cluster up/down, especially as you start.
5. Would recommend connecting into some observability tooling like Datadog to monitor the cluster health.
After writing this, I wonder if ECS is your better bet 😂
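To make point 2 more concrete, here is a minimal sketch of the IAM trust policy that lets a single Kubernetes service account assume an IRSA role. The account ID, OIDC provider path, namespace, and service account name are all placeholders, not values from this thread, and the helper name `irsa_trust_policy` is made up for illustration.

```python
import json


def irsa_trust_policy(account_id, oidc_provider, namespace, service_account):
    """Build a trust policy allowing one Kubernetes service account to assume a role.

    oidc_provider is the cluster's OIDC issuer without the https:// prefix.
    """
    provider_arn = f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
    subject = f"system:serviceaccount:{namespace}:{service_account}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Federated": provider_arn},
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Restrict the role to exactly one service account.
                        f"{oidc_provider}:sub": subject,
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    policy = irsa_trust_policy(
        account_id="123456789012",
        oidc_provider="oidc.eks.ca-central-1.amazonaws.com/id/EXAMPLE",
        namespace="dagster",
        service_account="dagster-agent",
    )
    print(json.dumps(policy, indent=2))
```

The service account is then typically annotated with `eks.amazonaws.com/role-arn` pointing at the role, and new buckets or other resources are granted by attaching permissions to that role rather than touching the pods.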
z
ya if you're not using kubernetes for anything else I'd just use ECS. it really has no problem scaling, it can handle hundreds or thousands of parallel workers. I would advise against prematurely jumping to kubernetes unless you know there's something about ECS that doesn't work for your use case (or you have other tools / plan to have other tools using kubernetes). my team has put hundreds of thousands of dagster job hours in using the ECS run launcher with few issues, and it took like an hour to set up. and you wouldn't really lose anything by switching to kube later if really needed
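For context on the ECS run launcher mentioned above, selecting it is a small change to the Dagster instance config. This is a minimal sketch that renders the relevant `dagster.yaml` fragment, assuming the `dagster-aws` package is installed alongside Dagster; the helper name is hypothetical, and a real deployment will likely need additional settings not shown here.

```python
def ecs_run_launcher_config() -> str:
    """Return the dagster.yaml fragment that selects the ECS run launcher."""
    lines = [
        "run_launcher:",
        "  module: dagster_aws.ecs",
        "  class: EcsRunLauncher",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the fragment so it can be pasted into dagster.yaml.
    print(ecs_run_launcher_config(), end="")
```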
m
Dagster Cloud serverless for a bit?
I would love to use it but with the sensitivity of our data and compliance reasons all of our data must stay in Canada + on our own servers. I do lean towards Serverless most times, for my own sanity
I would advise against prematurely jumping to kubernetes
I agree, this and other things kinda changed my mind about the whole thing.
After writing this, I wonder if ECS is your better bet
Yeah, EC2 or ECS are quickly becoming the solution that makes the most sense here 😆 Thanks guys!
s
@Mitchell Hynes it might also be interesting for you to consider Dagster Cloud Hybrid, which is sort of a middle ground. Dagster Cloud deals with a lot of the long-running infra that is especially annoying to manage (log storage, metadata DB, RBAC), but you run an agent (in ECS most likely) so that the compute and data stay within your infosec team's guardrails. I'd be happy to chat about this