# ask-community
w
When you move from an older version of Dagster to a newer one, how do you make sure that the Dagster database tables and columns are also upgraded? Is there a SQL script to run?
r
We have a guide for our Kubernetes deployments: https://docs.dagster.io/deployment/guides/kubernetes/how-to-migrate-your-instance If you want to run this manually, you can run `dagster instance migrate` in an environment that has access to a `dagster.yaml`
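For a plain (non-Kubernetes) deployment, the manual sequence boils down to pointing Dagster at the directory containing your `dagster.yaml` and running the migration. A minimal sketch (the path is a placeholder for wherever your instance directory lives):

```shell
# Point Dagster at the directory that holds dagster.yaml
# (/opt/dagster/dagster_home is a placeholder path)
export DAGSTER_HOME=/opt/dagster/dagster_home

# Back up your database first, then apply any pending
# schema migrations for the installed Dagster version
dagster instance migrate
```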
w
Ok, that makes sense. It's a bit strange to me that this doc exists for Kubernetes but not for a normal deployment, which I assume more people use Dagster with.
Wait, where do I specify which version I want to migrate to?
r
You don’t specify a version - once you upgrade your `dagster` Python package (e.g. from 0.13.0 to 0.14.0), the migration information for that version is included with it. `dagster instance migrate` will ensure that any missing migrations for your tables are applied.
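So the upgrade workflow described above is just two steps: upgrade the package, then migrate. A sketch, assuming pip and the 0.14.0 release mentioned earlier (pin whatever version you are actually targeting):

```shell
# Upgrade the Dagster packages to the target release
pip install --upgrade 'dagster==0.14.0' 'dagit==0.14.0'

# Apply any schema migrations that ship with the new version
dagster instance migrate
```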
w
Ah, I see. Thank you!
r
re: your other comment, migrations are usually only run on production deployments, and our Kubernetes deployment is our recommended way for doing this.
w
You recommend running production Dagster only on Kubernetes? Any negatives if I just run it within a plain EC2 server?
r
ah, misspoke, it’s one of our recommended ways. our ecs deployment is pretty stable as well, you can check out #dagster-ecs for that. cc @jordan for the nuances around EC2
j
Echoing Rex, you can really deploy Dagster just about anywhere, but our recommendation is to use some kind of container orchestration system. This is because production deployments of Dagster typically require a variety of long-lived daemonized processes, so you’ll benefit from the reproducibility/deployability/keep-alive/fault tolerance/scalability benefits that the various systems provide. Our K8s, ECS, and Docker deployments are all well supported: https://docs.dagster.io/deployment#hands-on-guides-to-deploying-dagster You could run the Docker deployment directly on an EC2 instance if you wanted to. And of course, if your project doesn’t require the complexity tradeoff, you can run things in process on an EC2 instance as well.
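The "Docker deployment on an EC2 instance" option mentioned above can be sketched as a small Compose file. This follows the general shape of Dagster's docker example; the image name and workspace file are placeholders, and your `dagster.yaml` would need to point at the Postgres service for run/event storage:

```yaml
# Sketch of a single-host Docker Compose deployment.
# "my-dagster-image" is a placeholder for an image with your
# code, dagster, and dagit installed, plus dagster.yaml and
# workspace.yaml baked in.
version: "3.7"
services:
  postgres:
    image: postgres:11
    environment:
      POSTGRES_USER: dagster
      POSTGRES_PASSWORD: dagster
      POSTGRES_DB: dagster
  dagit:
    image: my-dagster-image
    command: dagit -h 0.0.0.0 -p 3000 -w workspace.yaml
    ports:
      - "3000:3000"
    depends_on:
      - postgres
  daemon:
    image: my-dagster-image
    command: dagster-daemon run
    depends_on:
      - postgres
```

See the complete, working version in the examples repo linked below before relying on this layout.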
w
@jordan thank you for the info. I see that the k8s deployment is kinda popular around here. What is the specific advantage of using k8s in terms of managing long-lived processes? Is it because you can scale horizontally, or because it is good at keeping those processes alive? Related question: what about Dagit? Do people run multiple Dagit instances 1:1 with dagster-daemon even on k8s?
j
All of the above and then some - but I wouldn’t over-index on K8s if it’s not something you’re already using. Our endorsement is more for containerization in general than it is for any particular container orchestrator - K8s just happens to be the one that we built out support for first. Historically, a lot of our functionality was implemented first for K8s and only later for Docker and ECS. There’s increasing parity in our support for the different tools though, and we have users successfully deploying Dagster on all of them. All 3 have examples in https://github.com/dagster-io/dagster/tree/master/examples if you’d like to play around with any. The Docker Compose one will require the least additional legwork to get up and running.

Re: Dagit, it really depends on what your load is going to be. If you’re a pretty small team using Dagster, then a single Dagit instance is probably sufficient. If you have lots of users interacting with Dagit, you might find it useful to horizontally scale it so that there are many Dagit instances.

And one more thing I’ll mention is that we offer a managed product (Dagster Cloud) which dramatically reduces the footprint of what you need to deploy (we’ll handle Dagit, the daemons, the database, migrations, etc.). If you’re at the stage of evaluating different deployment options, it might be something worth looking into: https://dagster.io/cloud
w
@jordan What kind of pricing does Dagster Cloud have? And does it mean that all of our Python code would be in the cloud? How do we code those? And what about versioning (we use Bitbucket git currently)? Secondly, you mention horizontally scaling Dagit. What does that mean? My immediate need is for a single Dagit to be able to monitor/orchestrate multiple repositories across different servers. Is this even possible?
s
Hi Will! I can schedule a call to talk through some of these questions. Your code and your data will still reside in your infrastructure, and you can integrate CI/CD into your Bitbucket flows to deploy changes.
I'll DM you to schedule a call.