# deployment-kubernetes
k
Hey, I'm having some trouble with my dagster k8s instance and preemptible nodes. The dagster rabbitmq goes down every so often when we lose a node and the node affinity for the pv doesn't match. Currently we have an issue where the pv expects a label of `us-east-1a` and all the nodes are labeled `us-east-1b` or `us-east-1c`. Do I need to unpack the included rabbitmq and manually modify the statefulset or is there an easy way to facilitate this change through the top level helm chart?
j
Hi Kirk, this is a surprise to me- us-east-1a is the default in the rabbit-mq helm chart?
k
I don't see anything specifically set there, so I'm thinking zone is assumed based off of the node it gets assigned to. I don't really see any way to configure it at all. The statefulset deployment has nothing really related to it past the PVC which just sets the access mode of `ReadWriteOnce`. I'm not quite sure how that passes config to the PV
j
Hmm- unfortunately I haven’t seen this before. Happy to look into their helm chart a bit but at present all I have are workarounds:
• Internally we use a cloud-hosted redis outside of our kubernetes cluster
• We have a `k8s_job_executor` that just landed that offers pod-level isolation without celery (and no rabbitmq). Currently in experimental, would love to get feedback on it
k
Oh really? I'd be very interested in the k8s_job_executor. Can I use that in conjunction with celery_k8s or do I need to specify that? My use case there is I have some current k8s cron jobs that I'd like to have dagster orchestrate
j
What would using it in conjunction with celery_k8s mean?
k
I might be getting it confused - my frame of reference was Airflow with its K8s Job Operator, which allowed you to execute arbitrary k8s jobs. I'm looking for something similar in dagster. This is completely tangential.
j
Gotcha- yeah our terminology can be a little confusing when you go back and forth
k
Related to the k8s_job_executor - do I still set envSecrets and everything else the same for it? I assume it still launches a version of the same user code repository or a set image? If there's documentation that works too
j
Yep! Historically we’ve offered two k8s deployments:
• dagster-k8s, where we run an entire pipeline inside a single k8s job
• dagster-celery-k8s, where we use celery to run each solid of the pipeline within its own k8s job
We’re now adding a feature to dagster-k8s that optionally enables running each solid within its own k8s job, so that most users don’t have to bring in the overhead of celery unless there’s specific features they need
k
Will that include pooling or some sort of access limit?
j
At present we only support run-level concurrency limits, one feature gap compared to celery which enables limits per solid
The new isolation mode in dagster-k8s is coming out though and we’re still working on getting it documented.
Re: "My use case there is I have some current k8s cron jobs that I’d like to have dagster orchestrate"
Dagster has a native scheduler that can kick off pipeline runs on a cron schedule, and you can use whichever deployment (dagster-k8s, dagster-celery-k8s) to run the pipeline
n
PV topology stuff is not related to Dagster or Rabbit per se
It's a general AWS issue
k
Re: "Dagster has a native scheduler that can kick off pipeline runs on a cron schedule, and you can use whichever deployment (dagster-k8s, dagster-celery-k8s) to run the pipeline"
I'm doing a poor job of explaining it, but basically I was hoping for some type of way to have dagster run an arbitrary k8s job. But the more I think about it, the more obvious it seems that I need some sort of k8s solid that would just contain a script to do something like `kubectl create job ...`
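A rough sketch of what the body of such a "k8s solid" could look like, assuming the existing cron jobs are Kubernetes CronJob objects and that `kubectl` is on the path with access to the cluster. The helper names (`build_create_job_cmd`, `run_k8s_job`) and the example job names are hypothetical, not part of Dagster; the function could be called from inside a solid.

```python
import shlex
import subprocess
from typing import List, Optional


def build_create_job_cmd(
    job_name: str, from_cronjob: str, namespace: Optional[str] = None
) -> List[str]:
    """Build the argv for `kubectl create job <name> --from=cronjob/<cronjob>`."""
    cmd = ["kubectl", "create", "job", job_name, f"--from=cronjob/{from_cronjob}"]
    if namespace:
        cmd += ["--namespace", namespace]
    return cmd


def run_k8s_job(
    job_name: str,
    from_cronjob: str,
    namespace: Optional[str] = None,
    dry_run: bool = True,
) -> str:
    """Create a one-off Job from an existing CronJob.

    With dry_run=True (the default here) nothing is executed and the command
    line is returned as a string, so the sketch is safe to try without a cluster.
    """
    cmd = build_create_job_cmd(job_name, from_cronjob, namespace)
    if dry_run:
        return shlex.join(cmd)
    # check=True raises CalledProcessError if kubectl exits non-zero.
    completed = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return completed.stdout


# Hypothetical names, for illustration only.
print(run_k8s_job("nightly-export-manual", "nightly-export", namespace="dev"))
```

This only submits the job; waiting for it to finish and collecting logs would need extra handling that is left out here.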
Any idea where I should go then? I don't manage the cluster unfortunately, and I'm still somewhat new to all of this. Is there some default for how PVs get configured when you create a cluster?
n
Sorry, wasn't watching this thread, so okay
First question, EKS?
(or Kops, homebrew, something else)
The basic issue is probably that you have a single autoscaling group across multiple AZs which means the AZ for scale up is picked at random
K8s is aware of the volume topology, but cluster-autoscaler is much dumber
And it's just a hard limitation on EBS volumes that they can't span AZs
So the general recommendation is to make separate ASGs for each AZ you operate in
k
All good and yes it's EKS. I believe you're right on that. We use spot instances, and I think it's just whatever is available in the us-east region
n
and enable `balance-similar-node-groups=true`
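As a sketch of that layout, assuming one spot node group per AZ: the dictionary fields and the `plan_node_groups` helper below are hypothetical and do not map to any particular tool's schema; only the `--balance-similar-node-groups` cluster-autoscaler flag is real.

```python
from typing import Dict, List


def plan_node_groups(
    cluster: str, zones: List[str], max_size: int
) -> List[Dict[str, object]]:
    """Return one node-group description per availability zone.

    Each group is pinned to exactly one zone, so a scale-up of that group
    always lands in the zone where an existing EBS volume lives.
    """
    return [
        {
            "name": f"{cluster}-spot-{zone}",
            "availability_zones": [zone],  # exactly one AZ per group
            "min_size": 0,
            "max_size": max_size,
        }
        for zone in zones
    ]


# Flag passed to cluster-autoscaler so the per-AZ groups are kept balanced.
AUTOSCALER_ARGS = ["--balance-similar-node-groups=true"]

for group in plan_node_groups("dev", ["us-east-1a", "us-east-1b", "us-east-1c"], 3):
    print(group["name"], group["availability_zones"])
```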
k
So even creating our own EBS volumes and associating them with the cluster wouldn't work right? Or is that fine but we might deal with some lag since it would be in a different AZ?
n
You literally can never use an EBS volume across AZs, the attach API only allows matching volume AZ and instance AZ
😞
(otherwise the failure of one AZ would impact something in another)
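The constraint can be illustrated with a small check, assuming nodes carry the standard `topology.kubernetes.io/zone` label. The node names and the `attachable_nodes` helper are made up for illustration; the zones mirror the situation described at the top of the thread.

```python
from typing import Dict, List

# Well-known Kubernetes node label that holds the availability zone.
ZONE_LABEL = "topology.kubernetes.io/zone"


def attachable_nodes(volume_zone: str, nodes: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the names of nodes whose zone label matches the volume's zone."""
    return sorted(
        name
        for name, labels in nodes.items()
        if labels.get(ZONE_LABEL) == volume_zone
    )


# Hypothetical cluster state: the volume lives in us-east-1a,
# but the replacement nodes came up in 1b and 1c.
nodes = {
    "node-a": {ZONE_LABEL: "us-east-1b"},
    "node-b": {ZONE_LABEL: "us-east-1c"},
}

print(attachable_nodes("us-east-1a", nodes))  # [] -> the pod stays unschedulable
print(attachable_nodes("us-east-1b", nodes))  # ['node-a']
```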
If you want cross-AZ volumes you can use EFS (aka hosted NFS)
But just be aware of the limitations that NFS brings to the party
k
It seems like if we're just using spot instances for our nodes that setting up ASGs would be overkill, as we'd basically just want dedicated nodes right? I'll look into EFS
n
No? Not sure what you mean
When running K8s on AWS, it's all in ASGs 🙂
That's just how cluster-autoscaler works
You just have some with a launchconfig set to use spot
The ASGs themselves don't do any scaling, they are just the only way in AWS to say "give me X instances that look like this"
k
Well I guess I mean since we just bid on whatever instance is available in a given region - it doesn't make sense for us to try and scale/add compute to a given AZ
n
Sure, if you want the broadest possible bidding pool then design wise EBS volumes are very limiting
Spot prices are uuuuusually similar across AZs in a single region but not always
k
AFAIK that's what we're working with. This is all being done on our dev/staging cluster so I think it's just the cheapest we can get it
At least every time our dagster app goes down we have brand new nodes that are in different AZs. I believe if we keep one in the same zone it's okay. Actually it should be since that's what we're discussing
n
On that I defer to you 🙂
k
Thanks for your help with this. This was massively helpful. I also think EFS will work for this case
n
The usual rule of NFS is "it will work but you'll still regret it eventually" 😉