# deployment-kubernetes
j
Hey y'all. Been running dagster in a single GKE (Google Kubernetes Engine) cluster managed by their "Autopilot" service. This is for a small business that can run its staging and production in the same cluster, in separate namespaces. The business owner and I have been reviewing costs, and we see there's a lot of excess CPU and memory reserved for the dagster resources. There's a "cost optimization" feature in GKE that shows what each workload has requested and how much it's actually using. Autopilot clusters have a minimum 0.5 vCPU and 2GiB request for containers in a pod. Since there are separate dagit, dagster daemon, and user code workloads, each of them consumes at least that much even though they're barely being used. Autopilot automatically scales up and down to control costs, but only above this minimum threshold. I'm wondering if you have any suggestions for how I can get the three dagster-related workloads to share resources better. Here's a screenshot of the cost utilization; I deleted the product-name section of the workloads for the owner's privacy and put a red square around the dagster-related workloads.
j
hey @Jayme Edwards have you looked at the values.yaml for the dagster helm chart? https://artifacthub.io/packages/helm/dagster/dagster?modal=values There are some settings there that should allow for tuning the mem/CPU of these things.
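For reference, the chart exposes a `resources` block for each of those components. A minimal sketch of values.yaml overrides, with illustrative numbers (newer chart versions rename the `dagit` key to `dagsterWebserver`):

```yaml
# values.yaml overrides -- numbers are illustrative, not recommendations
dagit:
  resources:
    requests:
      cpu: 250m
      memory: 512Mi
    limits:
      cpu: 500m
      memory: 1Gi

dagsterDaemon:
  resources:
    requests:
      cpu: 250m
      memory: 512Mi
    limits:
      cpu: 500m
      memory: 1Gi

dagster-user-deployments:
  deployments:
    - name: user-code   # hypothetical name; other required deployment fields omitted
      resources:
        requests:
          cpu: 250m
          memory: 512Mi
```

Autopilot will still round any request up to its per-pod minimum, though.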
j
Hey there. Yes, absolutely. The problem is I can't tune them down below the minimum that GKE Autopilot assigns to a workload. So I'm trying to figure out if I can get, say, dagit and the daemon to run in the same pod or something.
j
Ah gotcha. I'm not aware of anyone having done that in the past, but it might be possible with some fairly major modifications to the chart.
p
I run dagster in GKE Autopilot at very low cost. Minimum requests are actually 0.250 vCPU and 0.5GiB. Things you can do to cut down costs:
• run multiple containers in the same pod
• have your dagit + daemon containers contain your code so you don't need a code location server
• use spot pods (see the sketch after this list)
• use the k8s launcher to request resources on a per-job basis
• use the k8s executor to request resources on a per-step basis
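For the spot-pods item, a minimal sketch of values.yaml overrides, assuming the chart's per-component `nodeSelector` support. On Autopilot, the `gke-spot` selector is enough on its own; GKE adds the matching toleration automatically:

```yaml
# values.yaml overrides: schedule the always-on Dagster components
# onto GKE Autopilot spot pods via the gke-spot node selector
dagit:
  nodeSelector:
    cloud.google.com/gke-spot: "true"

dagsterDaemon:
  nodeSelector:
    cloud.google.com/gke-spot: "true"

dagster-user-deployments:
  deployments:
    - name: user-code   # hypothetical name; other required deployment fields omitted
      nodeSelector:
        cloud.google.com/gke-spot: "true"
```

For the last two items, run and step pods don't inherit these values; per-job and per-step requests are set with the `dagster-k8s/config` tag on a job or op, whose `container_config.resources` entry carries the requests/limits for that run or step, so only the workloads that actually need headroom ask for it.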