Charlie Bini

04/27/2022, 3:23 PM
helm noob question: if I want to adjust the agent resources, that goes under dagsterCloudAgent.resources on the helm chart, right? how exactly do I edit that and persist that setting across updates?
also will this setting apply to code locations as well, or is that elsewhere?
now for the second, I would think it would go under workspace but there isn't a resources property there

daniel

04/27/2022, 4:40 PM
Hey Charlie - we really should have that as an option under workspace, but we currently don't. We'll make sure that gets prioritized. In the meantime, this could work as a workaround (setting a default set of resources on the namespace that gets applied to any container that doesn't set its own): https://kubernetes.io/docs/tasks/administer-cluster/manage-resources/cpu-default-namespace/

Charlie Bini

04/27/2022, 6:14 PM
thanks!
the workaround doesn't seem to work for me, anything look off?
extraManifests:
  - apiVersion: v1
    kind: LimitRange
    metadata:
      name: default-container-resources
      namespace: dagster-cloud
    spec:
      limits:
        - default:
            cpu: "250m"
            memory: "512Mi"
            ephemeral-storage: "10Mi"
          defaultRequest:
            cpu: "250m"
            memory: "512Mi"
            ephemeral-storage: "10Mi"
          type: Container

daniel

04/27/2022, 6:47 PM
Do you see the LimitRange applied in your cluster (with kubectl get limitranges), and it's just not doing what you would expect?

Charlie Bini

04/27/2022, 7:16 PM
ok it's showing up on the cluster:
Name:       default-container-resources
Namespace:  dagster-cloud
Type        Resource           Min  Max  Default Request  Default Limit  Max Limit/Request Ratio
----        --------           ---  ---  ---------------  -------------  -----------------------
Container   memory             -    -    512Mi            512Mi          -
Container   cpu                -    -    250m             250m           -
Container   ephemeral-storage  -    -    10Mi             10Mi           -
but it doesn't affect the pod:
autopilot.gke.io/resource-adjustment: {"input":{"containers":[{"name":"dagster"}]},"output":{"containers":[{"limits":{"cpu":"500m","ephemeral-storage":"1Gi","memory":"2Gi"},"requests":{"cpu":"500m","ephemeral-storage":"1Gi","memory":"2Gi"},"name":"dagster"}]},"modified":true}
it DOES work for the Agent when I set the dagsterCloudAgent.resources however
the workspace pod's YAML specifies other limits as well:
resources:
  limits:
    cpu: 500m
    ephemeral-storage: 1Gi
    memory: 2Gi
  requests:
    cpu: 500m
    ephemeral-storage: 1Gi
    memory: 2Gi
wondering if this is an autopilot thing
don't think so actually, even if I increase the resource limit and request on the YAML, it still goes to the defaults above
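For reference, here is a minimal sketch of what the agent setting mentioned above might look like in a Helm values file. The dagsterCloudAgent.resources key and the dagster-cloud namespace come from this thread. The file name, the resource figures, and the placeholders in the trailing comment are assumptions to adapt to your own install, and the layout assumes the key takes a standard Kubernetes requests/limits block.

```yaml
# values.yaml (hypothetical file name). Keeping it in version control and
# passing it on every upgrade is one way to persist the setting.
dagsterCloudAgent:
  resources:
    requests:
      cpu: 250m        # illustrative figures, not recommendations
      memory: 512Mi
    limits:
      cpu: 250m
      memory: 512Mi

# Re-apply the same file whenever the chart is upgraded, for example:
#   helm upgrade <release name> <chart> --namespace dagster-cloud --values values.yaml
```

Passing the file explicitly on each upgrade avoids relying on whatever values the previous release happened to carry.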

daniel

04/27/2022, 8:29 PM
I'll see if I can get this working myself - there may be something about how the agent sets up the code locations where it's injecting something custom and ignoring the default

Charlie Bini

04/27/2022, 8:37 PM
cool thanks for looking into it

daniel

04/27/2022, 9:26 PM
@Charlie Bini just to make sure i'm checking the same thing - what are you running to fetch those limits from the workspace pod that you posted earlier?

Charlie Bini

04/27/2022, 9:28 PM
kubectl describe limitranges <name>
You might need to add the namespace too if you're not using the default

daniel

04/27/2022, 9:30 PM
when you said earlier though that "the workspace pod's YAML specifies other limits" - that was by running 'kubectl describe pod <pod name>'?

Charlie Bini

04/27/2022, 9:31 PM
Ah that I just copied from the UI
On GKE it’s a tab under the workload for the service

daniel

04/27/2022, 9:43 PM
I think there may be at least some GCP stuff going on here, because with the default helm chart running in our dogfooding EKS cluster, I don't see any resource limits being applied to either the agent pod or to the code location pods
And I've seen some other users of open-source dagster reporting Autopilot doing some strange things wrt resource requests
we should be able to add this to cloud pretty quickly though

Charlie Bini

04/28/2022, 12:02 AM
Cool, not super urgent but ultimately a feature I'd like to take advantage of. Makes a big difference in pricing if I can rightsize those pods

daniel

04/28/2022, 2:56 PM
We were able to get this into the release going out later today actually - here's the relevant changelog entry:
• When using the Kubernetes agent, you can now supply a resources key under workspace that will apply resource requirements to any pods launched by the agent. For example:
workspace:
  resources:
    limits:
      cpu: 100m
      memory: 128Mi
    requests:
      cpu: 100m
      memory: 128Mi
You can also vary the resource requirements for pods from different code locations by setting config in the Workspace tab. For example:
location_name: test-location
image: dagster/dagster-cloud-template:latest
code_source:
  package_name: dagster_cloud_template
container_context:
  k8s:
    resources:
      limits:
        cpu: 100m
        memory: 128Mi
      requests:
        cpu: 100m
        memory: 128Mi

Charlie Bini

04/28/2022, 3:22 PM
very cool
so the 1st example is on the helm chart and affects the code location pod, and the 2nd is on the locations yaml and affects the job run pods?

daniel

04/28/2022, 3:26 PM
both affect gRPC servers and launched jobs (all pods launched by the agent, basically). The first is the way to set it for all code locations, the second is the way to vary it for individual code locations
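To make the two levels concrete, here is a sketch that places them side by side: a small default in the Helm values for every code location, and a larger allocation for one location. The location name heavy-location and all resource figures are hypothetical. The thread does not spell out how the two settings combine when both are present, so treat the per-location behavior as something to verify on your own cluster.

```yaml
# Helm values file: default applied to all pods launched by the agent
workspace:
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 100m
      memory: 128Mi
---
# Code location config (Workspace tab): values for this one location only
location_name: heavy-location          # hypothetical name
image: dagster/dagster-cloud-template:latest
code_source:
  package_name: dagster_cloud_template
container_context:
  k8s:
    resources:
      requests:
        cpu: 500m                      # illustrative figures
        memory: 1Gi
      limits:
        cpu: 500m
        memory: 1Gi
```

The two documents live in different places (the first in the chart values, the second in the location config), so they are separated here only for readability.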

Charlie Bini

04/28/2022, 3:27 PM
even better!