Hi, I created a tag for customizing the resources ...
# deployment-kubernetes
e
Hi, I created a tag for customizing the resources when a new pod spins up:
tags={
        'dagster-k8s/config': {
            'container_config': {
                'resources': {
                    'requests': {'cpu': '4000m', 'memory': '19073Mi'},
                    'limits': {'cpu': '6000m', 'memory': '23841Mi'},
                }
            },
but I see that the pod that spins up failed after 4 minutes due to OOMKilled, and if I describe the pod I see this resources config:
    Limits:
      cpu:     500m
      memory:  2560Mi
    Requests:
      cpu:     250m
      memory:  64Mi
do you know why it doesn’t work as expected?
????
d
(Hi, re: your followup comment, to set expectations about response latency - most of the core team is usually working during business hours PST-EST. We'll try to get to most questions within 24 hours, often but not always sooner) Did you set the tags on the Dagster job or on a Dagster op? If you're using the K8sRunLauncher, the tags would need to be on the Dagster job
e
it’s on ops
what do you mean on dagster job?
d
the example here includes tags on a Dagster op and tags on a Dagster job: https://docs.dagster.io/deployment/guides/kubernetes/customizing-your-deployment#job-or-op-kubernetes-configuration Putting tags on an op will only change things if you're using something that runs each op in its own Kubernetes pod, like the k8s_job_executor. If you want the resource limits to apply to the whole run, it needs to go on the Dagster job like in the @job decorator in that example in the docs
e
ohh ok
so the resources under job() is necessary
so why do I need the resources on the ops?
d
you don't need them on the ops unless you're using the k8s_job_executor. we'll update the docs there to be more clear
e
so in my case I need them just on the job?
d
Exactly yeah
e
I need to move the same tag to the job?
d
just like in the example I posted, yeah
@job(
  tags = {
    'dagster-k8s/config': {
      'container_config': {
        'resources': {
          'requests': { 'cpu': '200m', 'memory': '32Mi' },
        }
      },
    }
  }
)
e
but there is no limits
how does it make sense
d
You can put limits there too - put exactly the tags you were using before, but on the job instead of the op
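For illustration, here is a minimal sketch of what moving the original tag onto the job could look like. The helper `k8s_resource_tags` and the job name `my_job` are hypothetical names, not part of Dagster; the tag keys and resource values are the ones quoted earlier in this thread. The `@job` usage is left as a comment so the snippet runs without Dagster installed.

```python
def k8s_resource_tags(requests, limits):
    """Build a 'dagster-k8s/config' tag dict (hypothetical helper, not a Dagster API)."""
    return {
        "dagster-k8s/config": {
            "container_config": {
                "resources": {"requests": requests, "limits": limits},
            },
        }
    }


# Same values that were previously set on the op.
RUN_TAGS = k8s_resource_tags(
    requests={"cpu": "4000m", "memory": "19073Mi"},
    limits={"cpu": "6000m", "memory": "23841Mi"},
)

# With the K8sRunLauncher, the tags go on the job rather than on the op:
#
#   from dagster import job
#
#   @job(tags=RUN_TAGS)
#   def my_job():  # hypothetical job name
#       ...
```

Any other keys already under `dagster-k8s/config` would need to be merged into the same dict rather than replaced.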
e
ok got it
thank you!
ok it doesn’t work for job as well… @daniel
d
Is the tag being applied to the run?
e
what tag?
d
The dagster-k8s/config tag that you applied to the dagster job
e
it’s working, because I already have nodeAffinity under the tag, which already works, but the resources section still doesn’t
d
Can you check that the tag isn't somehow getting overridden in the launchpad? I've seen an issue once in a while that we are working on fixing where I needed to edit the tags for the run in the launchpad to get a new tag to stick
I had to go in and edit the tags (to remove the previous value) using this dialog to get my new tag change to actually be applied
o
(from the same team as Eldan) is there a way to view the full tags assigned to a job? we clearly have some tags from the definition but can't see the complete values
d
That dialog I posted just above has them (from the "Add tags"/"Edit tags" button in the launchpad) - there's a "Tags from definition" section and a custom tags section
and then previously launched runs should have all the tags listed in the runs list
o
specifically, k8s tags don't show up well and can't be expanded, so I can't be sure what's there
d
yeah, that's good feedback that i'll pass along. It does look like your resource limits are being applied though, so I would expect them to be applied to the kubernetes pod for the Dagster job
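One way to double-check what actually landed on the pod is to compare the expected resources against the pod spec. The sketch below assumes the pod JSON comes from `kubectl get pod <pod-name> -o json`; the function name `resource_mismatches` and the container name `dagster` are hypothetical, and the sample values are the ones from the describe output earlier in this thread.

```python
import json

# Values set in the dagster-k8s/config tag.
EXPECTED = {
    "requests": {"cpu": "4000m", "memory": "19073Mi"},
    "limits": {"cpu": "6000m", "memory": "23841Mi"},
}


def resource_mismatches(expected, pod):
    """Return (container, kind, key, expected, actual) for every value that differs.

    This is a plain string comparison. Kubernetes may print an equivalent
    quantity in a different form, so treat a mismatch as a prompt to look closer.
    """
    mismatches = []
    for container in pod["spec"]["containers"]:
        actual = container.get("resources", {})
        for kind, wanted in expected.items():
            for key, value in wanted.items():
                found = actual.get(kind, {}).get(key)
                if found != value:
                    mismatches.append((container["name"], kind, key, value, found))
    return mismatches


# Stand-in for the output of `kubectl get pod <pod-name> -o json`.
SAMPLE_POD = json.loads("""
{
  "spec": {
    "containers": [
      {
        "name": "dagster",
        "resources": {
          "limits": {"cpu": "500m", "memory": "2560Mi"},
          "requests": {"cpu": "250m", "memory": "64Mi"}
        }
      }
    ]
  }
}
""")

for row in resource_mismatches(EXPECTED, SAMPLE_POD):
    print(row)
```

An empty result would suggest the tag values reached the pod as written.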
e
@daniel ok, after many changes I can now change the resources, but the job still fails with status OOMKilled after a few minutes. The describe output shows:
Warning  FailedMount  44s (x2 over 44s)  kubelet            MountVolume.SetUp failed for volume "dagster-instance" : object "default"/"dagster-instance" not registered
  Warning  FailedMount  44s (x2 over 44s)  kubelet            MountVolume.SetUp failed for volume "kube-api-access-6jc7c" : object "default"/"kube-root-ca.crt" not registered
can you guide me what can be the issue here?
d
I have a lot more expertise with the dagster-specific parts of the equation here (i.e. debugging why config isn't being set on the resulting k8s resources when it's set in dagster) than I do with how k8s responds to the pod configuration once it's set the way you expect