#ask-community

Daniel Gafni

07/19/2022, 2:00 PM
Hey guys! I'm trying to run a job in Kubernetes and need to assign a toleration to the op. I'm currently doing it with the attached code. However, no tolerations appear in the launched pods (and in my case they don't get scheduled on any of the nodes 😞). It seems like the tolerations are just ignored. Edit: yes, after inspecting the source code of dagster-k8s, it seems like they are ignored.
fyi @sandy I'm making a PR to add tolerations support

daniel

07/19/2022, 3:37 PM
Hi Daniel - I believe what you're doing here should work, but applying tags to ops will only work if you're using the k8s_job_executor to run each op in its own Kubernetes pod. What you probably want here is for the tags to go on the job/run that's produced. I'm checking with the team about the best way to do that.
I do like the spirit of that PR as well, although I'd probably want to get pod_spec_config / container_config / etc. on the K8sContainerContext object rather than just tolerations, so that the full range of k8s config is available. I don't think you should need it for what you want to do here though
I think what you may want here is to use define_asset_job and put the tags on the tags argument to that function, but I'm confirming with the team that that's correct.

sandy

07/19/2022, 3:43 PM
do you want these tags to apply to all the assets in your repository?

Daniel Gafni

07/19/2022, 3:53 PM
I am using k8s_executor, right. I want to have the ability to specify custom tags for different ops / assets. For example, in my case GPU-based assets (models) require special tolerations.

daniel

07/19/2022, 3:54 PM
When you say the k8s_executor, do you mean the k8s_job_executor? Where are you configuring that in your code?

Daniel Gafni

07/19/2022, 5:13 PM
I'm configuring this in the repo as the default executor. Yes, I mean k8s_job_executor

daniel

07/19/2022, 5:14 PM
Got it - in theory they should be included, along with any other pod_spec_config, here: https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-k8s/dagster_k8s/job.py#L682
oh wait! it looks like pod_spec_config is a sibling of dagster-k8s/config, it should be a child 🙂
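from dagster import asset
from kubernetes import client as k8s  # assumed source of the `k8s` alias used below


@asset(
    op_tags={
        "dagster-k8s/config": {
            "container_config": {
                "resources": {
                    "requests": {"memory": "64Gi"},
                    "limits": {"memory": "96Gi"},
                }
            },
            # pod_spec_config is nested under "dagster-k8s/config", not next to it
            "pod_spec_config": {
                "tolerations": [
                    k8s.V1Toleration(
                        key="node-role.sbermarket.tech/ml",
                        operator="Equal",
                        value="",
                        effect="NoSchedule",
                    ).to_dict()
                ]
            },
        },
    },
)
def gpu_model():  # hypothetical asset name; the original snippet showed only the decorator
    ...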
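The sibling-versus-child distinction can be sketched in plain Python, without importing dagster. This is only an illustration under stated assumptions: the helper `misplaced_sections` and the `sibling_tags` layout are hypothetical (the originally attached code is not shown in this thread), the toleration is written as a plain dict rather than a `V1Toleration`, and only the two section names mentioned here are checked.

```python
K8S_CONFIG_TAG = "dagster-k8s/config"

# Section names mentioned in this thread; assumed to be the ones worth checking.
K8S_SECTIONS = ("container_config", "pod_spec_config")


def misplaced_sections(op_tags: dict) -> list:
    """Return section names that sit next to the config tag instead of inside it."""
    return [name for name in K8S_SECTIONS if name in op_tags]


toleration = {
    "key": "node-role.sbermarket.tech/ml",
    "operator": "Equal",
    "value": "",
    "effect": "NoSchedule",
}

# Hypothetical reconstruction of the mistake: pod_spec_config is a sibling
# of "dagster-k8s/config", so it is not part of the k8s config tag at all.
sibling_tags = {
    K8S_CONFIG_TAG: {
        "container_config": {"resources": {"requests": {"memory": "64Gi"}}},
    },
    "pod_spec_config": {"tolerations": [toleration]},
}

# Corrected layout: pod_spec_config is a child of "dagster-k8s/config".
child_tags = {
    K8S_CONFIG_TAG: {
        "container_config": {"resources": {"requests": {"memory": "64Gi"}}},
        "pod_spec_config": {"tolerations": [toleration]},
    },
}

print(misplaced_sections(sibling_tags))
print(misplaced_sections(child_tags))
```

A check like this could run in a unit test over the tags passed to each asset, since a misplaced key is otherwise silently treated as an unrelated tag.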

Daniel Gafni

07/19/2022, 5:16 PM
Damn, I'm sorry I wasted your time lol

daniel

07/19/2022, 5:17 PM
no worries, we'd like to move to a better way of specifying this config than nested tags

Daniel Gafni

07/19/2022, 5:17 PM
Will try again when I get home
Yeah, using K8s python objects would help I guess
ok this works, thanks!
Another question: is it possible to override the image used for the op? Setting
op_tags={
    "dagster-k8s/config": {
        "container_config": {
            "image": "my-image",
        },
    },
}
doesn't do anything. The custom image has identical code but, unlike my default image, supports CUDA.

daniel

07/19/2022, 7:06 PM

Daniel Gafni

07/19/2022, 7:12 PM
I'll try again a bit later
ok nice, it's working