# dagster-plus
d
We are trying to execute a k8s_job_op (v1.0.8) where:
- the main k8s pod has a service account with access to pod status
- the pod that we generate from the primary one has another service account, with access to other resources but not to pod status.
Is this operation possible? Exception:
Exception: No pod names in job after it started

Stack Trace:
  File "/usr/python/__pypackages__/3.9/lib/dagster/_core/execution/plan/utils.py", line 47, in solid_execution_error_boundary
    yield
  File "/usr/python/__pypackages__/3.9/lib/dagster/_utils/__init__.py", line 430, in iterate_with_context
    next_output = next(iterator)
  File "/usr/python/__pypackages__/3.9/lib/dagster/_core/execution/plan/compute_generator.py", line 73, in _coerce_solid_compute_fn_to_iterator
    result = fn(context, **kwargs) if context_arg_provided else fn(**kwargs)
  File "/usr/python/__pypackages__/3.9/lib/dagster/_annotations.py", line 108, in inner
    return target(*args, **kwargs)
  File "/usr/python/__pypackages__/3.9/lib/dagster_k8s/ops/k8s_job_op.py", line 220, in k8s_job_op
    raise Exception("No pod names in job after it started")
Code:
from dagster import job
from dagster_k8s import k8s_job_op

my_k8s_op_1 = k8s_job_op.configured(
        {
            "image": "myrepo/test:test",
            "command": ["run_job.sh"],
            "args": [],
            "env_vars": ["ENVIRONMENT=stage"],
            "resources": {
                "requests": {
                    "memory": "512Mi",
                    "cpu": "500m"
                },
                "limits": {
                    "memory": "2048M",
                    "cpu": "1000m"
                }
            },
            "pod_spec_config": {"service_account_name": "my-custom-access"},
            "service_account_name": "my-dagster-cloud-agent",
        },
        name="my_test_job",
    )

@job()
def my_k8s_job_1():
    my_k8s_op_1()
d
Hi David - that sounds like it should work. It looks like the part that's failing is when it tries to pull the pod names from the Kubernetes job that it creates, so you might need to give the main service account the jobs/status permission as well, if it doesn't have it?
Here are the permissions we grant in our helm chart to Dagster pods in the role:
# Allow the Dagster service account to read and write Kubernetes jobs and pods.
rules:
  - apiGroups: ["batch"]
    resources: ["jobs", "jobs/status"]
    verbs: ["*"]
  # The empty arg "" corresponds to the core API group
  - apiGroups: [""]
    resources: ["pods", "pods/log", "pods/status"]
    verbs: ["*"]
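To make the rules above easier to reason about, here is a minimal Python sketch of how rules like these could be checked against the calls an op might make. It is a simplified model of RBAC matching (it ignores resourceNames, non-resource URLs, and aggregated roles), not how Kubernetes or dagster-k8s evaluates permissions. The names `DAGSTER_ROLE_RULES`, `is_allowed`, `missing_permissions`, and the list of needed permissions are hypothetical and only for illustration.

```python
from typing import Dict, List, Tuple

# The role excerpt above, rewritten as plain Python data (illustrative copy).
DAGSTER_ROLE_RULES: List[Dict[str, List[str]]] = [
    {"apiGroups": ["batch"], "resources": ["jobs", "jobs/status"], "verbs": ["*"]},
    {"apiGroups": [""], "resources": ["pods", "pods/log", "pods/status"], "verbs": ["*"]},
]

# (api_group, resource, verb) triples. This list is an assumption about what
# launching a job and reading its pods might need, not taken from dagster-k8s.
Permission = Tuple[str, str, str]
ASSUMED_NEEDED: List[Permission] = [
    ("batch", "jobs", "create"),
    ("batch", "jobs/status", "get"),
    ("", "pods", "list"),
    ("", "pods/log", "get"),
]


def _matches(allowed: List[str], wanted: str) -> bool:
    # "*" in a rule field matches any value; otherwise require an exact match.
    return "*" in allowed or wanted in allowed


def is_allowed(rules: List[Dict[str, List[str]]], api_group: str, resource: str, verb: str) -> bool:
    # A request is allowed if at least one rule matches group, resource, and verb.
    return any(
        _matches(rule.get("apiGroups", []), api_group)
        and _matches(rule.get("resources", []), resource)
        and _matches(rule.get("verbs", []), verb)
        for rule in rules
    )


def missing_permissions(rules: List[Dict[str, List[str]]], needed: List[Permission]) -> List[Permission]:
    # Return the needed permissions that no rule grants.
    return [perm for perm in needed if not is_allowed(rules, *perm)]


if __name__ == "__main__":
    # A role that can manage jobs but cannot read pods at all.
    jobs_only = [{"apiGroups": ["batch"], "resources": ["jobs"], "verbs": ["*"]}]
    print(missing_permissions(DAGSTER_ROLE_RULES, ASSUMED_NEEDED))
    print(missing_permissions(jobs_only, ASSUMED_NEEDED))
```

On a real cluster, `kubectl auth can-i list pods --as=system:serviceaccount:<namespace>:<service-account>` is the authoritative way to check what a given service account is allowed to do.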
d
thanks daniel!