Matt Menzenski
10/25/2022, 3:10 PM
Mark Fickett
10/25/2022, 4:56 PM
Juan Arrivillaga
10/25/2022, 10:30 PM
metadata:
  annotations:
    kubernetes.io/psp: eks.privileged
  creationTimestamp: "2022-10-25T16:41:57Z"
  generateName: dagster-step-6527edf0609f7ecea00021a01b2e8350-
  labels:
    app.kubernetes.io/component: step_worker
    app.kubernetes.io/instance: dagster
    app.kubernetes.io/name: dagster
    app.kubernetes.io/part-of: dagster
    app.kubernetes.io/version: 1.0.14
    controller-uid: e2588cee-1c16-44d9-ae70-cce7842a789d
    dagster/job: ASSET_JOB
    dagster/op: etl_finngen_gwas_sumstats.download_gwas_transform_upload_parque
    dagster/run-id: ff7fff0b-ffe8-4990-b40a-c0d1631fb399
    job-name: dagster-step-6527edf0609f7ecea00021a01b2e8350
  name: dagster-step-6527edf0609f7ecea00021a01b2e8350-tzfc8
  namespace: dagster
  ownerReferences:
  - apiVersion: batch/v1
    blockOwnerDeletion: true
    controller: true
    kind: Job
    name: dagster-step-6527edf0609f7ecea00021a01b2e8350
    uid: e2588cee-1c16-44d9-ae70-cce7842a789d
  resourceVersion: "204837656"
  uid: 347329df-ee12-427b-b6eb-e1e68a3ca8a8
spec:
  containers:
  - args:
    - dagster
    - api
    - execute_step
    env:
    - name: DAGSTER_EXECUTE_STEP_ARGS
      value: '{"__class__": "ExecuteStepArgs", "instance_ref": {"__class__": "InstanceRef",
        "compute_logs_data": {"__class__": "ConfigurableClassData", "class_name":
        "NoOpComputeLogManager", "config_yaml": "{}\n", "module_name": "dagster.core.storage.noop_compute_log_manager"},
        "custom_instance_class_data": null, "event_storage_data": {"__class__": "ConfigurableClassData",
        "class_name": "PostgresEventLogStorage", "config_yaml": "postgres_db:\n db_name:
        test\n hostname: dagster-postgresql\n params: {}\n password:\n env:
        DAGSTER_PG_PASSWORD\n port: 5432\n username: test\n", "module_name": "dagster_postgres.event_log"},
        "local_artifact_storage_data": {"__class__": "ConfigurableClassData", "class_name":
        "LocalArtifactStorage", "config_yaml": "base_dir: /opt/dagster/dagster_home\n",
        "module_name": "dagster.core.storage.root"}, "run_coordinator_data": {"__class__":
        "ConfigurableClassData", "class_name": "QueuedRunCoordinator", "config_yaml":
        "{}\n", "module_name": "dagster.core.run_coordinator"}, "run_launcher_data":
        {"__class__": "ConfigurableClassData", "class_name": "K8sRunLauncher", "config_yaml":
        "dagster_home: /opt/dagster/dagster_home\nimage_pull_policy: Always\ninstance_config_map:
        dagster-instance\njob_namespace: dagster\nload_incluster_config: true\npostgres_password_secret:
        dagster-postgresql-secret\nservice_account_name: dagster\n", "module_name":
        "dagster_k8s"}, "run_storage_data": {"__class__": "ConfigurableClassData",
        "class_name": "PostgresRunStorage", "config_yaml": "postgres_db:\n db_name:
        test\n hostname: dagster-postgresql\n params: {}\n password:\n env:
        DAGSTER_PG_PASSWORD\n port: 5432\n username: test\n", "module_name": "dagster_postgres.run_storage"},
Slackbot
10/25/2022, 10:36 PM
Mark Fickett
10/27/2022, 8:14 PM
fahad
10/27/2022, 9:16 PM
NoSuchKey for the upstream asset.
Mark Fickett
10/28/2022, 2:37 PM
Juan Arrivillaga
10/28/2022, 6:23 PM
"dagster-k8s/config" tag using dagster.AssetsDefinition.from_graph. I notice that this tag doesn't seem to get added to the run configuration. I want this asset materialization to run in a specific node group with specific resource requests. Will the job not pick up these tags?
Daniel Galea
11/02/2022, 1:24 PM
Brian Pohl
11/04/2022, 11:59 PM
pr-560-dagster, and it has a service account called pr-560-worker. there is another namespace, pr-100-dagster, with a similar service account.
i've configured my dagster job to use k8s_job_executor. the code i have is something like this:
@job(
    name='Market_Insights_ETL_and_Export',
    executor_def=k8s_job_executor,
    config={
        ...
        'execution': {'config': {'job_namespace': 'pr-560-dagster', 'service_account_name': 'pr-560-dagster-worker'}}
    }
    ...
)
and this shows up in Dagit as well (see first image).
but even with this config, the run gave me this error:
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch is forbidden: User \"system:serviceaccount:pr-100-dagster:pr-100-dagster-worker\" cannot create resource \"jobs\" in API group \"batch\" in the namespace \"pr-560-dagster\"","reason":"Forbidden","details":{"group":"batch","kind":"jobs"},"code":403}
it seems to have acknowledged the namespace that i passed it, but it's ignoring my service account specification and picking one arbitrarily. in the second and third images, you can see two runs of the same job with the same config - it picks different namespaces for them. it seems to alternate back and forth between each namespace.
if it's important at all, the particular op that i'm trying to execute is also a kubernetes job, so i'm using k8s_job_op to configure that. and yes, i've confirmed i'm passing pr-560-worker as the service account for that as well.
Eldan Hamdani
11/07/2022, 12:10 PM
0.13.4 to 0.14.15 and I got dagster-daemon pod in evicted status.
Do you know what can cause that issue?
Matt Menzenski
11/07/2022, 4:21 PM
Son Giang
11/09/2022, 8:52 AM
Blagoja Stojkoski
11/09/2022, 6:01 PM
Brian Pohl
11/18/2022, 6:10 PM
Stephen Bailey
11/21/2022, 4:41 PM
dagster_k8s_executor fall back to the default executor (multiprocess) when being run outside of an environment with K8sRunLauncher? I understandably get this error when trying to run locally.
dagster._core.errors.DagsterUnmetExecutorRequirementsError: This engine is only compatible with a K8sRunLauncher; configure the K8sRunLauncher on your instance to use it.
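One possible workaround for the question above, sketched under assumptions rather than taken from the thread: choose the executor when the job is defined, based on whether the process appears to be running inside a Kubernetes pod. The helper name pick_executor is hypothetical. The only signal it uses is the KUBERNETES_SERVICE_HOST environment variable, which Kubernetes sets in containers it starts; this does not prove that the Dagster instance is configured with a K8sRunLauncher, so treat it as a heuristic.

```python
import os
from typing import Mapping, Optional, TypeVar

T = TypeVar("T")


def pick_executor(
    k8s_executor: T,
    local_executor: T,
    env: Optional[Mapping[str, str]] = None,
) -> T:
    """Return k8s_executor when running inside a Kubernetes pod, else local_executor.

    pick_executor is a hypothetical helper, not part of Dagster.
    """
    env = os.environ if env is None else env
    # Kubernetes injects KUBERNETES_SERVICE_HOST into the containers it starts,
    # so a non-empty value is a reasonable (not foolproof) in-cluster signal.
    in_cluster = bool(env.get("KUBERNETES_SERVICE_HOST"))
    return k8s_executor if in_cluster else local_executor
```

With Dagster this might be used as @job(executor_def=pick_executor(k8s_job_executor, multiprocess_executor)), assuming k8s_job_executor is imported from dagster_k8s and multiprocess_executor from dagster. The choice is made once, when the module is imported, so a code location loaded outside the cluster would keep the local executor.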
Eegan K
11/21/2022, 8:26 PM
Mark Fickett
11/29/2022, 3:23 PM
Mark Fickett
12/02/2022, 5:51 PM
Executing step "my_op" in Kubernetes job dagster-step-89ab8bb9534b72ded96243830e8567be., but the pod name I need for kubectl is actually dagster-step-89ab8bb9534b72ded96243830e8567be-cz9ts with an extra suffix. Is there a way to get that suffix in the Dagster log message? It would be handy to be able to use that pod name directly in debugging commands.
Leo Xiong
12/10/2022, 2:06 AM
ttlSecondsAfterFinished for the K8s jobs cluster wide? instead of having to annotate each @job?
Pang Wu
12/13/2022, 5:44 PM
Definitions reloaded! keep popping up every ~20 seconds. Rolling back to 1.1.5 resolves the issue, any idea why?
Jaap Langemeijer
12/14/2022, 1:33 PM
executor_def like when using a @job annotation. I yield a RunRequest, where should I put the executor definition? Is this documented somewhere?
Mark Fickett
12/14/2022, 6:49 PM
@op cause a retry if the pod is OOMKilled? From the docs it looks like the retry is handling an exception from the op, and I'm not sure if that handling encapsulates the pod-termination check. I'm looking at an op that failed with "STEP_FAILURE Step my_step failed health check: Discovered failed Kubernetes job dagster-step-1b1e67c47da27410811b5606965fb1c4 for step my_step" and showed Status: OOMKilled in kubectl describe but ran fine on retry w/ plenty of RAM headroom as far as I can tell.
Mark Fickett
12/14/2022, 8:43 PM
Caio Tavares
12/15/2022, 7:52 PM
dagster-steps-* running on a pool of nodes which has more resources available.
Leo Xiong
12/20/2022, 5:34 AM
job_namespace value and Dagster seems to be reading the config correctly, but the jobs are still being created in the same namespace as the daemon is deployed. am I missing something?
relevant config section.
run_launcher:
  module: dagster_k8s
  class: K8sRunLauncher
  config:
    job_namespace: dagster-jobs
Mark Fickett
12/22/2022, 4:43 PM
Eegan K
12/22/2022, 7:25 PM
Sundara Moorthy
01/02/2023, 9:10 PM
Dusty Shapiro
01/09/2023, 6:22 PM