# deployment-kubernetes
v
Folks, the helm chart renders the runLauncher
run_launcher:      
  module: dagster_celery_k8s
  class: CeleryK8sRunLauncher
  config:
    dagster_home:
      env: DAGSTER_HOME
    instance_config_map:
      env: DAGSTER_K8S_INSTANCE_CONFIG_MAP
    postgres_password_secret:
      env: DAGSTER_K8S_PG_PASSWORD_SECRET
    broker: "pyamqp://test:test@dagster-rabbitmq:5672//"
    backend: "rpc://"
So it sets up the celery worker container, and that container has both DAGSTER_K8S_PG_PASSWORD_SECRET and DAGSTER_K8S_INSTANCE_CONFIG_MAP set. But when we launch a pipeline, the run container complains about both:

    Error 2: Post processing at path root:postgres_password_secret of original value {'env': 'DAGSTER_K8S_PG_PASSWORD_SECRET'} failed:
dagster.config.errors.PostProcessingError: You have attempted to fetch the environment variable "DAGSTER_K8S_PG_PASSWORD_SECRET" which is not set. In order for this execution to succeed it must be set in this environment.
This is really confusing. We are not setting up the environment of the run containers, and we do not even know where to do that.
We also do not see where in the helm chart this has to be set up.
j
Hi @Vishal Santoshi, this is an odd bit of config that should certainly be improved. Could you look at the result of
kubectl get configmaps
It will list a configmap named <name>-pipeline-env (defaulting to dagster-pipeline-env, but it's overridable). Then use that configmap in your pipeline run config (entered either in code via a PresetDefinition, or in the Dagit playground):
execution:
  celery-k8s:
    config:
      env_config_maps:
        - "<NAME>-pipeline-env"
This configmap contains the env vars that your container is missing.
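For anyone defining this in code rather than in the Dagit playground, the same run config can be written as a plain Python dict. This is only a minimal sketch: the configmap name below is the chart default mentioned above and is an assumption about your release, so substitute whatever `kubectl get configmaps` actually reports.

```python
import json

# Assumed name (the chart default); replace with the name from your cluster.
PIPELINE_ENV_CONFIGMAP = "dagster-pipeline-env"

# Run config for the celery-k8s executor, mirroring the YAML above.
run_config = {
    "execution": {
        "celery-k8s": {
            "config": {
                "env_config_maps": [PIPELINE_ENV_CONFIGMAP],
            }
        }
    }
}

# Print the structure for inspection; the dict itself is what would be
# supplied as the run config of a PresetDefinition.
print(json.dumps(run_config, indent=2))
```

Keeping the name in one constant makes it easier to change if the Helm release name (and therefore the configmap prefix) differs between environments.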
v
This did work, thank you. I am trying to build a mental model of how this works, which will help with solving the next issue. The steps fail, and it seems they are not able to push logs to S3 at the end of the step. I have set up
compute_logs:      
  module: dagster_aws.s3.compute_log_manager
  class: S3ComputeLogManager
  config:
    bucket: "xxxxxx-dev-null"
    prefix: "dagster-test-"
And it fails with an S3 credentials issue:
botocore.exceptions.NoCredentialsError: Unable to locate credentials
However, I want the run container to execute under an IAM role that allows writes to that S3 bucket.
Do I now set up the role through the run config too (configuring the executors is the right lingo, I think), and if yes, how? Something like
annotations = {
  "iam.amazonaws.com/role" = <aws_iam_role.dagster_poc.name>
}
I do not see any complete example of setting up the execution containers with the right configuration, config maps, annotations, etc. That said, the celery workers (which presumably launch these containers) have been set up with the required annotation that allows S3 access to the compute log bucket, and should arguably be propagating their setup to the containers they launch.
Annotations:  iam.amazonaws.com/role: dagster-poc-yyyyyyyyyyyy
In fact, we need this role to be available to all the run steps (for access to different resources of our stack), and thus to all pods executed from a celery worker. I would assume this issue is not restricted to just the compute logs.
Yep, similar issue when the io manager is set to s3_pickle_io_manager:
botocore.exceptions.NoCredentialsError: Unable to locate credentials
  File "/usr/local/lib/python3.8/site-packages/dagster/core/errors.py", line 184, in user_code_error_boundary
    yield
  File "/usr/local/lib/python3.8/site-packages/dagster/core/execution/resources_init.py", line 289, in single_resource_event_generator
    resource_def.resource_fn(context)
  File "/usr/local/lib/python3.8/site-packages/dagster_aws/s3/io_manager.py", line 114, in s3_pickle_io_manager
    pickled_io_manager = PickledObjectS3IOManager(s3_bucket, s3_session, s3_prefix=s3_prefix)
  File "/usr/local/lib/python3.8/site-packages/dagster_aws/s3/io_manager.py", line 17, in __init__
    self.s3.head_bucket(Bucket=self.bucket)
I want these pods to launch under an IAM role that allows access to the configured buckets.
So I tried this setup:
from dagster import solid

@solid(
    tags={
        'dagster-k8s/config': {
            'container_config': {
                'resources': {
                    'requests': {'cpu': '250m', 'memory': '64Mi'},
                    'limits': {'cpu': '500m', 'memory': '2560Mi'},
                },
            },
            'pod_template_spec_metadata': {
                'annotations': {'iam.amazonaws.com/role': 'dagster-poc-20211014204833791300000001'},
            },
        },
    },
)
def not_much():
    return
And the run pod did get the annotation:
Annotations:  iam.amazonaws.com/role: dagster-poc-20211014204833791300000001
              kubernetes.io/psp: eks.privileged
It still complains about missing credentials:
botocore.exceptions.NoCredentialsError: Unable to locate credentials

Stack Trace:
  File "/usr/local/lib/python3.8/site-packages/dagster/core/errors.py", line 184, in user_code_error_boundary
    yield
  File "/usr/local/lib/python3.8/site-packages/dagster/core/execution/resources_init.py", line 289, in single_resource_event_generator
    resource_def.resource_fn(context)
  File "/usr/local/lib/python3.8/site-packages/dagster_aws/s3/io_manager.py", line 114, in s3_pickle_io_manager
    pickled_io_manager = PickledObjectS3IOManager(s3_bucket, s3_session, s3_prefix=s3_prefix)
  File "/usr/local/lib/python3.8/site-packages/dagster_aws/s3/io_manager.py", line 17, in __init__
That actually makes sense. From the boto3 docs:
If you are running on Amazon EC2 and no credentials have been found by any of the providers above, Boto3 will try to load credentials from the instance metadata service. In order to take advantage of this feature, you must have specified an IAM role to use when you launched your EC2 instance.
I think I am missing where to specify that role within dagster set up.....
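Since boto3 walks a chain of credential providers before falling back to the instance metadata service, one quick way to narrow this down is to check, from inside the step container, whether the environment-variable provider has anything to work with. The sketch below uses only the standard library; the helper name `missing_aws_env_vars` is made up for illustration, and it assumes static credentials would arrive through the two standard AWS variables.

```python
import os

# Assumption: static credentials are supplied through these two standard
# environment variables. boto3 also supports other sources (shared config
# files, the instance metadata service), which this check does not cover.
REQUIRED_AWS_ENV_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_env_vars(env=None):
    """Return the names of required AWS variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_aws_env_vars()
    if missing:
        print("Missing AWS credential variables:", ", ".join(missing))
    else:
        print("Static AWS credentials are present in the environment.")
```

If both variables are reported missing and no role is reachable through the metadata service, a NoCredentialsError like the one above is the expected outcome.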
j
One option is to create a secret in your cluster with the AWS_ACCESS_KEY_ID etc. variables, then use env_secrets in run launcher config (or executor config, if it differs per run).
env_secrets (Optional[List[str]]): A list of custom Secret names from which to
            draw environment variables (using ``envFrom``) for the Job. Default: ``[]``. See:
            https://kubernetes.io/docs/tasks/inject-data-application/distribute-credentials-secure/#configure-all-key-value-pairs-in-a-secret-as-container-environment-variables
I’m assuming you’re on EKS? If so there are a few other options. In our clusters, we use iam roles for service accounts https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html
@Dagster Bot docs K8s AWS auth and roles
d
j
@Dagster Bot docs K8s Instance configmap
d
v
"One option is to create a secret in your cluster with the AWS_ACCESS_KEY_ID etc. variables, then use env_secrets in run launcher config (or executor config, if it differs per run)."
So I tried this approach. Does this work for launches that go through celery workers? I see that the env for the celery workers has AWS_ACCESS_KEY_ID set, as in
kubectl -n dagster-poc exec -it dagster-celery-workers-redis-7bb85b84fb-h2cnd -- env | grep ACCE
which returns the desired values (here redis is the celery queue), but the exception still persists.
j
When using the celery-k8s executor, the execution of your solids is taking place in ephemeral k8s jobs. I’m actually surprised that those env vars are appearing on the k8s workers. I think they are being populated from somewhere else, rather than by the launcher config.
I misspoke earlier: for celery-k8s, this should go in run config (in the Dagit playground, or in code via a PresetDefinition) rather than in the run launcher config.
v
They are definitely there, but I see your point. I just did this:
runLauncher = {
      type = "CeleryK8sRunLauncher"
      config = {
        celeryK8sRunLauncher = {
          image = {
            repository = "docker.io/dagster/dagster-celery-k8s"
            tag        = "0.12.2"
            pullPolicy = "IfNotPresent"
          }
          envSecrets = [
            {
              name = local.dagster_poc_external_secrets.resources[0].metadata.name
            },
          ]
          annotations = {
            "iam.amazonaws.com/role" = aws_iam_role.dagster_poc.name
          }

          workerQueues = [
            {
              name         = "dagster"
              replicaCount = 1
            },
            {
              name         = "redis"
              replicaCount = 1
            },
          ]
        }
      }
    }
I will remove the reference and see if they disappear ...
j
Ok. And when you get to debugging why your solid compute is missing the credentials: if adding it to your run config doesn’t work, could you run kubectl describe on one of the step jobs? They have the name dagster-job-…. That way you could confirm whether the secret is getting loaded on the container.
But I expect that adding the env secret names in run config will get things working.
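If reading the describe output by eye gets tedious, the same check can be scripted against the job's JSON manifest (for example, saved from `kubectl get job <job-name> -o json`). The following is a rough sketch, not a Dagster utility: the function name `env_from_sources` is hypothetical, and the sample manifest is a hand-written fragment for illustration rather than real cluster output.

```python
import json


def env_from_sources(job):
    """Collect Secret and ConfigMap names referenced via envFrom in a Job manifest."""
    secrets, config_maps = [], []
    containers = job["spec"]["template"]["spec"].get("containers", [])
    for container in containers:
        for source in container.get("envFrom", []):
            if "secretRef" in source:
                secrets.append(source["secretRef"]["name"])
            if "configMapRef" in source:
                config_maps.append(source["configMapRef"]["name"])
    return {"secrets": secrets, "config_maps": config_maps}


# Illustrative fragment only; a real manifest has many more fields.
sample_job = json.loads("""
{
  "spec": {
    "template": {
      "spec": {
        "containers": [
          {
            "name": "step",
            "envFrom": [
              {"configMapRef": {"name": "dagster-pipeline-env"}},
              {"secretRef": {"name": "dagster-poc-secrets"}}
            ]
          }
        ]
      }
    }
  }
}
""")

print(env_from_sources(sample_job))
```

An empty `secrets` list for a step job would suggest that the secret name never reached the executor config for that run.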
v
Do you have an example of the run config secrets handy?
from dagster import solid

@solid(
    tags={
        'dagster-celery/queue': 'redis',
        'dagster-k8s/config': {
            'container_config': {
                'resources': {
                    'requests': {'cpu': '250m', 'memory': '64Mi'},
                    'limits': {'cpu': '500m', 'memory': '2560Mi'},
                },
            },
            'pod_template_spec_metadata': {
                'annotations': {'iam.amazonaws.com/role': 'dagster-poc-20211014204833791300000001'},
            },
        },
    },
)
def not_much():
    return
Like here, right? Any example?
j
Run config is separate from what’s specified in tags. It can be defaulted in code, and overridden at launch time. Here are general docs about how it can be provided, and I’ll find an example of configuring the celery-k8s executor (which you may already have) https://docs.dagster.io/concepts/configuration/config-schema#providing-run-configuration
v
"I’m actually surprised that those env vars are appearing on the k8s workers"
So I went from
"runLauncher":
                  "config":
                    "celeryK8sRunLauncher":
                      "annotations":
                        "iam.amazonaws.com/role": "dagster-poc-20211014204833791800000002"
                      "envSecrets":
                      - "name": "dagster-poc-secrets"
                      "image":
                        "pullPolicy": "IfNotPresent"
                        "repository": "docker.io/dagster/dagster-celery-k8s"
                        "tag": "0.12.2"
                      "workerQueues":
                      - "name": "dagster"
                        "replicaCount": 1
                      - "name": "redis"
                        "replicaCount": 1
                  "type": "CeleryK8sRunLauncher"
to
"runLauncher":
                  "config":
                    "celeryK8sRunLauncher":
                      "annotations":
                        "iam.amazonaws.com/role": "dagster-poc-20211014204833791800000002"
                      "image":
                        "pullPolicy": "IfNotPresent"
                        "repository": "docker.io/dagster/dagster-celery-k8s"
                        "tag": "0.12.2"
                      "workerQueues":
                      - "name": "dagster"
                        "replicaCount": 1
                      - "name": "redis"
                        "replicaCount": 1
                  "type": "CeleryK8sRunLauncher"
j
Run config for the celery-k8s executor looks like
execution:
  celery-k8s:
    config:
      job_image: 'my_repo.com/image_name:latest'
      job_namespace: 'some-namespace'
      broker: 'pyamqp://guest@localhost//'  # Optional[str]: The URL of the Celery broker
      backend: 'rpc://' # Optional[str]: The URL of the Celery results backend
      include: ['my_module'] # Optional[List[str]]: Modules every worker should import
      ...
      config_source: # Dict[str, Any]: Any additional parameters to pass to the
          #...       # Celery workers. This dict will be passed as the `config_source`
          #...       # argument of celery.Celery().
https://docs.dagster.io/_apidocs/libraries/dagster-celery-k8s#dagster_celery_k8s.celery_k8s_job_executor
So you could add env_secrets to this config.
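When the run config is assembled in code, adding the secret names can be done without mutating a shared base config. This is a small sketch under assumptions: `with_env_secrets` is a hypothetical helper (not part of Dagster), and `my-aws-secret` is a placeholder for a Secret that already exists in the job namespace.

```python
import copy


def with_env_secrets(run_config, secret_names):
    """Return a copy of run_config with secret_names added under the
    celery-k8s executor's env_secrets list (duplicates are skipped)."""
    updated = copy.deepcopy(run_config)
    executor_config = (
        updated.setdefault("execution", {})
        .setdefault("celery-k8s", {})
        .setdefault("config", {})
    )
    existing = executor_config.setdefault("env_secrets", [])
    for name in secret_names:
        if name not in existing:
            existing.append(name)
    return updated


# Placeholder values for illustration only.
base = {"execution": {"celery-k8s": {"config": {"job_namespace": "some-namespace"}}}}
print(with_env_secrets(base, ["my-aws-secret"]))
```

Working on a deep copy keeps the base config reusable for runs that should not receive the secret.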
v
As in, I removed the envSecrets and there are now no results from
kubectl -n dagster-poc exec -it dagster-celery-workers-redis-697f8bb695-jtfg9  -- env | grep ACCE
j
You’re right! I wasn’t aware that the celery deployment was that fancy, sorry to send you down the wrong rabbit hole there
v
No issues. But that suggests that we could always pull in the env from there rather than going the run-config route. That said, I am testing this:
resources:
  io_manager:
    config:
      s3_bucket: "loom-dev-null"
      s3_prefix: "dagster-poc-"
execution:
  celery-k8s:
    config:
      job_namespace: 'dagster-poc'
      env_config_maps:
        - "dagster-pipeline-env"
      env_secrets:
        - "dagster-poc-secrets"
j
Agreed
v
Yep, awesome, that went further. It got the secrets; of course, now there is a permissions issue. This does point to the SA and role setup.
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadBucket operation: Forbidden