I’m trying to run through the Walkthrough re: getting dagster... | dagster #deployment-kubernetes

Matt Callaway

05/28/2021, 12:44 PM

I’m trying to run through the Walkthrough re: getting dagster deployed via helm to k8s. https://docs.dagster.io/deployment/guides/kubernetes/deploying-with-helm I’m running k8s on Docker for Mac. Everything appears to be working, but Dagit shows the workflow with no configuration. I follow my nose to let it write a default config and it comes up with this:

resources:
  io_manager:
    config:
      s3_bucket: ""
solids:
  multiply_the_word:
    config:
      factor: 0
    inputs:
      word: ""

I launch the execution and it fails, saying “” is an illegal bucket name. I fill in “test-bucket” and it fails with “Unable to locate credentials”. This is of course local to my mac, so there shouldn’t be any credentials or any attempt to touch s3. It’s a walkthrough. Why would the walkthrough expect to use a real s3 bucket? Shouldn’t a demo expect to use the `fs_io_manager`? Can someone provide guidance on:
• How to get the walkthrough to work?
• How to specify the difference between a local dev/demo instance and a “real” instance on AWS?
• How to deploy changes to config and code in a k8s environment?

johann

05/28/2021, 12:57 PM

Hi Matt, thank you for sharing your experience getting set up. I think your concerns point to gaps in documentation and maybe in some features, but I do want to give some framing: generally, we haven’t built the system or written the docs with the expectation that users do local testing using a kubernetes cluster. If you start Dagit as just a regular process on your machine, it of course works to use the `fs_io_manager`. Generally, the helm/kubernetes part of the system serves as an option for users to productionize their dagster deployment.

Matt Callaway

05/28/2021, 1:03 PM

That makes sense, but I’d say that k8s is intended for “cloud native” infrastructure, meaning “It should run in any kubernetes deployment”, which could include k8s on mac, or GCP, or certainly “don’t expect AWS”.

Matt Callaway

05/28/2021, 1:04 PM

My most important initial goal with dagster is “run the same workflow on my mac as I would in the cloud” and to that end I’m trying to understand the “environment” or “infrastructural” set up parts as soon as I can. Any guidance would be really appreciated.

johann

05/28/2021, 1:09 PM

As you point out, there definitely can be utility in running kubernetes locally. For io_managers, there are a few options:
• If you are using the standard k8s deployment (non-celery), then by default the full pipeline run takes place within one pod in a single process. In this situation, `fs_io_manager` and `mem_io_manager` both work because there’s no isolation. If you use the multiprocess executor, the steps will still be within one pod with a shared file system, so `fs_io_manager` works.
• If you’re using an executor that isolates each solid in its own pod (such as `celery_k8s_job_executor` or the new `k8s_job_executor`), then `fs_io_manager` won’t work because pods generally won’t have access to a shared file system. You can set up a shared volume, but that’s generally not recommended for production.
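
The multiprocess case above is selected through run config. A minimal sketch, assuming the selected mode already maps `io_manager` to `fs_io_manager` in the Python pipeline definition (not shown here); the `max_concurrent` value is illustrative:

```yaml
# Run config fragment (pasted in the dagit playground), not values.yaml.
# Assumes the selected mode uses fs_io_manager: mem_io_manager cannot pass
# outputs between the separate step processes.
execution:
  multiprocess:
    config:
      max_concurrent: 2  # illustrative value
```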

Matt Callaway

05/28/2021, 1:12 PM

I’m using the standard k8s deployment, so `fs_io_manager` or `mem_io_manager` should work. How do I make use of them? If I update the config and change

resources:
  io_manager:
    config:
      s3_bucket: "test-bucket"

to something like:

resources:
  io_manager:
    config:
      fs_io_manager:
         ...

it shows a warning that it expects s3_bucket… as if fs_io_manager isn’t present as an option. How do I make it use one of these other IO managers?

johann

05/28/2021, 1:12 PM

“It should run in any kubernetes deployment”, which could include k8s on mac, or GCP

Agreed. We have io_managers for gcs and s3. For local k8s, some users use minio

johann

05/28/2021, 1:35 PM

How do I make it use one of these other IO managers?

This is confusing: the io_managers are defined as resources, which are selected using the pipeline mode. In the dagit playground there’s a mode dropdown above where you were writing run config.

johann

05/28/2021, 1:37 PM

The pipeline definition is here: https://github.com/dagster-io/dagster/blob/master/examples/deploy_k8s/example_project/example_repo/repo.py#L28-L59
The `default` mode sets the io_manager to s3. As you’ve pointed out, this isn’t ideal; it’d be great if you could file a gh issue to fix that. The `test` mode leaves the io_manager at the system default, `mem_io_manager`.
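
With `test` selected in the mode dropdown, the run config would presumably shrink to just the solid section, since `mem_io_manager` takes no config. A sketch, assuming the `test` mode defines no other configurable resources; the `factor` and `word` values are illustrative:

```yaml
# Run config for the "test" mode: no resources section is needed
# because the default mem_io_manager has nothing to configure.
solids:
  multiply_the_word:
    config:
      factor: 2       # illustrative value
    inputs:
      word: "test"    # illustrative value
```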

Matt Callaway

05/28/2021, 1:46 PM

Ok yes, switching to “test” mode allows the pipeline to run. Oddly, I launch the run and it shows Success, but it looks like it’s still “doing something”. The run says it took 0.241s but the “timeline” view is still in motion as if it’s still doing something.

johann

05/28/2021, 1:49 PM

Sounds like a bug; could you go ahead and file that as well?

Matt Callaway

05/28/2021, 1:49 PM

Not really sure what that was about. I navigated away from that view and launched a fresh run and it looks right. Finished with Success in 0.200s. It seems to succeed, though the steps don’t turn green, and I’m not seeing it log any output.

Matt Callaway

05/28/2021, 1:50 PM

Anyway, seeing how the “test” mode selects IO manager is helpful. Thanks for that.

Matt Callaway

05/28/2021, 1:50 PM

I’ll explore minio as well.

johann

05/28/2021, 1:54 PM

My most important initial goal with dagster is “run the same workflow on my mac as I would in the cloud” and to that end I’m trying to understand the “environment” or “infrastructural” set up parts as soon as I can

Overall our approach here is to give you knobs on each part of the system that interacts with environment/infra, so that you can choose the simple approach (e.g. mem_io_manager here) when possible and use the more complicated one when necessary

Matt Callaway

05/28/2021, 1:56 PM

That’s an attractive approach. It seems apparent to me already, in these early stages, that it’s easy for me to do easy things (I have a pipeline running in a local `pip installed` dagit already), and my imagination sees where to go in moving that easy thing into a more complex “real” infrastructure. The difficulty is in finding examples that help me get there. Having a “cookbook” of examples would be really helpful.

johann

05/28/2021, 2:04 PM

Gotcha!

How to specify the difference between a local dev/demo instance and a “real” instance on AWS?

We do have a large set of knobs that have to be turned. The two main ones to consider are the instance (`dagster.yaml`), which controls system-wide settings, and presets/modes, which control individual pipelines. The defaults for the instance are good for local development, and if you’re using helm we set up a production dagster.yaml for you (it uses postgres for storage, the kubernetes run launcher, etc.). Pipelines need multiple presets and modes so they can have easy local execution (in-process or multiprocess executor, mem or fs storage), plus whatever other options you need for production.

johann

05/28/2021, 2:08 PM

How to deploy changes to config and code in a k8s environment?

Sorry if this wasn’t what you were asking: `helm upgrade` is how you can deploy new changes. When you’re working on dagster pipeline code, you’ll want a deploy process that builds a new image (with a new tag) and runs `helm upgrade` with the new image tag. Dagit and other parts of the system don’t need to change when you update your pipeline code; you only need to change the image for your user-deployments.
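
As a sketch of the override file such a deploy process might pass to `helm upgrade`, assuming the chart’s `dagster-user-deployments` layout; the deployment name, image repository and tag are hypothetical:

```yaml
# Override file for helm upgrade; only the user-deployment image tag changes.
dagster-user-deployments:
  enabled: true
  deployments:
    - name: "my-user-code"                            # hypothetical name
      image:
        repository: "docker.io/example/my-user-code"  # hypothetical repository
        tag: "2021-05-28-1"                           # illustrative tag from the new build
        pullPolicy: Always
      dagsterApiGrpcArgs:
        - "-f"
        - "/example_project/example_repo/repo.py"
      port: 3030
```

Helm replaces lists wholesale rather than merging them, so the whole `deployments` entry is repeated here even though only the tag differs.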

Matt Callaway

05/28/2021, 2:18 PM

I installed minio and have it listening. So I wanted to set the aws credentials for my now running k8s install. I updated `values.yaml` to make changes:

env:
  AWS_ACCESS_KEY_ID: minioadmin
  AWS_SECRET_ACCESS_KEY: minioadmin

And ran the helm upgrade. I’m now switching back from “test” to “default”, and launching a run. It shows failure, but the logs include no errors.

Matt Callaway

05/28/2021, 2:19 PM

So the dagster.yaml lives in the container or is that within the values.yaml?

johann

05/28/2021, 2:24 PM

It shows failure, but the logs include no errors.

This is really strange. Would it be possible for you to send the logs from that `dagster-run-…` job?

johann

05/28/2021, 2:26 PM

So the dagster.yaml lives in the container or is that within the values.yaml?

In the helm case, we generate it based on your values.yaml. https://github.com/dagster-io/dagster/blob/master/helm/dagster/templates/configmap-instance.yaml

Matt Callaway

05/28/2021, 2:30 PM

Looking for logs

johann

05/28/2021, 2:36 PM

It seems like you’re running into a couple of strange behaviors with the event log. I haven’t tried running k8s in docker on mac; it might be possible that you’re hitting strange behavior with dagit websockets or something like that.

johann

05/28/2021, 2:39 PM

A diagnostic that could be helpful is `dagster debug export <run ID> output_file.gzip` (as an exec to the dagit pod). That will include the raw events from the database; if you share it with me I could check if we’re also missing the events.

johann

05/28/2021, 2:40 PM

(`kubectl exec` and `kubectl cp` are useful here, lmk if you need any help)

Matt Callaway

05/28/2021, 2:42 PM

`kubectl get pods` shows me a set of 14 pods named `dagster-run-…`. Iterating over them with `kubectl logs $POD`, I see a few different sorts of error. This one I was expecting:

botocore.exceptions.NoCredentialsError: Unable to locate credentials

as I’m trying to supply creds for my new minio setup. But then also there’s

2021-05-28 14:17:12 - dagster - ERROR - example_pipe - 06d59190-0d41-4523-b2aa-f03b86ac185e - 1 - PIPELINE_FAILURE - Execution of pipeline "example_pipe" failed. An exception was thrown during execution.

dagster.core.errors.DagsterResourceFunctionError: Error executing resource_fn on ResourceDefinition io_manager

Matt Callaway

05/28/2021, 2:43 PM

{"__class__": "DagsterEvent", "event_specific_data": {"__class__": "PipelineFailureData", "error": {"__class__": "SerializableErrorInfo", "cause": {"__class__": "SerializableErrorInfo", "cause": null, "cls_name": "NoCredentialsError", "message": "botocore.exceptions.NoCredentialsError: Unable to locate credentials\n", "stack": ["  File \"/usr/local/lib/python3.7/site-packages/dagster/core/errors.py\", line 184, in user_code_error_boundary\n    yield\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/resources_init.py\", line 281, in single_resource_event_generator\n    resource_or_gen = resource_def.resource_fn(context)\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster_aws/s3/io_manager.py\", line 114, in s3_pickle_io_manager\n    pickled_io_manager = PickledObjectS3IOManager(s3_bucket, s3_session, s3_prefix=s3_prefix)\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster_aws/s3/io_manager.py\", line 17, in __init__\n    self.s3.head_bucket(Bucket=self.bucket)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/client.py\", line 386, in _api_call\n    return self._make_api_call(operation_name, kwargs)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/client.py\", line 692, in _make_api_call\n    operation_model, request_dict, request_context)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/client.py\", line 711, in _make_request\n    return self._endpoint.make_request(operation_model, request_dict)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/endpoint.py\", line 102, in make_request\n    return self._send_request(request_dict, operation_model)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/endpoint.py\", line 132, in _send_request\n    request = self.create_request(request_dict, operation_model)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/endpoint.py\", line 116, in create_request\n    operation_name=operation_model.name)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/hooks.py\", line 356, in emit\n    return self._emitter.emit(aliased_event_name, **kwargs)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/hooks.py\", line 228, in emit\n    return self._emit(event_name, kwargs)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/hooks.py\", line 211, in _emit\n    response = handler(**kwargs)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/signers.py\", line 90, in handler\n    return self.sign(operation_name, request)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/signers.py\", line 162, in sign\n    auth.add_auth(request)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/auth.py\", line 373, in add_auth\n    raise NoCredentialsError()\n"]}, "cls_name": "DagsterResourceFunctionError", "message": "dagster.core.errors.DagsterResourceFunctionError: Error executing resource_fn on ResourceDefinition io_manager\n", "stack": ["  File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/api.py\", line 762, in pipeline_execution_iterator\n    for event in pipeline_context.executor.execute(pipeline_context, execution_plan):\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/executor/in_process.py\", line 50, in execute\n    output_capture=pipeline_context.output_capture,\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/api.py\", line 836, in __iter__\n    yield from self.execution_context_manager.prepare_context()\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/utils/__init__.py\", line 430, in generate_setup_events\n    obj = next(self.generator)\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/context_creation_pipeline.py\", line 282, in execution_context_event_generator\n    yield from resources_manager.generate_setup_events()\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/utils/__init__.py\", line 430, in generate_setup_events\n    obj = next(self.generator)\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/resources_init.py\", line 230, in resource_initialization_event_generator\n    pipeline_def_for_backwards_compat=pipeline_def_for_backwards_compat,\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/resources_init.py\", line 182, in _core_resource_initialization_event_generator\n    raise dagster_user_error\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/resources_init.py\", line 153, in _core_resource_initialization_event_generator\n    for event in manager.generate_setup_events():\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/utils/__init__.py\", line 430, in generate_setup_events\n    obj = next(self.generator)\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/resources_init.py\", line 298, in single_resource_event_generator\n    raise dagster_user_error\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/resources_init.py\", line 292, in single_resource_event_generator\n    \"Resource generator {name} must yield one item.\".format(name=resource_name)\n", "  File \"/usr/local/lib/python3.7/contextlib.py\", line 130, in __exit__\n    self.gen.throw(type, value, traceback)\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/errors.py\", line 193, in user_code_error_boundary\n    ) from e\n"]}}, "event_type_value": "PIPELINE_FAILURE", "logging_tags": {}, "message": "Execution of pipeline \"example_pipe\" failed. An exception was thrown during execution.", "pid": 1, "pipeline_name": "example_pipe", "solid_handle": null, "step_handle": null, "step_key": null, "step_kind_value": null}

Matt Callaway

05/28/2021, 2:43 PM

also:

johann

05/28/2021, 2:47 PM

Sorry, I should have clarified: the screenshot you sent had log messages with `dagster-run-06d59190-…`

Matt Callaway

05/28/2021, 2:48 PM

logs.txt

johann

05/28/2021, 2:51 PM

Thanks, and could you send the dagit debug export for that run as well?

Matt Callaway

05/28/2021, 2:52 PM

How does that work? Sorry.

kc exec dagster-run-06d59190-0d41-4523-b2aa-f03b86ac185e-xhg7f -- bash
error: cannot exec into a container in a completed pod; current phase is Succeeded

Matt Callaway

05/28/2021, 2:54 PM

aha:

kc exec dagster-dagit-67cdffbdd8-zrnhj -- dagster debug export a457f2ba outfile.gzip

Matt Callaway

05/28/2021, 2:55 PM

outfile.gzip

Matt Callaway

05/28/2021, 2:58 PM

At this point I have a handful of failed runs with me trying different things. They are all missing AWS creds I think. That outfile.gzip maps to this config:

resources:
  io_manager:
    config:
      s3_bucket: test-bucket
  s3:
    config:
      endpoint_url: http://localhost:9000
      profile_name: minio
      region_name: us-east-1
solids:
  multiply_the_word:
    config:
      factor: 0
    inputs:
      word: ''

where the profile_name is probably not found.

Matt Callaway

05/28/2021, 2:59 PM

Here’s a prior run where that wasn’t present:

run1.gzip

Matt Callaway

05/28/2021, 3:00 PM

strange that those don’t appear in the dagit UI.

Matt Callaway

05/28/2021, 3:04 PM

Here I show that I have supplied AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to the environment:

> kubectl get configmap dagster-dagster-user-deployments-k8s-dagster-lia-user-env -o json | jq '.data'
{
  "AWS_ACCESS_KEY_ID": "minioadmin",
  "AWS_SECRET_ACCESS_KEY": "minioadmin",
  "DAGSTER_HOME": "/opt/dagster/dagster_home",
  "DAGSTER_K8S_INSTANCE_CONFIG_MAP": "dagster-dagster-user-deployments-instance",
  "DAGSTER_K8S_PG_PASSWORD_SECRET": "dagster-postgresql-secret",
  "DAGSTER_K8S_PIPELINE_RUN_ENV_CONFIGMAP": "dagster-dagster-user-deployments-pipeline-env",
  "DAGSTER_K8S_PIPELINE_RUN_NAMESPACE": "dagster"
}

johann

05/28/2021, 3:07 PM

This is strange, when I load your debug file I can see the errors

johann

05/28/2021, 3:09 PM

There might be something going on with websockets… it sounds like you haven’t done as many runs in dagit outside of k8s but did you notice anything there?

Matt Callaway

05/28/2021, 3:10 PM

No. When I click the dagit button to go to raw logs, it just spins, loading… Trying to find some logs on what dagit is doing, but `kubectl logs dagster-dagit-67cdffbdd8-zrnhj` doesn’t seem to have any “live updates”.

Matt Callaway

05/28/2021, 3:18 PM

Trying to scale the dagit pod to 0 then back to 1 to restart. Waiting for it to come back up.

johann

05/28/2021, 3:19 PM

When I click the dagit button to go to raw logs, it just spins, loading…

This is an easy pitfall: the raw logs get stored by the computeLogManager (configured in values.yaml), and by default they’re not accessible by dagit. It needs to use s3/gcs/minio again for those logs.
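
A sketch of what pointing the compute log manager at an S3-compatible store might look like in values.yaml. The key names (`S3ComputeLogManager`, `s3ComputeLogManager`, `bucket`, `prefix`, `endpointUrl`) are assumptions based on the chart’s commented example and should be checked against your chart version; the bucket, prefix and endpoint values are illustrative:

```yaml
computeLogManager:
  type: S3ComputeLogManager
  config:
    s3ComputeLogManager:
      bucket: "test-bucket"             # illustrative bucket
      prefix: "compute-logs"            # illustrative prefix
      endpointUrl: "http://minio:9000"  # hypothetical in-cluster minio address
```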

Matt Callaway

05/28/2021, 3:20 PM

I’ve got a fresh dagit pod up, and running the workflow fails on lack of credentials, but I can see the logs again.

Matt Callaway

05/28/2021, 3:20 PM

I think my only problem at this point is why it can’t get credentials.

johann

05/28/2021, 3:20 PM

The error message? Or the raw logs?

Matt Callaway

05/28/2021, 3:21 PM

The error message is now visible in logs:

botocore.exceptions.NoCredentialsError: Unable to locate credentials

Matt Callaway

05/28/2021, 3:21 PM

I’ll just not expect the “raw logs”.

Matt Callaway

05/28/2021, 3:25 PM

Is this not sufficient in `values.yaml` to provide the S3 credentials:

dagster-user-deployments:
  enabled: true
  deployments:
    - name: "k8s-dagster-lia"
      image:
        repository: "docker.io/dagster/user-code-example"
        tag: latest
        pullPolicy: Always
      dagsterApiGrpcArgs:
        - "-f"
        - "/example_project/example_repo/repo.py"
      port: 3030
      env:
        AWS_ACCESS_KEY_ID: minioadmin
        AWS_SECRET_ACCESS_KEY: minioadmin

johann

05/28/2021, 3:25 PM

I’m looking back at https://dagster.slack.com/archives/CCCR6P2UR/p1611970383144800 as an example of other users using minio

johann

05/28/2021, 3:28 PM

Ah- it’s a bit misleading but those envs won’t be used for the run. Instead you can configure that here https://github.com/dagster-io/dagster/blob/master/helm/dagster/values.yaml#L388-L395

johann

05/28/2021, 3:29 PM

And the configmap can be created under `extraManifests`

johann

05/28/2021, 3:30 PM

user-deployments create servers that provide dagit with the metadata about your pipelines. In the default k8s deployment, the actual execution takes place in a separate k8s job

Matt Callaway

05/28/2021, 3:32 PM

It’s not clear to me how an envConfigMap is supposed to look.

Matt Callaway

05/28/2021, 3:33 PM

Also the docs explicitly say `env` works:

To enable Dagster to connect to S3, provide AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables via the env, envConfigMaps, or envSecrets fields under userDeployments in values.yaml

Matt Callaway

05/28/2021, 3:37 PM

The short snippet in the values.yaml comments shows:

envConfigMaps:
         - name: config-map

as compared to the referenced k8s doc that has:

apiVersion: v1
kind: ConfigMap
metadata:
  name: special-config
  namespace: default
data:
  SPECIAL_LEVEL: very
  SPECIAL_TYPE: charm

So how should values.yaml look?

envConfigMaps:
         - name: config-map
           data:
             AWS_ACCESS_KEY_ID: minioadmin
             AWS_SECRET_ACCESS_KEY: minioadmin

johann

05/28/2021, 3:44 PM

That’s an oversight in the docs, thank you for pointing it out

Matt Callaway

05/28/2021, 3:45 PM

What does a complete `envConfigMaps` entry look like?

Varun

05/28/2021, 3:50 PM

Hi @Matt Callaway, you create a Kubernetes configmap as usual.

apiVersion: v1
kind: ConfigMap
metadata:
  name: special-config
  namespace: default
data:
  SPECIAL_LEVEL: very
  SPECIAL_TYPE: charm

and then specify its name in the `envConfigMaps` section of `values.yaml` like this.

envConfigMaps:
  - name: special-config

johann

05/28/2021, 3:53 PM

And just to clarify the above, the configmap definition can go in `extraManifests` of the values.yaml (so it gets created alongside the rest of the k8s resources) and then the second block goes in

runLauncher:
  type: K8sRunLauncher
  config:
    k8sRunLauncher:
      envConfigMaps:
        - name: special-config

Matt Callaway

05/28/2021, 4:01 PM

Thanks @Varun and @johann, though I must admit this is not at all clear. The default `values.yaml` says that Config Maps are made from the `env` section:

  # Additional environment variables to set.
  # A Kubernetes ConfigMap will be created with these environment variables. See:
  # https://kubernetes.io/docs/concepts/configuration/configmap/
  #
  # Example:
  #
  # env:
  #   ENV_ONE: one
  #   ENV_TWO: two

But then @johann says to use `extraManifests`, which I would guess to look like this:

dagster-user-deployments:
  enabled: true
  deployments:
    - name: "k8s-dagster-lia"
      image:
        repository: "docker.io/dagster/user-code-example"
        tag: latest
        pullPolicy: Always
      dagsterApiGrpcArgs:
        - "-f"
        - "/example_project/example_repo/repo.py"
      port: 3030
      envConfigMaps:
         - name: aws-config-map

extraManifests:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: aws-config-map
      namespace: dagster
    data:
      AWS_ACCESS_KEY_ID: minioadmin
      AWS_SECRET_ACCESS_KEY: minioadmin

Note that I’ve used a `dagster` namespace that I created with `kubectl`, so I think that’s right, given that `--namespace dagster` shows the right services:

> kubectl get services --namespace dagster
NAME                          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
dagster-dagit                 ClusterIP   10.106.3.199     <none>        80/TCP     18h
dagster-postgresql            ClusterIP   10.100.82.214    <none>        5432/TCP   18h
dagster-postgresql-headless   ClusterIP   None             <none>        5432/TCP   18h
k8s-dagster-lia               ClusterIP   10.110.102.175   <none>        3030/TCP   18h

But then @johann suggests that I use `runLauncher` too? So then does that mean my example looks like this?

dagster-user-deployments:
  enabled: true
  deployments:
    - name: "k8s-dagster-lia"
      image:
        repository: "docker.io/dagster/user-code-example"
        tag: latest
        pullPolicy: Always
      dagsterApiGrpcArgs:
        - "-f"
        - "/example_project/example_repo/repo.py"
      port: 3030
      envConfigMaps:
         - name: aws-config-map

extraManifests:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: aws-config-map
      namespace: dagster
    data:
      AWS_ACCESS_KEY_ID: minioadmin
      AWS_SECRET_ACCESS_KEY: minioadmin

runLauncher:
  type: K8sRunLauncher
  config:
    k8sRunLauncher:
      envConfigMaps:
        - name: aws-config-map

Matt Callaway

05/28/2021, 4:02 PM

(And I also note that I really should be using `secrets` instead of Config Maps… but I’ll save that for later.)
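
A sketch of that Secret-based variant, assuming the run launcher’s `envSecrets` field takes the same list-of-names shape as `envConfigMaps`; the secret name `aws-secret` is hypothetical:

```yaml
extraManifests:
  - apiVersion: v1
    kind: Secret
    metadata:
      name: aws-secret      # hypothetical name
      namespace: dagster
    type: Opaque
    stringData:             # plain-text input; Kubernetes stores it base64-encoded
      AWS_ACCESS_KEY_ID: minioadmin
      AWS_SECRET_ACCESS_KEY: minioadmin

runLauncher:
  type: K8sRunLauncher
  config:
    k8sRunLauncher:
      envSecrets:
        - name: aws-secret  # must match the Secret's metadata.name
```

Defining the Secret through `extraManifests` still leaves the values in plain text in values.yaml, so for anything beyond a local demo the Secret would more likely be created outside the chart and only referenced by name.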

Matt Callaway

05/28/2021, 4:12 PM

Seems to be getting closer:

botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "http://localhost:9000/test-bucket"

Matt Callaway

05/28/2021, 4:13 PM

This is now probably a network thing, right? As the containers aren’t going to be able to “see” localhost port 9000.

Matt Callaway

05/28/2021, 4:13 PM

From my command line I can see that minio is working:

> aws --endpoint-url http://localhost:9000 s3 ls s3://test-bucket/
2021-05-28 09:12:01         29 date1.txt

johann

05/28/2021, 4:15 PM

You’re very right that it isn’t clear; I appreciate you going through it. It’s definitely pointed out a lot of things for us to do already, though if you’d like to sum them up in a bit of feedback or create gh issues, that would be excellent.

Matt Callaway

05/28/2021, 4:16 PM

I can follow up with a feedback issue if you think it would be helpful. I think the last thing for me to do here is to somehow expose localhost port 9000 inside the pod. Is there a known solution for that?

Matt Callaway

05/28/2021, 4:18 PM

Oh maybe I just use `http://host.docker.internal:9000`

johann

05/28/2021, 4:18 PM

Yes- you need to create a service for the pods to reach minio with. An example here https://dagster.slack.com/archives/CCCR6P2UR/p1611970383144800 was that a user ran minio inside k8s and thus just had a normal service to point to it

Matt Callaway

05/28/2021, 4:22 PM

Success!

resources:
  io_manager:
    config:
      s3_bucket: test-bucket
  s3:
    config:
      endpoint_url: http://host.docker.internal:9000
      region_name: us-east-1
solids:
  multiply_the_word:
    config:
      factor: 0
    inputs:
      word: ''

Matt Callaway

05/28/2021, 4:22 PM

Last question for this thread… This configuration YAML I just posted, should that also go into `values.yaml`?

johann

05/28/2021, 4:23 PM

Nice! I think another way would be to use the minio helm chart so that you don’t have to use the host.docker.internal address, but it’s great that this is working. We should add documentation for getting set up with minio

johann

05/28/2021, 4:24 PM

should that also go into `values.yaml`?

You should have a separate file (it can be named `values.yaml` or otherwise) that stores your overrides of our default values. You specify your file when you run `helm upgrade -f <file>`.

Matt Callaway

05/28/2021, 4:26 PM

Yes, I’ve been pasting my values.yaml above, but so far it has not included a `resources:` section.

Matt Callaway

05/28/2021, 4:26 PM

Does the default run configuration go in the `dagit:` section of `values.yaml`?

Matt Callaway

05/28/2021, 4:27 PM

If that’s correct, then my full working config looks like this:

dagster-user-deployments:
  enabled: true
  deployments:
    - name: "k8s-dagster-lia"
      image:
        repository: "docker.io/dagster/user-code-example"
        tag: latest
        pullPolicy: Always
      dagsterApiGrpcArgs:
        - "-f"
        - "/example_project/example_repo/repo.py"
      port: 3030
      envConfigMaps:
         - name: aws-config-map

extraManifests:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: aws-config-map
      namespace: dagster
    data:
      AWS_ACCESS_KEY_ID: minioadmin
      AWS_SECRET_ACCESS_KEY: minioadmin

runLauncher:
  type: K8sRunLauncher
  config:
    k8sRunLauncher:
      envConfigMaps:
        - name: aws-config-map

dagit:
  resources:
    io_manager:
      config:
        s3_bucket: test-bucket
    s3:
      config:
        endpoint_url: http://host.docker.internal:9000
        region_name: us-east-1

  solids:
    multiply_the_word:
      config:
        factor: 0
      inputs:
        word: ''

Matt Callaway

05/28/2021, 4:27 PM

Is it correct to include `resources` and `solids` inside the `dagit` section?

johann

05/28/2021, 4:32 PM

Ah sorry, I misunderstood. The resources yaml you posted is pipeline run config: it goes either in your pipeline definition (in the python code) as a preset, or you can specify it at run time in the dagit playground.

Matt Callaway

05/28/2021, 4:34 PM

Ok. Great to see it working! Thank you for spending so much time with me on this. I will collect my notes and create a github issue suggesting some documentation updates and the creation of a “cookbook” of user stories.

Matt Callaway

05/28/2021, 5:50 PM

https://github.com/dagster-io/dagster/issues/4233 https://github.com/dagster-io/dagster/issues/4234
