#deployment-kubernetes

Matt Callaway

05/28/2021, 12:44 PM
I’m trying to run through the Walkthrough re: getting dagster deployed via helm to k8s. https://docs.dagster.io/deployment/guides/kubernetes/deploying-with-helm I’m running k8s on Docker for Mac. Everything appears to be working, but Dagit shows the workflow with no configuration. I follow my nose to let it write a default config and it comes up with this:
Copy code
resources:
  io_manager:
    config:
      s3_bucket: ""
solids:
  multiply_the_word:
    config:
      factor: 0
    inputs:
      word: ""
I launch the execution and it fails saying “” is an illegal bucket name. I fill in “test-bucket” and it fails with “Unable to locate credentials”. This is of course local to my mac, so there shouldn’t be any credentials or any attempt to touch s3. It’s a walkthrough. Why would the walkthrough expect to use a real s3 bucket? Shouldn’t a demo expect to use the `fs_io_manager`? Can someone provide guidance on:
• How to get the walkthrough to work?
• How to specify the difference between a local dev/demo instance and a “real” instance on AWS?
• How to deploy changes to config and code in a k8s environment?

johann

05/28/2021, 12:57 PM
Hi Matt, thank you for sharing your experience getting set up. I think your concerns point to gaps in documentation and maybe in some features, but I do want to give some framing: generally, we haven’t built the system or written the docs with the expectation that users do local testing using a kubernetes cluster. If you start Dagit as just a regular process on your machine, it of course works to use the `fs_io_manager`. Generally the helm/kubernetes part of the system serves as an option for users to productionize their dagster deployment.

Matt Callaway

05/28/2021, 1:03 PM
That makes sense, but I’d say that k8s is intended for “cloud native” infrastructure, meaning “It should run in any kubernetes deployment”, which could include k8s on mac, or GCP, or certainly “don’t expect AWS”.
My most important initial goal with dagster is “run the same workflow on my mac as I would in the cloud” and to that end I’m trying to understand the “environment” or “infrastructural” set up parts as soon as I can. Any guidance would be really appreciated.

johann

05/28/2021, 1:09 PM
As you point out, there definitely can be utility in running kubernetes locally. For io_managers, there are a few options:
• If you are using the standard k8s deployment (non-celery) then by default the full pipeline run takes place within one pod in a single process. In this situation, `fs_io_manager` or `mem_io_manager` work because there’s no isolation. If you use the multiprocess executor, then they’ll still be within a pod with a shared file system so `fs_io_manager` works.
• If you’re using an executor that isolates each solid to its own pod (such as `celery_k8s_job_executor` or the new `k8s_job_executor`), then `fs_io_manager` won’t work because pods generally won’t have access to a shared file system. You can set up a shared volume, but that’s generally not best practice/not recommended for production.

Matt Callaway

05/28/2021, 1:12 PM
I’m using the standard k8s deployment. So `fs_io_manager` or `mem_io_manager` should work. How do I make use of them? If I update the config and change
Copy code
resources:
  io_manager:
    config:
      s3_bucket: "test-bucket"
to something like:
Copy code
resources:
  io_manager:
    config:
      fs_io_manager:
         ...
it shows a warning that it expects s3_bucket… as if fs_io_manager isn’t present as an option. How do I make it use one of these other IO managers?

johann

05/28/2021, 1:12 PM
> “It should run in any kubernetes deployment”, which could include k8s on mac, or GCP
Agreed. We have io_managers for gcs and s3. For local k8s, some users use minio.
> How do I make it use one of these other IO managers?
This is confusing: the io_managers are defined as resources, which are selected using the pipeline mode. In the dagit playground there’s a mode dropdown above where you were writing run config.
The pipeline definition is here https://github.com/dagster-io/dagster/blob/master/examples/deploy_k8s/example_project/example_repo/repo.py#L28-L59 The `default` mode sets the io_manager to s3. As you’ve pointed out this isn’t ideal, it’d be great if you could file a gh issue to fix that. The `test` mode leaves the io_manager as the system default, `mem_io_manager`.
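A minimal sketch of the run config that should be enough once `test` mode is selected in the playground, assuming the `test` mode defines no configurable resources as described above. The `factor` and `word` values are placeholders chosen for illustration:

```yaml
# Run config for the `test` mode: no `resources` section, because the
# default mem_io_manager needs no bucket or credentials.
solids:
  multiply_the_word:
    config:
      factor: 2        # placeholder value
    inputs:
      word: "hello"    # placeholder value
```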

Matt Callaway

05/28/2021, 1:46 PM
Ok yes, switching to “test” mode allows the pipeline to run. Oddly, I launch the run and it shows Success, but it looks like it’s still “doing something”. The run says it took 0.241s but the “timeline” view is still in motion as if it’s still doing something.

johann

05/28/2021, 1:49 PM
Sounds like a bug; could you go ahead and file that as well?

Matt Callaway

05/28/2021, 1:49 PM
Not really sure what that was about. I navigated away from that view and launched a fresh run and it looks right. Finished with Success in 0.200s. It seems to succeed, though the steps don’t turn green, and I’m not seeing it log any output.
Anyway, seeing how the “test” mode selects IO manager is helpful. Thanks for that.
I’ll explore minio as well.

johann

05/28/2021, 1:54 PM
> My most important initial goal with dagster is “run the same workflow on my mac as I would in the cloud” and to that end I’m trying to understand the “environment” or “infrastructural” set up parts as soon as I can
Overall our approach here is to give you knobs on each part of the system that interacts with environment/infra, so that you can choose the simple approach (e.g. mem_io_manager here) when possible and use the more complicated one when necessary

Matt Callaway

05/28/2021, 1:56 PM
That’s an attractive approach. It seems apparent to me already, in these early stages, that it’s easy for me to do easy things (I have a pipeline running in a local `pip installed` dagit already), and my imagination sees where to go in moving that easy thing into a more complex “real” infrastructure. The difficulty is in finding examples that help me get there. Having a “cookbook” of examples would be really helpful.

johann

05/28/2021, 2:04 PM
Gotcha!
> How to specify the difference between a local dev/demo instance and a “real” instance on AWS?
We do have a large set of knobs that have to be turned. The two main ones to consider are the instance (`dagster.yaml`), which controls system-wide settings, and presets/modes, which control individual pipelines. The defaults for the instance are good for local development, and if you’re using helm we set up a production dagster.yaml for you (it uses postgres for storage, kubernetes run launcher, etc). Pipelines need multiple presets and modes so they can have easy local execution (in-process or multiprocess executor, mem or fs storage), plus whatever other options you need for production.
> How to deploy changes to config and code in a k8s environment?
Sorry if this wasn’t what you were asking: `helm upgrade` is how you can deploy new changes. When you’re working on dagster pipeline code, you’ll want a deploy process that builds a new image (with a new tag) and runs `helm upgrade` with the new image tag. Dagit and other parts of the system don’t need to change when you update your pipeline code; you only need to change the image for your user-deployments.
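As a rough sketch of that flow, an override file passed to `helm upgrade -f <file>` might only need to bump the image tag for the user deployment. The deployment name, repository, and tag below are hypothetical; the key layout follows the `dagster-user-deployments` block posted later in this thread:

```yaml
dagster-user-deployments:
  enabled: true
  deployments:
    - name: "my-user-code"                               # hypothetical deployment name
      image:
        repository: "registry.example.com/my-pipelines"  # hypothetical image
        tag: "2021-05-28-1"                              # new tag produced by the deploy process
        pullPolicy: Always
      dagsterApiGrpcArgs:
        - "-f"
        - "/example_project/example_repo/repo.py"
      port: 3030
```

Only the `tag` value changes between deploys, so Dagit and the other chart components keep their existing configuration.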

Matt Callaway

05/28/2021, 2:18 PM
I installed minio and have it listening. So I wanted to set the aws credentials for my now running k8s install. I updated `values.yaml` to make changes:
Copy code
env:
        AWS_ACCESS_KEY_ID: minioadmin
        AWS_SECRET_ACCESS_KEY: minioadmin
And ran the helm upgrade. I’m now switching back from “test” to “default”, and launching a run. It shows failure, but the logs include no errors.
So the dagster.yaml lives in the container or is that within the values.yaml?

johann

05/28/2021, 2:24 PM
> It shows failure, but the logs include no errors.
This is really strange. Would it be possible for you to send the logs from that `dagster-run-…` job?
> So the dagster.yaml lives in the container or is that within the values.yaml?
In the helm case, we generate it based on your values.yaml. https://github.com/dagster-io/dagster/blob/master/helm/dagster/templates/configmap-instance.yaml

Matt Callaway

05/28/2021, 2:30 PM
Looking for logs

johann

05/28/2021, 2:36 PM
It seems like you’re running into a couple of strange behaviors with the event log. I haven’t tried running k8s in docker on mac; it might be possible that you’re hitting strange behavior with dagit websockets or something like that.
A diagnostic that could be helpful is `dagster debug export <run ID> output_file.gzip` (as an exec to the dagit pod). That will include the raw events from the database; if you share it with me I could check if we’re also missing the events. (`kubectl exec` and `kubectl cp` are useful here, lmk if you need any help)

Matt Callaway

05/28/2021, 2:42 PM
`kubectl get pods` shows me a set of 14 pods named `dagster-run-…`. Iterating over them with `kubectl logs $POD` I see a few different sorts of error. This one I was expecting:
Copy code
botocore.exceptions.NoCredentialsError: Unable to locate credentials
as I’m trying to supply creds for my new minio setup. But then also there’s
Copy code
2021-05-28 14:17:12 - dagster - ERROR - example_pipe - 06d59190-0d41-4523-b2aa-f03b86ac185e - 1 - PIPELINE_FAILURE - Execution of pipeline "example_pipe" failed. An exception was thrown during execution.

dagster.core.errors.DagsterResourceFunctionError: Error executing resource_fn on ResourceDefinition io_manager
Copy code
{"__class__": "DagsterEvent", "event_specific_data": {"__class__": "PipelineFailureData", "error": {"__class__": "SerializableErrorInfo", "cause": {"__class__": "SerializableErrorInfo", "cause": null, "cls_name": "NoCredentialsError", "message": "botocore.exceptions.NoCredentialsError: Unable to locate credentials\n", "stack": ["  File \"/usr/local/lib/python3.7/site-packages/dagster/core/errors.py\", line 184, in user_code_error_boundary\n    yield\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/resources_init.py\", line 281, in single_resource_event_generator\n    resource_or_gen = resource_def.resource_fn(context)\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster_aws/s3/io_manager.py\", line 114, in s3_pickle_io_manager\n    pickled_io_manager = PickledObjectS3IOManager(s3_bucket, s3_session, s3_prefix=s3_prefix)\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster_aws/s3/io_manager.py\", line 17, in __init__\n    self.s3.head_bucket(Bucket=self.bucket)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/client.py\", line 386, in _api_call\n    return self._make_api_call(operation_name, kwargs)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/client.py\", line 692, in _make_api_call\n    operation_model, request_dict, request_context)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/client.py\", line 711, in _make_request\n    return self._endpoint.make_request(operation_model, request_dict)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/endpoint.py\", line 102, in make_request\n    return self._send_request(request_dict, operation_model)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/endpoint.py\", line 132, in _send_request\n    request = self.create_request(request_dict, operation_model)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/endpoint.py\", line 116, in create_request\n    operation_name=operation_model.name)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/hooks.py\", line 356, in emit\n    return self._emitter.emit(aliased_event_name, **kwargs)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/hooks.py\", line 228, in emit\n    return self._emit(event_name, kwargs)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/hooks.py\", line 211, in _emit\n    response = handler(**kwargs)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/signers.py\", line 90, in handler\n    return self.sign(operation_name, request)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/signers.py\", line 162, in sign\n    auth.add_auth(request)\n", "  File \"/usr/local/lib/python3.7/site-packages/botocore/auth.py\", line 373, in add_auth\n    raise NoCredentialsError()\n"]}, "cls_name": "DagsterResourceFunctionError", "message": "dagster.core.errors.DagsterResourceFunctionError: Error executing resource_fn on ResourceDefinition io_manager\n", "stack": ["  File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/api.py\", line 762, in pipeline_execution_iterator\n    for event in pipeline_context.executor.execute(pipeline_context, execution_plan):\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/executor/in_process.py\", line 50, in execute\n    output_capture=pipeline_context.output_capture,\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/api.py\", line 836, in __iter__\n    yield from self.execution_context_manager.prepare_context()\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/utils/__init__.py\", line 430, in generate_setup_events\n    obj = next(self.generator)\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/context_creation_pipeline.py\", line 282, in execution_context_event_generator\n    yield from resources_manager.generate_setup_events()\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/utils/__init__.py\", line 430, in generate_setup_events\n    obj = next(self.generator)\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/resources_init.py\", line 230, in resource_initialization_event_generator\n    pipeline_def_for_backwards_compat=pipeline_def_for_backwards_compat,\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/resources_init.py\", line 182, in _core_resource_initialization_event_generator\n    raise dagster_user_error\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/resources_init.py\", line 153, in _core_resource_initialization_event_generator\n    for event in manager.generate_setup_events():\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/utils/__init__.py\", line 430, in generate_setup_events\n    obj = next(self.generator)\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/resources_init.py\", line 298, in single_resource_event_generator\n    raise dagster_user_error\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/execution/resources_init.py\", line 292, in single_resource_event_generator\n    \"Resource generator {name} must yield one item.\".format(name=resource_name)\n", "  File \"/usr/local/lib/python3.7/contextlib.py\", line 130, in __exit__\n    self.gen.throw(type, value, traceback)\n", "  File \"/usr/local/lib/python3.7/site-packages/dagster/core/errors.py\", line 193, in user_code_error_boundary\n    ) from e\n"]}}, "event_type_value": "PIPELINE_FAILURE", "logging_tags": {}, "message": "Execution of pipeline \"example_pipe\" failed. An exception was thrown during execution.", "pid": 1, "pipeline_name": "example_pipe", "solid_handle": null, "step_handle": null, "step_key": null, "step_kind_value": null}
also:

johann

05/28/2021, 2:47 PM
Sorry I should have clarified: the screenshot you sent had log messages with `dagster-run-06d59190-…`

Matt Callaway

05/28/2021, 2:48 PM

johann

05/28/2021, 2:51 PM
Thanks, and could you send the dagit debug export for that run as well?

Matt Callaway

05/28/2021, 2:52 PM
How does that work? Sorry.
Copy code
kc exec dagster-run-06d59190-0d41-4523-b2aa-f03b86ac185e-xhg7f -- bash
error: cannot exec into a container in a completed pod; current phase is Succeeded
aha:
Copy code
kc exec dagster-dagit-67cdffbdd8-zrnhj -- dagster debug export a457f2ba outfile.gzip
At this point I have a handful of failed runs with me trying different things. They are all missing AWS creds I think. That outfile.gzip maps to this config:
Copy code
resources:
  io_manager:
    config:
      s3_bucket: test-bucket
  s3:
    config:
      endpoint_url: http://localhost:9000
      profile_name: minio
      region_name: us-east-1
solids:
  multiply_the_word:
    config:
      factor: 0
    inputs:
      word: ''
where the profile_name is probably not found.
Here’s a prior run where that wasn’t present:
strange that those don’t appear in the dagit UI.
Here I show that I have supplied AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to the environment:
Copy code
> kubectl get configmap dagster-dagster-user-deployments-k8s-dagster-lia-user-env -o json | jq '.data'
{
  "AWS_ACCESS_KEY_ID": "minioadmin",
  "AWS_SECRET_ACCESS_KEY": "minioadmin",
  "DAGSTER_HOME": "/opt/dagster/dagster_home",
  "DAGSTER_K8S_INSTANCE_CONFIG_MAP": "dagster-dagster-user-deployments-instance",
  "DAGSTER_K8S_PG_PASSWORD_SECRET": "dagster-postgresql-secret",
  "DAGSTER_K8S_PIPELINE_RUN_ENV_CONFIGMAP": "dagster-dagster-user-deployments-pipeline-env",
  "DAGSTER_K8S_PIPELINE_RUN_NAMESPACE": "dagster"
}

johann

05/28/2021, 3:07 PM
This is strange, when I load your debug file I can see the errors
There might be something going on with websockets… it sounds like you haven’t done as many runs in dagit outside of k8s but did you notice anything there?

Matt Callaway

05/28/2021, 3:10 PM
No. When I click the dagit button to go to raw logs, it just spins, loading… Trying to find some logs on what dagit is doing, but `kubectl logs dagster-dagit-67cdffbdd8-zrnhj` doesn’t seem to have any “live updates”.
Trying to scale the dagit pod to 0 then back to 1 to restart. Waiting for it to come back up.

johann

05/28/2021, 3:19 PM
> When I click the dagit button to go to raw logs, it just spins, loading…
This is an easy pitfall: the raw logs get stored by the computeLogManager (configured in values.yaml) and by default it’s not accessible by dagit. It needs to use s3/gcs/minio again for those logs.
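A sketch of what pointing the compute log manager at an S3-compatible store could look like in a values override. The key names here are recalled from the chart's commented defaults and should be treated as assumptions to check against the `values.yaml` of the chart version in use; the bucket, prefix, and endpoint are placeholders:

```yaml
computeLogManager:
  type: S3ComputeLogManager
  config:
    s3ComputeLogManager:                        # key names are assumptions; verify against the chart
      bucket: "test-bucket"                     # placeholder bucket
      prefix: "compute-logs"                    # placeholder key prefix
      endpointUrl: "http://minio.example:9000"  # placeholder endpoint for a minio-style store
```

Both dagit and the run pods would need credentials for that store, since one side writes the logs and the other reads them.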

Matt Callaway

05/28/2021, 3:20 PM
I’ve got a fresh dagit pod up, and running the workflow fails on lack of credentials, but I can see the logs again.
I think my only problem at this point is why it can’t get credentials.

johann

05/28/2021, 3:20 PM
The error message? Or the raw logs?

Matt Callaway

05/28/2021, 3:21 PM
The error message is now visible in logs:
Copy code
botocore.exceptions.NoCredentialsError: Unable to locate credentials
I’ll just not expect the “raw logs”.
Is this not sufficient in `values.yaml` to provide the S3 credentials:
Copy code
dagster-user-deployments:
  enabled: true
  deployments:
    - name: "k8s-dagster-lia"
      image:
        repository: "docker.io/dagster/user-code-example"
        tag: latest
        pullPolicy: Always
      dagsterApiGrpcArgs:
        - "-f"
        - "/example_project/example_repo/repo.py"
      port: 3030
      env:
        AWS_ACCESS_KEY_ID: minioadmin
        AWS_SECRET_ACCESS_KEY: minioadmin

johann

05/28/2021, 3:25 PM
I’m looking back at https://dagster.slack.com/archives/CCCR6P2UR/p1611970383144800 as an example of other users using minio
Ah- it’s a bit misleading but those envs won’t be used for the run. Instead you can configure that here https://github.com/dagster-io/dagster/blob/master/helm/dagster/values.yaml#L388-L395
And the configmap can be created under `extraManifests`.
The user-deployments create servers that provide dagit with the metadata about your pipelines. In the default k8s deployment, the actual execution takes place in a separate k8s job, which is why envs set on the user deployment don’t reach the run.

Matt Callaway

05/28/2021, 3:32 PM
It’s not clear to me how an envConfigMap is supposed to look.
Also the docs explicitly say `env` works:
Copy code
To enable Dagster to connect to S3, provide AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables via the env, envConfigMaps, or envSecrets fields under userDeployments in values.yaml
The short snippet in the values.yaml comments shows:
Copy code
envConfigMaps:
         - name: config-map
as compared to the referenced k8s doc that has:
Copy code
apiVersion: v1
kind: ConfigMap
metadata:
  name: special-config
  namespace: default
data:
  SPECIAL_LEVEL: very
  SPECIAL_TYPE: charm
So how should values.yaml look?
Copy code
envConfigMaps:
         - name: config-map
           data:
             AWS_ACCESS_KEY_ID: minioadmin
             AWS_SECRET_ACCESS_KEY: minioadmin
??

johann

05/28/2021, 3:44 PM
That’s an oversight in the docs, thank you for pointing it out

Matt Callaway

05/28/2021, 3:45 PM
What does a complete `envConfigMaps` entry look like?

Varun

05/28/2021, 3:50 PM
Hi @Matt Callaway, you create a Kubernetes configmap as usual.
Copy code
apiVersion: v1
kind: ConfigMap
metadata:
  name: special-config
  namespace: default
data:
  SPECIAL_LEVEL: very
  SPECIAL_TYPE: charm
and then specify its name in the `envConfigMaps` section of `values.yaml` like this.
Copy code
envConfigMaps:
  - name: special-config

johann

05/28/2021, 3:53 PM
And just to clarify the above, the configmap definition can go in `extraManifests` of the values.yaml (so it gets created alongside the rest of the k8s resources) and then the second block goes in
Copy code
runLauncher:
  type: K8sRunLauncher
  config:
    k8sRunLauncher:
      envConfigMaps:
        - name: special-config

Matt Callaway

05/28/2021, 4:01 PM
Thanks @Varun and @johann though I must admit this is not at all clear. The default `values.yaml` says that Config Maps are made from the `env` section:
Copy code
# Additional environment variables to set.
  # A Kubernetes ConfigMap will be created with these environment variables. See:
  # https://kubernetes.io/docs/concepts/configuration/configmap/
  #
  # Example:
  #
  # env:
  #   ENV_ONE: one
  #   ENV_TWO: two
But then @johann says to use `extraManifests`, which I would guess to look like this:
Copy code
dagster-user-deployments:
  enabled: true
  deployments:
    - name: "k8s-dagster-lia"
      image:
        repository: "docker.io/dagster/user-code-example"
        tag: latest
        pullPolicy: Always
      dagsterApiGrpcArgs:
        - "-f"
        - "/example_project/example_repo/repo.py"
      port: 3030
      envConfigMaps:
         - name: aws-config-map

extraManifests:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: aws-config-map
      namespace: dagster
    data:
      AWS_ACCESS_KEY_ID: minioadmin
      AWS_SECRET_ACCESS_KEY: minioadmin
Note that I’ve used a `dagster` namespace that I created with `kubectl`, so I think that’s right, given `--namespace dagster` shows the right services:
Copy code
> kubectl get services --namespace dagster
NAME                          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
dagster-dagit                 ClusterIP   10.106.3.199     <none>        80/TCP     18h
dagster-postgresql            ClusterIP   10.100.82.214    <none>        5432/TCP   18h
dagster-postgresql-headless   ClusterIP   None             <none>        5432/TCP   18h
k8s-dagster-lia               ClusterIP   10.110.102.175   <none>        3030/TCP   18h
But then @johann suggests that I use `runLauncher` too? So then does that mean my example looks like this?
Copy code
dagster-user-deployments:
  enabled: true
  deployments:
    - name: "k8s-dagster-lia"
      image:
        repository: "docker.io/dagster/user-code-example"
        tag: latest
        pullPolicy: Always
      dagsterApiGrpcArgs:
        - "-f"
        - "/example_project/example_repo/repo.py"
      port: 3030
      envConfigMaps:
         - name: aws-config-map

extraManifests:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: aws-config-map
      namespace: dagster
    data:
      AWS_ACCESS_KEY_ID: minioadmin
      AWS_SECRET_ACCESS_KEY: minioadmin

runLauncher:
  type: K8sRunLauncher
  config:
    k8sRunLauncher:
      envConfigMaps:
        - name: aws-config-map
(And I also note that I really should be using `secrets` instead of Config Maps… but I’ll save that for later.)
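A sketch of the Secret-based variant mentioned above, assuming the run launcher accepts `envSecrets` in the same position as `envConfigMaps` (the docs quoted earlier list `envSecrets` alongside `env` and `envConfigMaps`). The Secret name `aws-secret` is hypothetical:

```yaml
extraManifests:
  - apiVersion: v1
    kind: Secret
    metadata:
      name: aws-secret            # hypothetical name
      namespace: dagster
    type: Opaque
    stringData:                   # plain-text input; Kubernetes stores the values base64-encoded
      AWS_ACCESS_KEY_ID: minioadmin
      AWS_SECRET_ACCESS_KEY: minioadmin

runLauncher:
  type: K8sRunLauncher
  config:
    k8sRunLauncher:
      envSecrets:                 # assumed to mirror envConfigMaps
        - name: aws-secret
```

The credentials still sit in plain text in the values file this way; creating the Secret out of band and referencing only its name would avoid that.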
Seems to be getting closer:
Copy code
botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "http://localhost:9000/test-bucket"
This is now probably a network thing, right? As the containers aren’t going to be able to “see” localhost port 9000.
From my command line I can see that minio is working:
Copy code
> aws --endpoint-url http://localhost:9000 s3 ls s3://test-bucket/
2021-05-28 09:12:01         29 date1.txt

johann

05/28/2021, 4:15 PM
You’re very right that isn’t clear, I appreciate you going through it. It’s definitely pointed out a lot of things for us to do already, though if you’d like to sum them up in a bit of feedback or create gh issues that would be excellent.

Matt Callaway

05/28/2021, 4:16 PM
I can follow up with a feedback issue if you think it would be helpful. I think the last thing for me to do here is to somehow expose localhost port 9000 inside the pod. Is there a known solution for that?
Oh maybe I just use `http://host.docker.internal:9000`

johann

05/28/2021, 4:18 PM
Yes- you need to create a service for the pods to reach minio with. An example here https://dagster.slack.com/archives/CCCR6P2UR/p1611970383144800 was that a user ran minio inside k8s and thus just had a normal service to point to it
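For the in-cluster case, a plain Service is enough for the run pods to reach minio. A minimal sketch, assuming the minio pods carry the label `app: minio` and listen on port 9000 (both hypothetical here):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: minio
  namespace: dagster
spec:
  selector:
    app: minio          # hypothetical pod label
  ports:
    - port: 9000        # port the Service exposes
      targetPort: 9000  # port the minio container listens on (assumption)
```

With that in place, the `endpoint_url` in the run config would be `http://minio.dagster.svc.cluster.local:9000` rather than a localhost address.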

Matt Callaway

05/28/2021, 4:22 PM
Success!
Copy code
resources:
  io_manager:
    config:
      s3_bucket: test-bucket
  s3:
    config:
      endpoint_url: http://host.docker.internal:9000
      region_name: us-east-1
solids:
  multiply_the_word:
    config:
      factor: 0
    inputs:
      word: ''
Last question for this thread… This configuration YAML I just posted, should that also go into `values.yaml`?

johann

05/28/2021, 4:23 PM
Nice! I think another way would be to use the minio helm chart so that you don’t have to use the host.docker.internal address, but it’s great that this is working. We should add documentation for getting set up with minio
> should that also go into `values.yaml`?
You should have a separate file (can be named `values.yaml` or otherwise) that stores your overrides of our default values. You specify your file when you do `helm upgrade -f <file>`.

Matt Callaway

05/28/2021, 4:26 PM
Yes I’ve been pasting my values.yaml above, but so far it has not included a `resources:` section.
Does the default run configuration go in the `dagit:` section of `values.yaml`?
If that’s correct, then my full working config looks like this:
Copy code
dagster-user-deployments:
  enabled: true
  deployments:
    - name: "k8s-dagster-lia"
      image:
        repository: "docker.io/dagster/user-code-example"
        tag: latest
        pullPolicy: Always
      dagsterApiGrpcArgs:
        - "-f"
        - "/example_project/example_repo/repo.py"
      port: 3030
      envConfigMaps:
         - name: aws-config-map

extraManifests:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: aws-config-map
      namespace: dagster
    data:
      AWS_ACCESS_KEY_ID: minioadmin
      AWS_SECRET_ACCESS_KEY: minioadmin

runLauncher:
  type: K8sRunLauncher
  config:
    k8sRunLauncher:
      envConfigMaps:
        - name: aws-config-map

dagit:
  resources:
    io_manager:
      config:
        s3_bucket: test-bucket
    s3:
      config:
        endpoint_url: http://host.docker.internal:9000
        region_name: us-east-1

  solids:
    multiply_the_word:
      config:
        factor: 0
      inputs:
        word: ''
Is it correct to include `resources` and `solids` inside the `dagit` section?

johann

05/28/2021, 4:32 PM
Ah sorry I misunderstood. The resources yaml you posted is pipeline run config- it goes either in your pipeline definition (in the python code) as a preset, or you can specify it at run time in the dagit playground

Matt Callaway

05/28/2021, 4:34 PM
Ok. Great to see it working! Thank you for spending so much time with me on this. I will collect my notes and create a github issue suggesting some documentation updates and the creation of a “cookbook” of user stories.