Hey guys I ve done a bit of digging around but haven t yet f dagster #ask-community

Hey guys, I’ve done a bit of digging around but ha...

Jonas De Beukelaer

01/17/2022, 12:29 PM

Hey guys, I’ve done a bit of digging around but haven’t yet found a good way to deal with secrets locally vs in kubernetes. Loading secrets as env vars in the config seems like the best (secure) way, but this has a couple problems with it: 1. If running dagit locally, I need to ensure all env var references are fulfilled even if I only want to run/test one job a. to get around this I would maybe: i. need some scripting to fill all required secrets before dagit launches? ii. Or maybe target only one job at a time with dagit - Is this possible? 2. When running in kubernetes, even though I only run jobs within their own kubernetes job, it seems dagit still requires all env vars to be present before it will startup. a. Why is this the case? I don’t really want to load every single secrets I may need to reference in my jobs into the dagit pod. b. I would like to only reference them from within the kubernetes job’s pod (where I can add a

dagster-k8s/config

tag to attach them) c. also currently it means I would need to reload dagit each time I create a new job requiring new secrets, instead of just reloading the relevant dagster-user-deployment 3. Or, since my secrets currently live in AWS Secrets Manager - could it make sense to simply create an op to pull these as part of the graph? I think this might work well locally too. Hmm writing this out may have helped me find a solution 😄 Still interested to hear the dagster team’s take on this

Anatoly Laskaris

01/17/2022, 1:35 PM

I'm not from dagster team sorry 🙂 But I would suggest to use HashiCorp Vault (https://www.hashicorp.com/products/vault) for secret storage/management. Secrets can be accessed based on tokens - each token has a ttl and policy associated with it. App in K8S, CI/CD system and local run can access secrets in the same way K8S, various CI/CD tools and python, of course, have vault integration.

Alex Service

01/17/2022, 2:15 PM

For myself, I’ve found

direnv

to be sufficient so far. My flow is simply: • create

.envrc

with whatever secrets/paths to secrets I need (

export WHATEVER=/path/to/whatever

) • If using docker, pass env vars through a

docker-compose.yml

so spin-up and clean-up are easy • For deployments, rely on my devops team and put the secrets wherever they tell me lol

Jonas De Beukelaer

01/17/2022, 4:48 PM

@Anatoly Laskaris do you use vault to access your secrets within dagster I take it? Do you get the secrets from vault in purpose built `op`s or do you just run it as part of the code that needs the secret?

Jonas De Beukelaer

01/17/2022, 4:49 PM

@Alex Service thanks for the suggestion, I’ll check this out 🙂

Anatoly Laskaris

01/17/2022, 5:11 PM

@Jonas De Beukelaer not using it with dagster yet. For me personally since developers will not be running code locally, I think I will go with something that gets secrets from Vault and provides to pipeline with env variables. But it can be done with code using hvac just as easily, I think.

👍 1

daniel

01/17/2022, 6:58 PM

Hi Jonas - we've been thinking lately about what kinds of improvements we can make on the secrets management front so this feedback is helpful. One thing I wanted to clarify is what exactly you mean by 'the config' in 'Loading secrets as env vars in the config' - which config exactly are you referring to here / do you have an example? In general Dagit shouldn't need to load your code or your secrets unless they are needed to, say, load your dagster.yaml file (for example, sourcing your postgres password from an environment variable)

Jonas De Beukelaer

01/18/2022, 9:17 AM

@daniel thanks for your reply, look forward to seeing what you guys come up with 🙂 Re: ‘Loading secrets as env vars in the config’ - by using StringSource to reference an environment variable. e.g.

Copy code

ops:
  send_to_endpoint:
    config:
      api_endpoint: 'api.alchera.tech/v1.0'
      api_key:
         env: MY_ENV_VAR

Also there is one inaccuracy in my initial message - in k8s the dagit pod is not the one expecting the env var to be present, it’s the user-deployment grpc server instead (so my point 2.c. is not an issue)

daniel

01/18/2022, 1:51 PM

I see - Yeah, the only pod that should need the environment variable there is the one doing the actual run execution

3 Views

Open in Slack

Previous Next