# announcements
r
Hi, I'm trying to use GCS for intermediate storage in the K8s Celery example. It seems the https://docs.dagster.io/deploying/gcp guide is not working on the K8s Helm deployment (it works locally). Appreciate any help / hint.
n
Do you get a specific error?
r
n
Are you using GKE workload ID?
r
I tried to mount the service account to each deployment via the Google example here: https://cloud.google.com/kubernetes-engine/docs/tutorials/authenticating-to-cloud-platform#kubectl
Unfortunately this was not passed on to the K8s job.
n
You wouldn't want to use that anyway, it's been replaced by WorkloadID
Check if that is enabled on your cluster
This will be kind of awkward since the default helm chart doesn't allow setting a service account for the launched celery pods
r
OK, so the preferred way is to link the (Helm-created) K8s service account to the Google service account, right? I guess then I'll have to link them right after the helm install.
n
Let's rewind, which executor module are you using specifically?
There's dagster_celery, dagster_k8s, and dagster_celery_k8s
r
Here is my Pipeline code.
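(For context, a minimal sketch of the kind of pipeline being discussed, assuming the 0.9-era Dagster pipeline/mode API with GCS intermediate storage from dagster_gcp and the Celery-K8s executor from dagster_celery_k8s; the solid, pipeline, and storage definition names here are illustrative and may need adjusting to your installed versions:)

```python
from dagster import ModeDefinition, pipeline, solid
from dagster_celery_k8s import celery_k8s_job_executor
from dagster_gcp.gcs import gcs_resource, gcs_plus_default_intermediate_storage_defs

# Mode wiring GCS intermediate storage and the Celery-K8s executor together.
celery_k8s_mode = ModeDefinition(
    name="celery_k8s",
    resource_defs={"gcs": gcs_resource},
    intermediate_storage_defs=gcs_plus_default_intermediate_storage_defs,
    executor_defs=[celery_k8s_job_executor],
)

@solid
def say_hello(_):
    # Placeholder solid; the real pipeline logic from the thread is not preserved.
    return "hello"

@pipeline(mode_defs=[celery_k8s_mode])
def example_pipeline():
    say_hello()
```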
n
Okay yeah
So step 1, make sure workload ID is enabled
Step 2, create a service account for your worker jobs that has the annotation for WorkloadID
Step 3, set up a Google IAM service account and policy binding to allow access to the buckets you want
Step 4, add
`.configured({'service_account_name': 'whatever'})`
on the executor definition in your pipeline code (minimal sketch below)
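(A sketch of what that configured executor could look like, assuming celery_k8s_job_executor from dagster_celery_k8s; "step-worker-sa" is a placeholder for whatever Workload Identity-annotated K8s service account was created in steps 2-3:)

```python
from dagster import ModeDefinition
from dagster_celery_k8s import celery_k8s_job_executor

# Pin the launched step pods to the Workload Identity-annotated K8s service
# account. "step-worker-sa" is a placeholder name, not from the thread.
celery_k8s_with_sa = celery_k8s_job_executor.configured(
    {"service_account_name": "step-worker-sa"},
    name="celery_k8s_with_sa",
)

# Swap the configured executor into the pipeline's mode in place of the bare one.
celery_k8s_mode = ModeDefinition(
    name="celery_k8s",
    executor_defs=[celery_k8s_with_sa],
)
```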
r
Thanks a lot. I'll come back once I have done all those steps. Looking forward to getting it finally running 🤩
n
It can be a bit annoying the first time but you get used to the setup. I really need to wrap it into an operator at some point
👍 1
r
Works now 🤩 Thanks very much.