# ask-community
j
I have a nested graph that consists of an op, followed by another graph (let’s call this the “*inner graph*”). The op saves a file to Google Cloud Storage and the first op in the inner graph loads it. This ran on a daily schedule without problems until 2 days ago. Now the first op of the inner graph (the one that reads the file from Google Cloud Storage) fails with a 403 error:
`Caller does not have storage.objects.get access to the Google Cloud Storage object.`
Has anyone else had problems with gcloud & Dagster lately?
s
Sounds like maybe some of your credentials for GCS have gone stale? If you haven’t updated Dagster in the last few days, doesn’t seem like a Dagster-specific problem.
j
Is there a way to specify in the Dagster Helm values file which service account the `K8sRunLauncher` should use?
s
I’m going to need someone more knowledgeable to answer that: cc @rex @daniel @johann
d
You can set a service account for the whole chart here, would that be an option? https://github.com/dagster-io/dagster/blob/master/helm/dagster/values.yaml#L13-L15
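Something like this near the top of the values file, as a rough sketch (the service account name is a placeholder for an existing Kubernetes service account in your namespace):
```yaml
# Chart-wide Kubernetes service account for Dagster pods, per the values.yaml linked above.
global:
  serviceAccountName: "my-k8s-sa"  # placeholder; use your own service account name
```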
j
I just updated to 0.14.16. I tried setting `serviceAccountName` in the values file (the service name is just the bit before the @ in the service account email address, right?). I also tried setting `includeConfigInLaunchedRuns: true` in the `dagster-user-deployments` section of the values file. Everything I tried results in the `dagster-user-deployments` pod having the correct service account and the pod that runs my job having a random different one (`PROJECT_ID.svc.id.goog`).
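Roughly what I have in the values file right now (names are placeholders; I’m going from the 0.14.x chart layout, which nests an `enabled` flag under `includeConfigInLaunchedRuns`):
```yaml
global:
  serviceAccountName: "my-k8s-sa"      # placeholder for what I entered here

dagster-user-deployments:
  enabled: true
  deployments:
    - name: "my-user-code"             # placeholder deployment
      # ...image, dagsterApiGrpcArgs, port, etc. left as before...
      includeConfigInLaunchedRuns:
        enabled: true
```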
The strange thing is that other jobs that are just running on an hourly schedule have the correct service account when I run `gcloud auth list`. I have no idea why this particular job has a different one 🤯
One difference I can see is that the job that is failing has a toleration set, which means it’s running in a different node pool.
But that node pool is using the same service account and has the same storage permissions according to the gcloud console.
I found something else though: previously I entered the bit before the @ of the service account email address under `serviceAccountName`. But when I checked the node pool, I saw that this service account actually appears with the name `default` in the gcloud console. Then I checked `kubectl get serviceaccount` and saw that there is indeed a service account named `default` in there. So I changed the values file to `serviceAccountName: default`, but I’m getting this error:
```
Error: rendered manifests contain a resource that already exists. Unable to continue with install: ServiceAccount "default" in namespace "default" exists and cannot be imported into the current release
```
j
> the service name is just the bit before the @ in the service account email address right?
I’m not sure, I’m not as familiar with GCP.
That error is because we try to create a service account with the name you provide. It’s also possible to use an existing one (`default` in your case), you just need to make sure it has adequate permissions.
This is at the bottom of `values.yaml`:
```yaml
serviceAccount:
  create: true
```
These are the roles we attach to the service account if we create it: https://github.com/dagster-io/dagster/blob/master/helm/dagster/templates/role.yaml
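One more thought, in case that node pool has GKE Workload Identity enabled (the `PROJECT_ID.svc.id.goog` identity you saw suggests it might): the Kubernetes service account would also need the Workload Identity annotation binding it to a GCP service account with storage access. A rough sketch, assuming the chart’s `serviceAccount` block passes annotations through; the GCP service account email is a placeholder:
```yaml
serviceAccount:
  create: true
  annotations:
    # GKE Workload Identity binding; replace with a GCP service account that has storage.objects.get
    iam.gke.io/gcp-service-account: my-gcs-sa@PROJECT_ID.iam.gserviceaccount.com
```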
j
I got it to work. I kept `serviceAccountName: default` and added
```yaml
serviceAccount:
  create: false
```
to the yaml file. So now it schedules this job on a pod with the correct service account. Still not 100% sure what changed or why all of this is needed all of a sudden, but hey.. it works 🙂 thanks!
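For completeness, the combination that worked looks roughly like this (assuming `serviceAccountName` sits under the chart’s `global` section, as in the values file linked earlier):
```yaml
global:
  serviceAccountName: "default"   # existing Kubernetes ServiceAccount in the namespace

serviceAccount:
  create: false                   # reuse the existing account instead of creating a new one
```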