#dagster-support

Jason Vorenkamp

07/01/2022, 8:22 PM
👋 Hello! I'm working on a Dagster integration for my company. I'm running an EMR PySpark job. The step launcher uploads a zip of the project to the staging bucket before doing a spark-submit. I've got the AWS roles/policies all set up correctly to allow this upload to the staging bucket, but the upload is failing with an Access Denied error. I'm 90% sure the issue is that the upload command is not specifying server-side encryption. CLI example:
$ aws s3 cp test.txt s3://staging-bucket/test.txt
upload failed: ./test.txt to s3://staging-bucket/test.txt An error occurred (AccessDenied) when calling the PutObject operation: Access Denied

$ aws s3 cp test.txt s3://staging-bucket/test.txt --sse AES256
upload: ./test.txt to s3://staging-bucket/test.txt
I was hoping there would be some way to pass in this configuration, but it looks like additional arguments are not allowed: https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-aws/dagster_aws/emr/pyspark_step_launcher.py#L274 Is there some other way to configure the upload so that it specifies server-side encryption?

owen

07/01/2022, 8:50 PM
Hi @Jason Vorenkamp! It looks like your diagnosis is correct (and thanks for digging into it). I just filed an issue for this here: https://github.com/dagster-io/dagster/issues/8716. One option until that gets addressed would be subclassing the existing EmrPySparkStepLauncher and overriding the _post_artifacts method. Also, if you're interested in contributing a change for this behavior, we'd be happy to review that for you!
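As a rough sketch of one piece such an override could use: boto3's S3 upload_file accepts an ExtraArgs dictionary, and {"ServerSideEncryption": "AES256"} is the equivalent of the CLI's --sse AES256. The wrapper below is hypothetical (the name SseUploadClient is not part of dagster-aws or boto3). It forwards calls to any boto3-style S3 client and adds the encryption setting to every upload.

```python
from typing import Any, Dict, Optional


class SseUploadClient:
    """Hypothetical helper, not part of dagster-aws or boto3.

    Wraps a boto3-style S3 client and adds server-side encryption
    to every upload_file call.
    """

    def __init__(self, client: Any, sse: str = "AES256") -> None:
        self._client = client
        self._sse = sse

    def upload_file(
        self,
        Filename: str,
        Bucket: str,
        Key: str,
        ExtraArgs: Optional[Dict[str, Any]] = None,
        **kwargs: Any,
    ) -> Any:
        # Start from the default SSE setting; values the caller passes
        # explicitly in ExtraArgs take precedence.
        extra: Dict[str, Any] = {"ServerSideEncryption": self._sse}
        extra.update(ExtraArgs or {})
        return self._client.upload_file(
            Filename=Filename, Bucket=Bucket, Key=Key, ExtraArgs=extra, **kwargs
        )

    def __getattr__(self, name: str) -> Any:
        # Any other attribute or method is forwarded to the wrapped client.
        return getattr(self._client, name)
```

With a real client this would be used as SseUploadClient(boto3.client("s3")). An overridden _post_artifacts in a subclass of the step launcher would then need to route its uploads through a client like this. How that method creates and uses its S3 client depends on the installed dagster-aws version, so check the linked source before copying any of its body.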

Jason Vorenkamp

07/01/2022, 9:04 PM
Thanks Owen! I'm not a Python developer myself but I'll see if I can't get something decent going...