# ask-community
Jason Vorenkamp:
👋 Hello! I'm working on a Dagster integration for my company. I'm running an EMR PySpark job. The step launcher uploads a zip of the project to the staging bucket before doing a spark-submit. I've got the AWS roles/policies all set up correctly to allow this upload to the staging bucket, but the upload is failing with an access-denied exception. I'm 90% sure the issue is that the upload command is not specifying server-side encryption. CLI example:
```shell
$ aws s3 cp test.txt s3://staging-bucket/test.txt
upload failed: ./test.txt to s3://staging-bucket/test.txt An error occurred (AccessDenied) when calling the PutObject operation: Access Denied

$ aws s3 cp test.txt s3://staging-bucket/test.txt --sse AES256
upload: ./test.txt to s3://staging-bucket/test.txt
```
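For context, the boto3 equivalent of the working CLI call would be something like this (same placeholder bucket/file names as above):

```python
import boto3

s3 = boto3.client("s3")

# ExtraArgs forwards ServerSideEncryption to the underlying PutObject request,
# mirroring `aws s3 cp ... --sse AES256`.
s3.upload_file(
    "test.txt",
    "staging-bucket",
    "test.txt",
    ExtraArgs={"ServerSideEncryption": "AES256"},
)
```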
I was hoping there would be some way to pass in this configuration, but it looks like additional arguments are not allowed: https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-aws/dagster_aws/emr/pyspark_step_launcher.py#L274 Is there some other way to add configuration so the upload specifies server-side encryption?
Owen:
Hi @Jason Vorenkamp! It looks like your diagnosis is correct (and thanks for digging into it). I just filed an issue for this here: https://github.com/dagster-io/dagster/issues/8716. One option until that gets addressed would be subclassing the existing `EmrPySparkStepLauncher` and overriding the `_post_artifacts` method. Also, if you're interested in contributing a change for this behavior, we'd be happy to review it!
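A rough sketch of what that subclass could look like, assuming the attribute names (`region_name`, `staging_bucket`) the current launcher appears to use; this isn't drop-in code, since the real `_post_artifacts` also builds the project zip and uploads several artifacts, so you'd copy its body from dagster-aws and route each upload through an SSE-aware helper:

```python
import boto3

from dagster_aws.emr.pyspark_step_launcher import EmrPySparkStepLauncher


class SseEmrPySparkStepLauncher(EmrPySparkStepLauncher):
    """Sketch: staging uploads with AES256 server-side encryption."""

    def _upload_file_to_s3(self, local_path, key):
        # Assumed attributes: region_name and staging_bucket come from the
        # parent launcher's config -- verify against your dagster-aws version.
        s3 = boto3.client("s3", region_name=self.region_name)
        s3.upload_file(
            Filename=local_path,
            Bucket=self.staging_bucket,
            Key=key,
            # Same ExtraArgs as the boto3 snippet above.
            ExtraArgs={"ServerSideEncryption": "AES256"},
        )

    def _post_artifacts(self, log, step_run_ref, run_id, step_key):
        # Paste the upstream _post_artifacts body here and replace its plain
        # s3.upload_file(...) calls with self._upload_file_to_s3(...). Calling
        # super() alone wouldn't help, because the parent hard-codes
        # unencrypted uploads.
        raise NotImplementedError("copy the upstream method body here")
```

You'd then supply this subclass through your own resource in place of the stock `emr_pyspark_step_launcher` resource.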
Jason Vorenkamp:
Thanks Owen! I'm not a Python developer myself but I'll see if I can't get something decent going...