Daniel Galea

11/10/2022, 1:24 PM
Hi 👋 I am looking into the Spark integration documentation so that I can launch Spark jobs on AWS EMR. Do I understand correctly that there is no op which will launch an EMR cluster and then run a step on it, and that instead the EMR cluster should already be running so that Dagster can add/launch a step on that cluster? I am looking for something similar to Airflow's EmrCreateJobFlowOperator.
I guess I am answering my own question here but looking at the code here, we can call EmrJobRunner().run_job_flow(). Is this the correct way to use this class and method? 🙂
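A rough sketch of what calling this from an op might look like. The cluster config below follows boto3's EMR `RunJobFlow` parameter shape, which is what `EmrJobRunner.run_job_flow` expects to forward to EMR; the cluster name, EMR release label, instance types/counts, and IAM role names are all placeholder assumptions, not values from this thread. The actual `EmrJobRunner` call is left commented out since it needs live AWS credentials.

```python
# Hypothetical sketch based on the thread above: build a boto3-style
# RunJobFlow config and hand it to dagster_aws's EmrJobRunner.
# All concrete values (names, instance types, roles) are placeholders.

def make_cluster_config(name="my-spark-cluster"):
    """Build a boto3-style RunJobFlow config for a small Spark cluster."""
    return {
        "Name": name,
        "ReleaseLabel": "emr-6.8.0",  # placeholder EMR release
        "Applications": [{"Name": "Spark"}],
        "Instances": {
            "InstanceGroups": [
                {"Name": "master", "InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "core", "InstanceRole": "CORE",
                 "InstanceType": "m5.xlarge", "InstanceCount": 2},
            ],
            # Keep the cluster alive so Dagster can add steps to it later.
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",   # placeholder IAM roles
        "ServiceRole": "EMR_DefaultRole",
    }

# Inside a Dagster op, the config would then be passed along, e.g.:
#
#   from dagster_aws.emr import EmrJobRunner
#   runner = EmrJobRunner(region="us-east-1")
#   cluster_id = runner.run_job_flow(context.log, make_cluster_config())
#
# (commented out here because it requires AWS credentials and a real region)
```

The `KeepJobFlowAliveWhenNoSteps` flag matters for the workflow discussed above: with it set, the cluster stays up after launch so subsequent Dagster-submitted steps can run on it rather than the cluster terminating immediately.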


11/10/2022, 9:05 PM
Hey Daniel - good code-spelunking. Yes, that is correct!

Daniel Galea

11/11/2022, 10:07 AM
Great, thanks! 🙂