#announcements

Ben Sully

05/14/2020, 4:36 PM
hey, i'm looking to make a start on https://github.com/dagster-io/dagster/issues/2458 and basing it off the dagster_aws.emr subpackage. i've started with the types module, but it looks like the dagster_aws.emr.types module uses both regular python enums and dagster enums - why is that? have i missed something in the docs?
oh, it looks like dagster enums are used for inputs and pyenums everywhere else

sandy

05/14/2020, 4:43 PM
hey Ben - are you planning to launch Databricks clusters as part of your jobs, or submit to existing Databricks clusters?
I believe emr/types mostly concerns configuring cluster launch, so if you're focused on the latter, you might not need to build an equivalent to it (the emr_pyspark_step_launcher doesn't currently rely on those types)

Ben Sully

05/14/2020, 4:46 PM
it'll probably be launching new clusters, to be honest; databricks' pricing depends on whether you submit jobs to a new or existing cluster (https://docs.databricks.com/dev-tools/api/latest/jobs.html#request-structure)
the types are fairly well defined by the API docs at least so it should be fairly simple to translate to dagster types

sandy

05/14/2020, 4:47 PM
"We suggest running jobs on new clusters for greater reliability." - interesting

Ben Sully

05/14/2020, 5:26 PM
yeah it's a bit counterintuitive but they at least have a single API endpoint (Runs Submit) to run a job on a new cluster, so it might be simpler than the EMR launcher anyways
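for illustration, a rough sketch of what a Runs Submit request body could look like. the field names follow the Databricks Jobs API docs linked above; the helper name build_runs_submit_payload, the file path, and the cluster values are hypothetical placeholders, and this only builds the JSON body rather than sending it:

```python
import json
from typing import Any, Dict, Optional


def build_runs_submit_payload(
    run_name: str,
    python_file: str,
    new_cluster: Optional[Dict[str, Any]] = None,
    existing_cluster_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a request body for the Runs Submit endpoint (hypothetical helper).

    Exactly one of new_cluster or existing_cluster_id must be given, since a
    run targets either a freshly launched cluster or an existing one.
    """
    if (new_cluster is None) == (existing_cluster_id is None):
        raise ValueError("specify exactly one of new_cluster or existing_cluster_id")

    payload: Dict[str, Any] = {
        "run_name": run_name,
        "spark_python_task": {"python_file": python_file},
    }
    if new_cluster is not None:
        payload["new_cluster"] = new_cluster
    else:
        payload["existing_cluster_id"] = existing_cluster_id
    return payload


# Placeholder values, assumed for the example only.
payload = build_runs_submit_payload(
    run_name="dagster-step",
    python_file="dbfs:/hypothetical/path/step_main.py",
    new_cluster={
        "spark_version": "6.5.x-scala2.11",
        "node_type_id": "i3.xlarge",
        "num_workers": 2,
    },
)

# The body would be POSTed to <workspace-url>/api/2.0/jobs/runs/submit.
print(json.dumps(payload, indent=2))
```

the cluster spec and the task travel in one request, so there is no separate "launch cluster, then add step" sequence to manage on the client side.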

sandy

05/14/2020, 6:04 PM
that sounds much simpler