Are there any concrete examples of using `dagster_databricks`? #ask-community

Ben Andersen-Waine

10/18/2022, 11:39 AM

Are there any concrete examples of using `dagster_databricks` and the `databricks_pyspark_step_launcher`? I can see one for EMR: https://github.com/dagster-io/dagster/blob/1.0.13/examples/with_pyspark_emr/with_pyspark_emr/repository.py
It would be nice to see an example of the config that needs to be passed to:

    "pyspark_step_launcher": databricks_pyspark_step_launcher.configured(
        {
            # ???
        }
    ),

Ben Andersen-Waine

10/18/2022, 11:47 AM

This feels a bit like hard work 😄

chris

10/18/2022, 6:55 PM

Hey Ben - apologies, this is definitely one of the less-documented areas of the codebase. Are there specific questions you have about the config schema here?

Ben Andersen-Waine

10/19/2022, 8:33 AM

I was actually able to get it working in the end and, to be fair, the hints on which API calls to use to get valid values to plug into some of the Databricks-specific config params were really helpful. One small piece of feedback: the param `secrets_to_env_variables` is listed as optional but causes an error unless you pass at least an empty list. A request for the future would be a minimal worked example of launching a run on a new job cluster and on an existing persistent cluster.
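For readers looking for a starting point, here is a rough sketch of what the two cluster variants might look like as config dicts. It is not an official example: the key names (`databricks_host`, `databricks_token`, `run_config`, `cluster` with `new` / `existing`, `size`, `nodes`) are assumed from the dagster-databricks 1.0.x config schema and should be checked against the installed version. The helper `step_launcher_config`, the host, the cluster ID, the runtime version, and the node type are all hypothetical placeholders. Storage credentials, the staging location, and the local package path are left out and would still need to be filled in.

```python
import os

# Assumed shape for a run on a fresh job cluster; all values are placeholders.
NEW_JOB_CLUSTER = {
    "new": {
        "size": {"num_workers": 1},
        "spark_version": "10.4.x-scala2.12",
        "nodes": {"node_types": {"node_type_id": "i3.xlarge"}},
    }
}

# Assumed shape for a run on an existing persistent cluster (hypothetical ID).
EXISTING_CLUSTER = {"existing": "0000-000000-example0"}


def step_launcher_config(cluster: dict) -> dict:
    """Build a partial config dict for the Databricks step launcher.

    Hypothetical helper: key names are assumptions, not a verified schema.
    """
    return {
        "databricks_host": "dbc-example.cloud.databricks.com",  # placeholder
        "databricks_token": os.environ.get("DATABRICKS_TOKEN", "<token>"),
        "run_config": {
            "run_name": "dagster_step",  # placeholder run name
            "cluster": cluster,
        },
        # Listed as optional, but per this thread an error is raised
        # unless at least an empty list is passed.
        "secrets_to_env_variables": [],
    }


new_cluster_cfg = step_launcher_config(NEW_JOB_CLUSTER)
existing_cluster_cfg = step_launcher_config(EXISTING_CLUSTER)
```

Either dict would then go where the `# ???` sits in the original question, for example `databricks_pyspark_step_launcher.configured(new_cluster_cfg)`, once the omitted fields have been added.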

Ben Andersen-Waine

10/19/2022, 8:34 AM

I used this to get the gist: https://gist.github.com/klesouza/61267b9a38effe0f5baea894393c98e6
I would like to see something like it in the demo repo, alongside the EMR example: https://github.com/dagster-io/dagster/blob/1.0.13/examples/with_pyspark_emr/with_pyspark_emr/repository.py

Ben Andersen-Waine

10/19/2022, 8:36 AM

All that aside, I'm now running Databricks jobs in AWS from my workstation. Moving between a testing loop and then actually running a job on sample data is a workflow game changer. Thanks Dagster team 😄

chris

10/19/2022, 4:53 PM

This is excellent feedback, and I really appreciate you linking all this information. Glad it all worked out 🙂
