Are there any concrete examples of using `dagster_...
# ask-community
b
Are there any concrete examples of using `dagster_databricks` and the `databricks_pyspark_step_launcher`? I can see one for EMR (https://github.com/dagster-io/dagster/blob/1.0.13/examples/with_pyspark_emr/with_pyspark_emr/repository.py), but it would be nice to see an example of the config required to pass to:
"pyspark_step_launcher": databricks_pyspark_step_launcher.configured(
    {
        # ???
    }
),
This feels a bit like hard work 😄
c
Hey Ben - apologies, this is definitely one of the less-documented areas of the codebase. Are there specific questions you have about the config schema here?
b
I was actually able to get it working in the end and, to be fair, the hints on which API calls to use to get valid values to plug in to some of the Databricks-specific config params were really helpful. One small piece of feedback: the param `secrets_to_env_variables` is listed as optional but causes an error unless you pass at least an empty list. A request for the future would be a minimal worked example of producing a run on a new job cluster and on an existing persistent cluster.
All that aside, I'm now running Databricks jobs in AWS from my workstation. Moving between a testing loop and then actually running a job on sample data is a workflow game changer. Thanks Dagster team 😄
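For anyone landing here later, a rough sketch of the shape such a config might take, covering both the new job cluster and the existing cluster case. This is not a validated config: the key names (`run_config`, `cluster`, `new`, `existing`, `databricks_host`, `databricks_token`, `staging_prefix`, `wait_for_logs`) are assumptions recalled from the `dagster_databricks` config schema and should be checked against the installed version. The helper name `step_launcher_config` and all placeholder values are hypothetical. Keys such as the local package path and storage credentials are left out because their names may differ between releases.

```python
from typing import Any, Dict, Optional


def step_launcher_config(existing_cluster_id: Optional[str] = None) -> Dict[str, Any]:
    """Build a config dict for databricks_pyspark_step_launcher (hypothetical helper).

    All key names below are assumptions about the dagster_databricks schema;
    verify them against the version you have installed.
    """
    if existing_cluster_id is not None:
        # Run on an existing persistent cluster, identified by its cluster ID.
        cluster: Dict[str, Any] = {"existing": existing_cluster_id}
    else:
        # Run on a new job cluster; the values here are placeholders.
        cluster = {
            "new": {
                "size": {"num_workers": 1},
                "spark_version": "<spark-version>",
                "nodes": {"node_types": {"node_type_id": "<node-type-id>"}},
            }
        }

    return {
        "run_config": {"run_name": "dagster-step", "cluster": cluster},
        # Read the host and token from environment variables rather than hardcoding them.
        "databricks_host": {"env": "DATABRICKS_HOST"},
        "databricks_token": {"env": "DATABRICKS_TOKEN"},
        # Listed as optional, but pass at least an empty list to avoid the error
        # described above.
        "secrets_to_env_variables": [],
        "staging_prefix": "/dagster_staging",
        "wait_for_logs": False,
    }


new_cluster_cfg = step_launcher_config()
existing_cluster_cfg = step_launcher_config("<existing-cluster-id>")
```

Either dict would then be passed to `databricks_pyspark_step_launcher.configured(...)` in place of the `# ???` in the snippet from the original question.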
c
This is excellent feedback, and I really appreciate you linking all this information. Glad it all worked out 🙂