Ben Sully
05/18/2020, 2:41 PM

```python
@resource(
    {
        'databricks_host': Field(
            StringSource,
            is_required=True,
            description='Databricks host, e.g. uksouth.azuredatabricks.com',
        ),
        'databricks_token': Field(
            StringSource, is_required=True, description='Databricks access token',
        ),
    }
)
def databricks_pyspark_step_launcher(context):
    ...
```
then I use something like this when running it:

```yaml
resources:
  pyspark_step_launcher:
    config:
      databricks_host: uksouth.azuredatabricks.net
      databricks_token:
        env: DATABRICKS_TOKEN
```
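For context, here is a minimal sketch (not Dagster's actual implementation, just an illustration of the mechanism) of how an `env:` indirection in a config dict can be resolved against the process environment; `resolve_env_refs` is a hypothetical helper name:

```python
import os


def resolve_env_refs(config):
    """Recursively replace {'env': 'VAR_NAME'} markers with the value of that
    environment variable. Hypothetical helper, for illustration only."""
    if isinstance(config, dict):
        if set(config) == {'env'}:
            # A bare {'env': ...} dict is an env-var reference: look it up.
            return os.environ[config['env']]
        return {key: resolve_env_refs(value) for key, value in config.items()}
    if isinstance(config, list):
        return [resolve_env_refs(item) for item in config]
    return config


# Resolving the run config above requires DATABRICKS_TOKEN to be set in
# the launching process's environment.
os.environ.setdefault('DATABRICKS_TOKEN', 'dummy-token')
resolved = resolve_env_refs({
    'resources': {
        'pyspark_step_launcher': {
            'config': {
                'databricks_host': 'uksouth.azuredatabricks.net',
                'databricks_token': {'env': 'DATABRICKS_TOKEN'},
            }
        }
    }
})
```

This is exactly why the remote step fails: the resolution happens wherever the config is evaluated, and on the Databricks side `DATABRICKS_TOKEN` is not in the environment.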
which launches the job just fine, but when it gets to executing the step remotely the `DATABRICKS_TOKEN` env var isn't present (and doesn't need to be), so the step fails. Is there an idiomatic solution to this? Maybe passing credentials in some other way, à la boto?

sandy
05/18/2020, 3:55 PM
In `step_context_to_step_run_ref`, instead of passing the `environment_dict` directly, create a copy of it that has "databricks_token" filled in.

Ben Sully
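That suggestion can be sketched roughly like this. This is a hypothetical helper built on plain dicts, not Dagster's API; it assumes the `environment_dict` has the same shape as the run config above:

```python
import copy
import os


def inline_databricks_token(environment_dict):
    """Return a deep copy of the run's environment_dict with the
    databricks_token env-var indirection replaced by the literal token value,
    so the remote step never needs DATABRICKS_TOKEN in its own environment.
    Hypothetical sketch of the suggestion above, not Dagster's API."""
    env_copy = copy.deepcopy(environment_dict)
    config = env_copy['resources']['pyspark_step_launcher']['config']
    token = config['databricks_token']
    if isinstance(token, dict) and 'env' in token:
        # Resolve the env var locally, before shipping the config remotely.
        config['databricks_token'] = os.environ[token['env']]
    return env_copy
```

The deep copy matters: mutating the original `environment_dict` in place would leak the resolved secret back into the local run's config object.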
05/18/2020, 4:09 PM
`DATABRICKS_TOKEN` should be easy enough, not sure why I didn't think of that 😅