# announcements
b
i'm having a bit more fun with external step launchers today 🙂 my current issue is that i need some secrets as part of config in order to launch the step, but they're not necessary when it's running remotely. for example, my resource config looks something like...
```python
from dagster import Field, StringSource, resource

@resource(
    {
        'databricks_host': Field(
            StringSource,
            is_required=True,
            description='Databricks host, e.g. uksouth.azuredatabricks.com',
        ),
        'databricks_token': Field(
            StringSource, is_required=True, description='Databricks access token',
        ),
    }
)
def databricks_pyspark_step_launcher(context):
    ...
```
then i use something like this when running it:
```yaml
resources:
  pyspark_step_launcher:
    config:
      databricks_host: uksouth.azuredatabricks.net
      databricks_token:
        env: DATABRICKS_TOKEN
```
which launches the job just fine, but when it gets to executing the step remotely the `DATABRICKS_TOKEN` env var isn't present (and doesn't need to be), so the step fails. is there an idiomatic solution to this? maybe passing credentials in some other way à la boto?
s
hey Ben - I don't believe we yet have idioms on this. would it be difficult to supply some value for the `DATABRICKS_TOKEN` env var in the remote environment? it could just be the empty string. another solution, perhaps not the most elegant, would be to fill in this config in the step launcher. i.e. when constructing the `StepRunRef` in `step_context_to_step_run_ref`, instead of passing the `environment_dict` directly, create a copy of it that has `databricks_token` filled in
b
supplying an empty value for `DATABRICKS_TOKEN` should be easy enough, not sure why i didn't think of that 😅
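e.g. something like this early in the remote process, assuming nothing on the cluster actually reads the token:

```python
import os

# minimal sketch: give the StringSource env lookup something to resolve
# on the remote cluster, even though the value is never used there
os.environ.setdefault("DATABRICKS_TOKEN", "")
```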