# announcements
s
Hi all, do you know by any chance if the env notation is not supported for pyspark? Somehow it doesn’t pick up my env vars properly. I tried several times. If I replace the env references with the literal values, it works. Or am I doing something wrong? 🤔
a
edit: Sandy’s explanation below is likely the issue
Somehow it doesn’t pick up my envs properly
What exactly are you observing? One guess would be that the env vars are not available wherever the final step execution process is occurring, which can be a different machine from the one where the dagit server and the run coordination processes are running
s
For the env notation to work, the config schema needs to use a `StringSource` type on each of the fields that may be parameterized with envs. I believe that the fields you're pointing to are defined "permissively", i.e. we don't define the individual config fields, we just allow users to define their own. Because of that, we don't have the opportunity to make these fields `StringSource`s. Unless I'm missing something, unfortunately I don't see an easy way to get around this without making some changes to the config system.
One potential way to address it in the dagster Spark package would be to explicitly include these configs in the config schema
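A rough sketch of what "explicitly include these configs" could look like, assuming the standard dagster config API (`Field`, `Shape`, `StringSource`); the specific Spark keys shown are illustrative, not an exhaustive or confirmed list:

```python
from dagster import Field, Shape, StringSource

# Hypothetical: declare individual Spark conf keys explicitly instead of
# permissively, so each field is StringSource-typed and therefore
# accepts the {"env": "VAR_NAME"} notation.
spark_config_schema = Shape(
    {
        "spark.driver.extraJavaOptions": Field(StringSource, is_required=False),
        "spark.executor.extraJavaOptions": Field(StringSource, is_required=False),
    }
)
```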
a
oh good catch Sandy - I didn’t consider that the schema could be `Permissive`
s
Ah OK I see. Thanks for explaining, guys. So if I used a predefined field such as `extraJavaOptions`, that would work because it’s a `StringSource`? Meaning I would add these parameters here for my deployment (although this file is auto-generated, but I guess I could hard-code them?). Or otherwise, would you want to add them there?
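For context, this is roughly what the run config would look like once the field is `StringSource`-typed; the resource and key names here (`pyspark`, `spark_conf`, the env var name) are illustrative assumptions about the deployment, not confirmed values from the thread:

```python
# Hypothetical run config using env notation. This only resolves if the
# targeted field ("spark.driver.extraJavaOptions") is StringSource-typed
# in the config schema, per Sandy's explanation above.
run_config = {
    "resources": {
        "pyspark": {
            "config": {
                "spark_conf": {
                    "spark.driver.extraJavaOptions": {"env": "SPARK_JAVA_OPTS"},
                }
            }
        }
    }
}
```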
s
Adding them to parse_spark_configs.py would be the right approach, because configs_spark.py is auto-generated from parse_spark_configs.py.
s
but I can’t do that locally in my environment to get around this, I’d need to submit a PR, correct?
s
alas, that is correct. you could alternatively copy that code into your own environment