How can I make an optional op config schema field?
# ask-community
a
How can I make an optional op config schema field?
🤖 1
z
You can use `is_required=False` or make it `Noneable`:
Copy code
from dagster import Field, Noneable, op, OpExecutionContext

@op(
    config_schema={
        "optional_field": Field(str, is_required=False),
        "noneable_field": Field(Noneable(str), is_required=False, default_value=None),
    }
)
def something(context: OpExecutionContext):
    context.op_config["optional_field"]  # raises KeyError if "optional_field" is not in the run config
    assert context.op_config["noneable_field"] is None
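and fwiw, a quick sketch of what supplying that config at execution time could look like (the job name and value here are just placeholders):
Copy code
from dagster import job

@job
def something_job():
    something()

if __name__ == "__main__":
    # "optional_field" is supplied here; "noneable_field" falls back to its default of None
    result = something_job.execute_in_process(
        run_config={"ops": {"something": {"config": {"optional_field": "hello"}}}}
    )
    assert result.success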
a
Cool thanks! Didn't know I could do fields. Was just using standard typing and ended up settling on `Noneable`.
j
if you’re using the newer Pythonic/Pydantic config system, you would do it like this
Copy code
from pydantic import Field
from dagster import Config
from typing import Optional

class MyConfig(Config):
    optional_field: Optional[str] = Field(default=None)
note that here `Field` is imported from `pydantic`, not `dagster`
D 1
a
Cool. I have never used Pydantic. Is that the preferred way of making standard configs?
j
Pydantic config was fully released in 1.3 and is now our recommended approach, but the `config_schema` dictionaries are still fully supported, so if that works better for your use case feel free to use it!
What Zach wrote will totally work, just wanted to include the Pydantic method since there are two ways you could do this.
fwiw, with Pydantic config you provide it to the asset like this:
Copy code
from pydantic import Field
from dagster import Config, asset
from typing import Optional

class MyConfig(Config):
    optional_field: Optional[str] = Field(default=None)

@asset
def my_asset(config: MyConfig):
    if config.optional_field is not None:
        ...
daggy love 2
❤️ 1
D 1
🌈 1
🤖 1
a
Is that a keyword arg? If we wanted the context too, does order matter?
j
yeah, to pipe the config through correctly, the parameter has to be named `config` and have the right type annotation. You can use the context too!
Copy code
@asset
def my_asset(context, config: MyConfig):
    if config.optional_field is not None:
        ...
would be totally fine. I don’t think the order matters (worth verifying with a small sample asset though)
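fwiw, a quick way to verify is a tiny throwaway asset materialized in-process. A rough sketch (the asset name and config value are just placeholders):
Copy code
from typing import Optional

from pydantic import Field
from dagster import Config, asset, materialize

class SampleConfig(Config):
    optional_field: Optional[str] = Field(default=None)

@asset
def sample_asset(context, config: SampleConfig):
    context.log.info(f"optional_field={config.optional_field}")

if __name__ == "__main__":
    # config for an asset is keyed under the asset's op name in run config
    result = materialize(
        [sample_asset],
        run_config={"ops": {"sample_asset": {"config": {"optional_field": "foo"}}}},
    )
    assert result.success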
a
Thanks! Will take a look.
@jamie how would this look if I had this config for an op and wanted a job to pass the config through to the op?
j
If you want to specify config at execution time, this is the relevant docs section https://docs.dagster.io/concepts/configuration/config-schema#specifying-runtime-configuration. If you want to hard-code some configuration you can do something like this
Copy code
from pydantic import Field
from dagster import Config, job, op, RunConfig
from typing import Optional

class MyConfig(Config):
    optional_field: Optional[str] = Field(default=None)

@op
def my_op(config: MyConfig):
    if config.optional_field is not None:
        ...

@job(
    config=RunConfig(ops={"my_op": MyConfig(optional_field="foo")})
)
def my_job():
    my_op()
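and if you want to sanity-check that hard-coded config, executing the job in-process (building on the snippet above) should show `my_op` picking up the value:
Copy code
if __name__ == "__main__":
    # no run_config passed here, so the job's default RunConfig is used
    result = my_job.execute_in_process()
    assert result.success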
a
If I don't know the run config values and want to use dagit to execute, do I need to specify the config, or just pass through the config in the launchpad?
j
you can just specify the config in the launchpad https://docs.dagster.io/concepts/configuration/config-schema#dagit
a
got it, thanks. I wasn't sure if I also had to specify in the code
I am trying to materialize an asset from an op in a job. That asset has a config associated with it, but when I try reloading my code location I get a TypeError: `asset_1() missing 1 required positional argument: 'config'`. I am using the `materialize()` function in my op.
j
`materialize` should generally just be used for testing assets (i.e. in unit tests or integration tests) and not called from within ops. What is the larger goal of materializing the asset within the op? There is likely another way to accomplish it.
a
We currently use SOLR as a cache that we write to from Dagster. In order to update its schema, I originally created a job with ops to complete the schema updates. This leaves residual objects in our SOLR cache that can be used as a backup. Eventually, after an independent data review, I want to clean up those residual objects. I thought Dynamic Partitions would be good for this, because I can clean up by partition_key with a downstream asset or job while keeping those update objects separated. The way I create the partition_key is to use a job `run_config` that makes some API calls to get the partition key, and then materialize the asset with that partition_key.
j
Ok, I think I see. My recommendation in this case would be to have the job that makes the partition key, and then maybe a run_status_sensor that materializes the assets when that job completes. It’s a bit of indirection, but I think that should accomplish what you want.
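roughly what that could look like, just as a sketch rather than a drop-in solution: the names (`cleanup_partitions`, `schema_update_job`, the `cleanup_partition_key` run tag) are placeholders, and it assumes the upstream run records the new partition key as a run tag
Copy code
from dagster import (
    DagsterRunStatus,
    DynamicPartitionsDefinition,
    RunRequest,
    asset,
    define_asset_job,
    job,
    op,
    run_status_sensor,
)

# hypothetical dynamic partitions definition backing the cleanup asset
cleanup_partitions = DynamicPartitionsDefinition(name="cleanup_partitions")

@asset(partitions_def=cleanup_partitions)
def cleanup_asset(context):
    context.log.info(f"cleaning up partition {context.partition_key}")

cleanup_asset_job = define_asset_job("cleanup_asset_job", selection=[cleanup_asset])

@op
def make_partition_key():
    # stand-in for the op that makes the API calls and produces the new key
    return "key-from-api"

@job
def schema_update_job():
    make_partition_key()

@run_status_sensor(
    run_status=DagsterRunStatus.SUCCESS,
    monitored_jobs=[schema_update_job],
    request_job=cleanup_asset_job,
)
def materialize_cleanup_on_success(context):
    # assumption: the upstream run carries the new partition key as a run tag
    partition_key = context.dagster_run.tags.get("cleanup_partition_key")
    if partition_key:
        # register the dynamic partition, then request a partitioned run of the asset job
        context.instance.add_dynamic_partitions("cleanup_partitions", [partition_key])
        return RunRequest(partition_key=partition_key)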
a
How would the sensor get the partition key from the job output?
Now I am thinking of just having the job add a dynamic partition key, and then the user can materialize that partition key manually. But that is basically what I am doing now, just calling `materialize` instead of someone manually using the UI to materialize a partition. It just creates an ephemeral run; I'm not sure of the implications of that, but I think an additional thread is held because the op that calls `materialize` stays running while the asset is materializing. That is the only real downside. I could change it to async, but maybe for one-off uses this is OK.
The other thing I tried was the GraphQL client, but I didn't see how to add a partition_key to a `submit_job_execution` call.
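roughly what I mean by that alternative, as a sketch with placeholder names: the job only registers the dynamic partition, and someone materializes it from the UI afterwards
Copy code
from dagster import DynamicPartitionsDefinition, OpExecutionContext, job, op

# placeholder dynamic partitions definition; the real one backs the cleanup asset
cleanup_partitions = DynamicPartitionsDefinition(name="cleanup_partitions")

@op
def register_partition(context: OpExecutionContext):
    # placeholder key; in practice this comes from the API calls in the run_config flow
    partition_key = "key-from-api"
    context.instance.add_dynamic_partitions("cleanup_partitions", [partition_key])
    context.log.info(f"added partition {partition_key}; materialize it from the UI when ready")

@job
def register_partition_job():
    register_partition()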