Stephen Bailey
04/01/2022, 3:54 PM
configs vs op inputs. i find myself wanting to write python, where everything is an argument to the op function, with sensible defaults.
for example, i'm trying to write a function that lists objects in a bucket -- bucket -- that are under a certain object_id. here's how i've written it now:
@op(config_schema={"bucket": Field(str, default_value="my-bucket")})
def get_object_files(context, object_id):
    bucket = s3.Bucket(context.op_config["bucket"])
    for object in bucket.objects.filter(Prefix=f"{object_id}/"):
        ...  # do something
but should bucket be an input? why not? should i make it an input and have a setup op that generates the value? Should object_id be a config, so that users can parameterize it from the launchpad?
any thoughts would be appreciated!
owen
04/01/2022, 4:11 PM
get_object_files.configured({"bucket": "some-cool-bucket"}) and using that in a job. Another way to put it is that there are (kinda) three times that you might want to modify a value used by an op: when you're setting up your job (i.e. underneath a job decorator), in the launchpad, and at runtime. Config is generally better for the first 2, but doesn't work at all for the final one.
Stephen Bailey
04/01/2022, 4:14 PM
owen
04/01/2022, 4:15 PM
Stephen Bailey
04/01/2022, 4:16 PM
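The split owen describes can be sketched in plain Python. This is only an analogy and not Dagster's API: functools.partial stands in for .configured(...), a keyword default stands in for the config default, and the returned string stands in for the real S3 listing. The name get_cool_bucket_files and the object id "abc123" are made up for illustration.

```python
from functools import partial


def get_object_files(object_id, bucket="my-bucket"):
    """Return the prefix that would be listed (a stand-in for the real S3 call)."""
    return f"s3://{bucket}/{object_id}/"


# "Job setup" time: pin the bucket once, ahead of any run.
# This plays the role that .configured({"bucket": ...}) plays for op config.
get_cool_bucket_files = partial(get_object_files, bucket="some-cool-bucket")

# "Runtime": object_id is only known once upstream work has produced it,
# so it stays an ordinary argument (the analogue of an op input).
default_prefix = get_object_files("abc123")
cool_prefix = get_cool_bucket_files("abc123")
```

Read this way, bucket behaves like config because it can be fixed before the run starts, while object_id behaves like an input if it has to come from another op while the run is in progress.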