Jacob Marcil
08/23/2023, 2:36 PM
@io_manager(required_resource_keys={"s3_bucket", "s3"})
def common_bucket_s3_io_manager(init_context):
    """
    A version of the s3_csv_io_manager that gets its bucket from another resource.
    """
    return s3_csv_io_manager(
        build_init_resource_context(
            config={"s3_bucket": init_context.resources.s3_bucket},
            resources={"s3": init_context.resources.s3},
        )
    )
Where s3_bucket and s3 are required by this new resource, so if I configure all the right resources in my project/resources/__init__.py file, I’ll be able to make that work.
Is there a way to specify which resources to use instead of hardcoding them?
I’m asking because I need to write to 2 different buckets that require 2 different sets of AWS credentials in different jobs.
If s3 and s3_bucket are already taken by the first one, how do I make the second IO manager work?
I know I could create a new IO manager, let’s say called my_new_other_common_bucket_s3_io_manager, like this:
@io_manager(required_resource_keys={"my_new_s3_bucket", "my_new_s3_client_ressource"})
def my_new_other_common_bucket_s3_io_manager(init_context):
    """
    A version of the s3_csv_io_manager that gets its bucket from another resource.
    """
    return s3_csv_io_manager(
        build_init_resource_context(
            config={"s3_bucket": init_context.resources.my_new_s3_bucket},
            resources={"s3": init_context.resources.my_new_s3_client_ressource},
        )
    )
But this feels wrong… Is there a better way?
jamie
08/23/2023, 2:41 PM
class CommonBucketS3IOManager(ConfigurableIOManager):
    s3: S3Resource
    ....

defs = Definitions(
    assets=[...],
    resources={
        "io_mgr_1": CommonBucketS3IOManager(s3=S3Resource(<some_config>)),
        "io_mgr_2": CommonBucketS3IOManager(s3=S3Resource(<other_config>)),
    },
)
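A dependency-free sketch of the pattern in the snippet above, for readers who want to see it run end to end. It is not Dagster code: InMemoryS3Client is a hypothetical stand-in for a real S3 client (its get_object returns the stored string directly, unlike a real client), and CommonBucketCsvIOManager is a plain dataclass that only borrows the handle_output / load_input method names. In a real project the class would presumably subclass ConfigurableIOManager and hold an S3Resource, as in the message above. The credentials, bucket names, and asset names are made up for illustration.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class InMemoryS3Client:
    """Hypothetical stand-in for an S3 client; keeps objects in a dict."""

    credentials: str
    objects: dict = field(default_factory=dict)

    def put_object(self, Bucket, Key, Body):
        self.objects[(Bucket, Key)] = Body

    def get_object(self, Bucket, Key):
        # Simplification: returns the stored string as-is.
        return self.objects[(Bucket, Key)]


@dataclass
class CommonBucketCsvIOManager:
    """One class; the client and the bucket are passed in, not hardcoded."""

    s3: InMemoryS3Client
    s3_bucket: str

    def handle_output(self, asset_name, rows):
        # Serialize rows to CSV text and store it under "<asset_name>.csv".
        buffer = io.StringIO(newline="")
        csv.writer(buffer).writerows(rows)
        self.s3.put_object(
            Bucket=self.s3_bucket, Key=f"{asset_name}.csv", Body=buffer.getvalue()
        )

    def load_input(self, asset_name):
        # Read the CSV text back and parse it into a list of rows.
        body = self.s3.get_object(Bucket=self.s3_bucket, Key=f"{asset_name}.csv")
        return list(csv.reader(io.StringIO(body, newline="")))


# Two instances of the same class, each with its own credentials and bucket,
# registered under different keys (mirroring io_mgr_1 / io_mgr_2 above).
resources = {
    "io_mgr_1": CommonBucketCsvIOManager(
        s3=InMemoryS3Client("creds-a"), s3_bucket="bucket-a"
    ),
    "io_mgr_2": CommonBucketCsvIOManager(
        s3=InMemoryS3Client("creds-b"), s3_bucket="bucket-b"
    ),
}

resources["io_mgr_1"].handle_output("orders", [["id", "total"], ["1", "9.50"]])
resources["io_mgr_2"].handle_output("users", [["id", "name"], ["7", "ada"]])
```

In Dagster itself, an asset would then pick one of the two with @asset(io_manager_key="io_mgr_1") or @asset(io_manager_key="io_mgr_2"), so no second copy of the IO manager code is needed.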
Jacob Marcil
08/23/2023, 3:00 PM
@io_manager decorator and return a class object out of it.
I’ll try that 🙂
Caelan Schneider
08/23/2023, 4:20 PM
jamie
08/23/2023, 4:31 PM