Apoorv Yadav

12/14/2022, 10:23 AM
Hi, can someone suggest which IO manager is best for passing file objects from one op to another? Currently I am using the default IO manager, fs_io_manager, and it's not letting me pass the file object. I tried mem_io_manager, but it then asks me to use fs_io_manager.

Daniel Galea

12/14/2022, 10:33 AM
So you want to pass the output of one Op to the next? I think that works when you don't define an io_manager and it uses the default one.

Apoorv Yadav

12/14/2022, 10:33 AM
The default one doesn't support passing file objects read in one op to another.
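
For context, a minimal sketch of the likely cause, assuming the default fs_io_manager serializes op outputs with pickle: an open file handle cannot be pickled, while the file's bytes or its path can. The helper name `can_pickle` and the temporary file are hypothetical and only serve the illustration; no Dagster import is needed.

```python
import pickle
import tempfile
from pathlib import Path


def can_pickle(value) -> bool:
    """Return True if `value` survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(value)
        return True
    except (TypeError, pickle.PicklingError):
        return False


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.txt"
    path.write_bytes(b"hello")

    with open(path, "rb") as handle:
        # An open file handle wraps an OS-level resource and is not picklable.
        handle_ok = can_pickle(handle)
        # Reading the contents gives plain bytes, which are picklable.
        payload = handle.read()

    bytes_ok = can_pickle(payload)
    # Passing the path as a string is another picklable alternative.
    path_ok = can_pickle(str(path))
```

Under that assumption, returning the bytes or the path from the first op, and reopening the file in the next op, is one way around the limitation without switching IO managers.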

Martin Picard

12/14/2022, 10:34 AM
If you definitely want to pass files, use the blob storage IO manager of the cloud provider you are using, e.g. dagster-aws has an S3 IO manager, dagster-azure has ADLS2, etc.

Apoorv Yadav

12/14/2022, 10:35 AM
Could you tell me whether dagster-aws works with MinIO?

Martin Picard

12/15/2022, 1:05 PM
Don't know, I don't use AWS or MinIO, but in theory, yeah, MinIO should be able to drop in as S3; that's kind of the point.
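
A rough sketch of what "drop in as S3" could look like at the client level, assuming boto3 is the underlying S3 client and that MinIO is reachable at a custom endpoint. The helper names and the `http://localhost:9000` fallback are assumptions for illustration, not taken from this thread; `endpoint_url`, `aws_access_key_id`, and `aws_secret_access_key` are standard `boto3.client` arguments.

```python
import os


def minio_client_kwargs(env=os.environ) -> dict:
    """Build boto3 client arguments that point at an S3-compatible endpoint."""
    return {
        # Assumed fallback: a locally running MinIO server.
        "endpoint_url": env.get("AWS_ENDPOINT_URL", "http://localhost:9000"),
        "aws_access_key_id": env.get("AWS_ACCESS_KEY_ID"),
        "aws_secret_access_key": env.get("AWS_SECRET_ACCESS_KEY"),
    }


def make_s3_client(env=os.environ):
    """Create an S3 client that talks to the configured endpoint."""
    import boto3  # imported lazily so the helper above works without boto3

    return boto3.client("s3", **minio_client_kwargs(env))


# Example with explicit values instead of real environment variables.
example_kwargs = minio_client_kwargs(
    {
        "AWS_ENDPOINT_URL": "http://minio.internal:9000",
        "AWS_ACCESS_KEY_ID": "example-key",
        "AWS_SECRET_ACCESS_KEY": "example-secret",
    }
)
```

The only MinIO-specific part is the endpoint URL; the rest is the same as for AWS S3.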

nickvazz

12/23/2022, 9:05 PM
@Apoorv Yadav, I was able to use dagster-aws with MinIO following this example in the docs https://docs.dagster.io/concepts/io-management/io-managers#custom-filesystem-based-io-manager being sure to add AWS_ENDPOINT_URL to the config schema:
from dagster import Field, InitResourceContext, StringSource, io_manager

@io_manager(
    config_schema={
        "base_path": Field(str, is_required=True),
        "AWS_ACCESS_KEY_ID": StringSource,
        "AWS_SECRET_ACCESS_KEY": StringSource,
        "AWS_ENDPOINT_URL": StringSource,
    }
)
def s3_parquet_io_manager(init_context: InitResourceContext) -> PandasParquetIOManager:
    ...
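
A possible run config for the schema above, as a sketch only. It assumes the resource is bound under the key `io_manager` and that the credentials live in environment variables of the same names; the bucket path is a made-up placeholder. Fields typed as StringSource accept either a literal string or an `{"env": "VAR_NAME"}` mapping.

```python
def build_run_config(base_path: str) -> dict:
    """Build run config for the IO manager, reading secrets from env vars."""
    io_manager_config = {
        "base_path": base_path,
        # StringSource fields can be resolved from environment variables.
        "AWS_ACCESS_KEY_ID": {"env": "AWS_ACCESS_KEY_ID"},
        "AWS_SECRET_ACCESS_KEY": {"env": "AWS_SECRET_ACCESS_KEY"},
        "AWS_ENDPOINT_URL": {"env": "AWS_ENDPOINT_URL"},
    }
    # Assumed resource key: "io_manager".
    return {"resources": {"io_manager": {"config": io_manager_config}}}


# Hypothetical bucket path, for illustration only.
run_config = build_run_config("s3://example-bucket/dagster")
```

Keeping the secrets as `{"env": ...}` references means the actual values never appear in the run config itself.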