# ask-community
a
Hi y'all. Have an IO manager best practice question. I'm sourcing a handful of datasets with Dagster-defined assets and would like to write the extracts to their respective directories on a local NAS, based on source (likely derived from my group_name or key_prefix). Would the recommended practice here be:
1. Hardcode the paths in the asset definitions
2. Create individual FilesystemIOManagers for each
3. Create a ConfigurableIOManager that can take the group_name / key_prefix as a variable to add to the base path that all of those directories would share
4. Something else?
I'd say my understanding of IO managers is still maturing. Pretty loose at the moment. Thanks in advance!
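A minimal sketch of the setup in question, with hypothetical names (source_a, raw_orders): key_prefix and group_name both tag an asset with its source, and an IO manager can read either off the context to build a path.

```python
from dagster import asset


# Hypothetical asset: key_prefix puts it under the "source_a" namespace,
# so its asset key becomes ["source_a", "raw_orders"]; group_name groups
# it in the UI. An IO manager can derive a directory from either.
@asset(key_prefix="source_a", group_name="source_a")
def raw_orders() -> list[dict]:
    return [{"order_id": 1}, {"order_id": 2}]
```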
j
Unless each asset needs to be stored in a unique way, I wouldn't make a FilesystemIOManager for each; at that point you might as well write the loading and storing code in the assets themselves. Option 3 is a good one, and it's what we do in most of our IO managers.
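A minimal sketch of option 3, assuming pickled outputs and a hypothetical NASIOManager with a base_path config field; the asset key's path (which includes any key_prefix) is appended to the shared base path:

```python
import os
import pickle

from dagster import ConfigurableIOManager, InputContext, OutputContext


class NASIOManager(ConfigurableIOManager):
    """Hypothetical IO manager: writes each asset under a shared base path."""

    base_path: str  # e.g. "/mnt/nas/extracts"

    def _get_path(self, context) -> str:
        # asset_key.path is a list like ["source_a", "raw_orders"], so any
        # key_prefix becomes a subdirectory under base_path
        return os.path.join(self.base_path, *context.asset_key.path)

    def handle_output(self, context: OutputContext, obj) -> None:
        path = self._get_path(context)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            pickle.dump(obj, f)

    def load_input(self, context: InputContext):
        with open(self._get_path(context), "rb") as f:
            return pickle.load(f)
```

Wired up, the raw_orders asset sketched earlier would land at /mnt/nas/extracts/source_a/raw_orders:

```python
from dagster import Definitions

defs = Definitions(
    assets=[raw_orders],  # the hypothetical asset sketched above
    resources={"io_manager": NASIOManager(base_path="/mnt/nas/extracts")},
)
```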
a
Thanks @jamie, makes sense! I'll have a look at the ConfigurableIOManager documentation and proceed with #3. Do you by chance have a top-of-mind example of this kind of IO manager that I could use as a base to get started? No worries if not.
j
Most of our file system IO managers use a base class called UPathIOManager, which handles most of the file path creation logic and partition handling. Here's an example of the AWS S3 IO manager that uses UPathIOManager: https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-aws/dagster_aws/s3/io_manager.py#L25 And here's another one from an example that doesn't use the UPathIOManager: https://github.com/dagster-io/dagster/blob/master/examples/project_fully_featured/project_fully_featured/resources/parquet_io_manager.py
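An untested sketch of the UPathIOManager route, assuming pandas DataFrames stored as Parquet; dump_to_path and load_from_path are the two methods the base class expects a subclass to implement, while the base class itself derives the full path (base path + asset key, including any key_prefix) and handles partitioned assets:

```python
import pandas as pd
from upath import UPath

from dagster import InputContext, OutputContext, UPathIOManager


class PandasParquetIOManager(UPathIOManager):
    """Hypothetical UPathIOManager storing DataFrames as Parquet files."""

    extension: str = ".parquet"

    def dump_to_path(self, context: OutputContext, obj: pd.DataFrame, path: UPath) -> None:
        # Write the DataFrame to the path the base class computed for us
        with path.open("wb") as f:
            obj.to_parquet(f)

    def load_from_path(self, context: InputContext, path: UPath) -> pd.DataFrame:
        with path.open("rb") as f:
            return pd.read_parquet(f)
```

In the linked S3 example, a thin ConfigurableIOManager wrapper exposes the config fields and delegates to the UPathIOManager subclass; the same wrapping pattern would apply to a NAS-backed version like this.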
a
Awesome, thank you!