Is it possible to set the output of `s3_pickle_io_manager` in a different way other than `s3://<bucket>/dagster/storage/<job run id>/<op name>.compute`? I would like to write my files to S3 in the following manner `s3://<bucket>/year/month_day/<op name>.compute`.

I guess that the <job run id> is used so that re-running the same Run over and over again will allow any downstream tasks to process the exact same data. This is similar to what I want, but year/month_day would be a bit more human-readable than a run ID. I am processing my data on EMR and I don't want to couple my Spark code to Dagster. Therefore, a year/month/day partition style would allow Spark to read data independently of Dagster.

Hey <@U0497UXR5TN>, setting the bucket path like this isn’t something we support right now, but you could write a custom IO manager that’s basically a copy of the `s3_pickle_io_manager` and change the logic that determines the path.

here’s the source code if you’re interested in that approach <https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-aws/dagster_aws/s3/io_manager.py>
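
For anyone taking the custom IO manager route, the part that changes is only the path logic. Here is a minimal sketch of what a date-based key builder could look like, using only the standard library. The names `dated_s3_key` and `dated_s3_uri` are hypothetical helpers, not part of Dagster or `dagster-aws`, and the sketch assumes the date is passed in by the caller rather than read from the run context.

```python
from datetime import date
from typing import Optional


def dated_s3_key(op_name: str, run_date: date, prefix: Optional[str] = None) -> str:
    """Build a key shaped like <year>/<month>_<day>/<op name>.compute.

    Hypothetical helper: in a copy of the S3 pickle IO manager, something
    like this would replace the logic that puts the run ID in the path.
    """
    parts = [
        f"{run_date.year:04d}",
        # Zero-padded so keys sort lexicographically in date order.
        f"{run_date.month:02d}_{run_date.day:02d}",
        f"{op_name}.compute",
    ]
    if prefix:
        # Optional leading prefix, with stray slashes removed.
        parts.insert(0, prefix.strip("/"))
    return "/".join(parts)


def dated_s3_uri(bucket: str, op_name: str, run_date: date) -> str:
    """Full s3:// URI that a Spark job could read without knowing about Dagster."""
    return f"s3://{bucket}/{dated_s3_key(op_name, run_date)}"


if __name__ == "__main__":
    # Bucket and op names here are placeholders.
    print(dated_s3_uri("my-bucket", "my_op", date(2023, 1, 5)))
```

One trade-off to keep in mind with this layout: unlike a run ID, a date-based key is not unique per run, so a second run on the same day would write to the same key as the first.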

Hey Jamie, okay thanks for the tip and code :slightly_smiling_face: