#dagster-support

Daniel Galea

12/13/2022, 1:37 PM
Is it possible to have s3_pickle_io_manager write its output to a path other than s3://<bucket>/dagster/storage/<job run id>/<op name>.compute? I would like to write my files to S3 as s3://<bucket>/year/month_day/<op name>.compute. I guess the <job run id> is used so that re-running the same run lets any downstream tasks process exactly the same data. That is similar to what I want, but year/month_day would be more human-readable than a run ID. I am processing my data on EMR and I don't want to couple my Spark code to Dagster, so a year/month/day partition style would let Spark read the data independently of Dagster.

jamie

12/13/2022, 5:47 PM
Hey @Daniel Galea, setting the bucket path like this isn’t something we support right now, but you could write a custom IO manager that’s basically a copy of s3_pickle_io_manager and change the logic that determines the path.
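
A minimal sketch of the path logic such a custom IO manager might use, assuming the date is supplied from outside (for example from run config or a partition key) so that re-runs for the same day resolve to the same key. The names dated_key and DatedS3PickleStore are hypothetical and not part of Dagster. In a real project the class would presumably subclass dagster.IOManager and be given a boto3 S3 client; here it only mirrors the handle_output / load_input method pair so the path handling can be read in isolation.

```python
import pickle
from datetime import date


def dated_key(op_name: str, day: date) -> str:
    """Build the object key '<year>/<month>_<day>/<op name>.compute'."""
    return f"{day:%Y}/{day:%m_%d}/{op_name}.compute"


class DatedS3PickleStore:
    """Hypothetical core of a custom IO manager (not Dagster's own code).

    Assumption: `s3_client` behaves like a boto3 S3 client, i.e. it offers
    put_object(Bucket=..., Key=..., Body=...) and get_object(Bucket=..., Key=...).
    """

    def __init__(self, s3_client, bucket: str, day: date):
        self.s3 = s3_client
        self.bucket = bucket
        self.day = day

    def handle_output(self, context, obj):
        # context.step_key identifies the op that produced the output.
        key = dated_key(context.step_key, self.day)
        self.s3.put_object(Bucket=self.bucket, Key=key, Body=pickle.dumps(obj))

    def load_input(self, context):
        # The input is read from the key the upstream op wrote to.
        key = dated_key(context.upstream_output.step_key, self.day)
        body = self.s3.get_object(Bucket=self.bucket, Key=key)["Body"].read()
        return pickle.loads(body)
```

Because the key no longer contains the run ID, two runs for the same date overwrite each other's output, which is the trade-off for a path that Spark can read without knowing anything about Dagster.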

Daniel Galea

12/14/2022, 8:15 AM
Hey Jamie, okay thanks for the tip and code 🙂