Is it possible to set the output of `s3_pickle_io_...
# ask-community
d
Is it possible to set the output path of `s3_pickle_io_manager` to something other than `s3://<bucket>/dagster/storage/<job run id>/<op name>.compute`? I would like to write my files to S3 as `s3://<bucket>/year/month_day/<op name>.compute`. I guess the `<job run id>` is used so that re-running the same run lets downstream tasks process exactly the same data. That is similar to what I want, but year/month_day would be more human-readable than a run ID. I am processing my data on EMR and I don't want to couple my Spark code to Dagster, so a year/month/day partition style would let Spark read the data independently of Dagster.
j
Hey @Daniel Galea, setting the bucket path like this isn't something we support right now, but you could write a custom IO manager that's basically a copy of the `s3_pickle_io_manager` and change the logic that determines the path.
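A rough sketch of what such a custom IO manager could look like, with the path logic pulled out into its own function. This is not the actual `s3_pickle_io_manager` source. The names `build_key`, `DatePathS3PickleIOManager` and `run_date` are hypothetical, and the class is left as plain Python so it runs without Dagster installed. In a real project it would presumably subclass `dagster.IOManager` and be handed a boto3 S3 client. It assumes `context.step_key` and `context.upstream_output.step_key` carry the op name, and it follows the `year/month_day/<op name>.compute` layout from the question.

```python
import pickle
from datetime import date


def build_key(run_date: date, op_name: str) -> str:
    """Build an S3 key of the form year/month_day/<op name>.compute."""
    return f"{run_date:%Y}/{run_date:%m_%d}/{op_name}.compute"


class DatePathS3PickleIOManager:
    """Hypothetical IO manager that pickles outputs under a date-based key.

    `s3_client` is assumed to behave like a boto3 S3 client, i.e. it offers
    put_object(Bucket=..., Key=..., Body=...) and get_object(Bucket=..., Key=...).
    """

    def __init__(self, s3_client, bucket: str, run_date: date):
        self.s3 = s3_client
        self.bucket = bucket
        # Assumption: the date is fixed when the manager is created, so that
        # handle_output and load_input resolve to the same key.
        self.run_date = run_date

    def handle_output(self, context, obj):
        # Pickle the op's output and upload it under the date-based key.
        key = build_key(self.run_date, context.step_key)
        self.s3.put_object(Bucket=self.bucket, Key=key, Body=pickle.dumps(obj))

    def load_input(self, context):
        # Look up the key written by the upstream op and unpickle its contents.
        key = build_key(self.run_date, context.upstream_output.step_key)
        body = self.s3.get_object(Bucket=self.bucket, Key=key)["Body"].read()
        return pickle.loads(body)
```

With `run_date=date(2022, 3, 7)` and an op named `my_op`, the object would be written to `s3://<bucket>/2022/03_07/my_op.compute`. Because the key no longer contains the run ID, re-running the job for the same date overwrites the earlier output.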
d
Hey Jamie, okay thanks for the tip and code 🙂