Does anyone use the fs_io_manager (for speed) and then move everything to S3 afterwards for traceability?
chris
04/25/2022, 3:52 PM
Would you want to do this within the same run? Or are you imagining that in some scenarios you'd write to S3 and in others to the filesystem?
George Pearse
04/25/2022, 4:00 PM
Not quite sure. I had been using the S3 IO manager without questioning it, but I'm currently debugging a pipeline and realising that the uploads and downloads take a lot of time. I'm wondering if there's a way to do both.
I can imagine a pipeline with a final op that uploads the related fs files to S3, which I could switch on and off via the job config.
chris
04/25/2022, 4:01 PM
Yup, that's what I was imagining as well. You could make the output of the last op optional, so the downstream upload only fires if you "switch it on" via config.
❤️ 1
You could also create a schedule/sensor that bulk uploads to S3 on some cadence.
❤️ 1
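For the bulk-upload route, the core could be a plain function that a scheduled job calls. This is only a sketch under assumptions: `bulk_upload`, its parameters, and the `dagster-storage` key prefix are hypothetical names, and the actual S3 call is injected as `upload` so the function stays independent of any particular client:

```python
import os
from typing import Callable, List


def bulk_upload(
    storage_dir: str,
    upload: Callable[[str, str], None],
    since: float = 0.0,
    key_prefix: str = "dagster-storage",
) -> List[str]:
    """Upload files under storage_dir modified after `since` (epoch seconds).

    `upload(local_path, key)` performs the actual transfer. Returns the
    uploaded keys in sorted order.
    """
    uploaded = []
    for root, _dirs, files in os.walk(storage_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getmtime(path) <= since:
                continue  # unchanged since the last sync
            # Mirror the local layout under the key prefix, using "/" in keys.
            rel = os.path.relpath(path, storage_dir).replace(os.sep, "/")
            key = f"{key_prefix}/{rel}"
            upload(path, key)
            uploaded.append(key)
    return sorted(uploaded)
```

With boto3, `upload` could be something like `lambda path, key: s3.upload_file(path, "my-bucket", key)`, where `s3 = boto3.client("s3")` and the bucket name is a placeholder. The function would then be called from an op in a small job attached to a schedule, passing the time of the previous sync as `since`.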
George Pearse
04/25/2022, 4:22 PM
The filesystem IO manager might actually be slower??? Seems bizarre.