Question regarding IOManagers and FileManagers. In...
# announcements
t
Question regarding IOManagers and FileManagers. In the docs it says that the FileManager API is deprecated, but it seems that it fulfills some functionality that the IOManager with its more opinionated interface doesn't. For example, it seems like the FileManager allows me to use it to read or write arbitrary data within the bounds of a solid without requiring that it be propagated to or from other solids in the pipeline. The IOManager feels as though it's primarily concerned with passing object state between solids and less interested in handling operations such as uploading a CSV to S3, etc. Is there something that I'm missing?
s
Hey Tobias, good question. We weren’t totally happy with FileManager, so we were working on de-emphasizing it. (We actually didn’t know if anyone else was using it, to be honest!) You make the reasonable point that we don’t have a good successor to it. In light of that, we’ll un-deprecate it for our release next week.
t
Thanks, sorry to make you keep more code 🙂
Revisiting this, I'm working on reimplementing a daily uploads folder within a pipeline and figured I should write it as a FileManager. Curious if you know whether anyone has gone down that road already? Specifically, I create a directory on local disk named after the current date (e.g. 20210315), and then as the last stage of my pipeline I write the contents of the folder to S3. The data that goes into the folder is a mix of file types: some CSV files, some output files from mongodump, and some .tar.gz files. Any thoughts on best practice given the mechanisms that Dagster has available?
l
@João Luiz Carabetta
Hi @Tobias Macey, we're trying to accomplish something similar to you. We would like to upload our files to GCP. Could you shed some light on this task for us? Did you manage to upload the files to S3 using IOManagers? Or is it better to use FileManagers, even though they seem to be deprecated or on their way to deprecation?
t
Hi Laura, I ended up punting on that work for the time being. Right now I'm just using the s3 resource from dagster-aws to upload the files. I'll have to revisit that again when I have the time, though that's likely to be on the order of a month or two from now.
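For anyone landing on this thread later, here is a rough sketch of the "just use the S3 client" approach described above. It is not Tobias's actual code. The helper names `daily_folder_name` and `upload_folder`, the bucket name, and the key layout are all hypothetical. The only thing assumed about the client is that it exposes boto3's `upload_file(Filename, Bucket, Key)` method, which a boto3 S3 client does.

```python
from datetime import date
from pathlib import Path


def daily_folder_name(day: date) -> str:
    """Return a folder name like 20210315 for the given day (hypothetical helper)."""
    return day.strftime("%Y%m%d")


def upload_folder(s3_client, local_dir, bucket, key_prefix=""):
    """Upload every file under local_dir, preserving relative paths as S3 keys.

    s3_client is assumed to be a boto3 S3 client, or anything else with a
    compatible upload_file(Filename=..., Bucket=..., Key=...) method.
    Returns the list of keys that were uploaded.
    """
    local_dir = Path(local_dir)
    prefix = key_prefix.strip("/")
    uploaded = []
    for path in sorted(local_dir.rglob("*")):
        if not path.is_file():
            continue  # skip directories; S3 has no real folders
        relative = path.relative_to(local_dir).as_posix()
        key = f"{prefix}/{relative}" if prefix else relative
        s3_client.upload_file(Filename=str(path), Bucket=bucket, Key=key)
        uploaded.append(key)
    return uploaded
```

Inside a solid that requires the s3 resource from dagster-aws, you would presumably pass `context.resources.s3` as `s3_client` and something like `daily_folder_name(date.today())` as `key_prefix`. Because files are uploaded as-is, the CSVs, mongodump output, and .tar.gz archives all go through the same path without any type-specific handling.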