
王昊

05/16/2023, 9:12 AM
If I use `dagster_aws.s3_pickle_io_manager` as the `io_manager`, the return values of each operation will be stored in S3. Over a long period of usage, this might occupy a significant amount of storage. Is there a conventional solution from the community for this issue, or should I manage it myself?

Tim Castillo

05/16/2023, 3:05 PM
You would have to manage this yourself. What I've done before is add a TTL via S3's Object Expiration feature. If the data you're passing between assets is temporary/ephemeral, another option would be to use the `mem_io_manager` instead of the S3 one, which would store the data in memory if it fits.
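
For anyone finding this later, here is a rough sketch of what the Object Expiration approach could look like with boto3. The bucket name, the `dagster/` prefix, and the 30-day window are assumptions for illustration, and the helper names `build_expiration_rule` and `apply_expiration_rule` are hypothetical; check which `s3_prefix` your IO manager is actually configured with before applying anything.

```python
def build_expiration_rule(prefix: str, days: int) -> dict:
    """Build an S3 lifecycle configuration that expires objects under `prefix`.

    Hypothetical helper: the prefix should match the s3_prefix configured on
    your s3_pickle_io_manager ("dagster/" is only an assumed example).
    """
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": f"expire-io-manager-objects-{days}d",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # S3 deletes matching objects roughly `days` days after creation.
                "Expiration": {"Days": days},
            }
        ]
    }


def apply_expiration_rule(bucket: str, prefix: str, days: int) -> None:
    """Apply the rule to a bucket (requires boto3 and AWS credentials).

    Caution: this call replaces the bucket's entire existing lifecycle
    configuration, so merge in any rules you already have first.
    """
    import boto3  # imported lazily so the builder above works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=build_expiration_rule(prefix, days),
    )


if __name__ == "__main__":
    # Only prints the configuration; nothing is sent to AWS here.
    print(build_expiration_rule("dagster/", 30))
```

To actually set the rule you would call something like `apply_expiration_rule("my-bucket", "dagster/", 30)`, where `my-bucket` is a placeholder. Keep in mind that once an object expires, any downstream step that tries to load that stored output will fail, so the window should be longer than the longest gap between producing and consuming a value.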