# ask-community
a
Is there a way to set the number of partitions saved? For instance I only want the last 7 days of partitions
j
hey @Aaron T to clarify - do you want to load the last 7 days of partitions into a downstream asset? or have it so that only the last 7 days of data are saved in your external storage?
a
Hi @jamie the latter. Only the last 7 days of data is saved in external storage
j
ok! if you want a way to tell dagster that only the last 7 days of data is in storage (and therefore that older partitions should be considered “missing” or given some other status), we don’t have a way to support that right now. you could open a github issue for it though!
a
Sure, I will take a look at that. If I wanted to look into opening a PR, would you be able to point me in the right direction of where dagster handles partitions? Maybe I could add something like a retention policy.
j
i think all of the partition files are in this directory https://github.com/dagster-io/dagster/tree/master/python_modules/dagster/dagster/_core/definitions @claire could you point aaron to any other files important to partitions if they want to open a PR?
c
Hi Aaron! Enabling a retention policy for partitions is a pretty extensive change, and I think we'd want to deliberate carefully before we enable something like this. Nowhere else in dagster's system do we automatically wipe event log data (which is responsible for populating which partitions materialized, what the status of runs is, etc.), and I think it could be tricky for users if some desired data becomes irrecoverable. Currently we put the onus on users to manage deleting their own data if desired. I'd recommend filing an issue for now so we can track it in the backlog. I'm not sure we'd want to explicitly support this functionality, so I don't want to send you down a rabbit hole building it out 🙂 but I'm happy to review or point you in the right direction for future contributions.
In the meantime, one option is to have a daily schedule that finds runs for partitions older than 7 days and deletes them, similar to this discussion here.
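As a rough sketch of the selection step in that cleanup schedule (not an official dagster feature), something like the following could pick out the partitions that fall outside a 7-day window. It uses only the standard library. `expired_partition_keys` and `PARTITION_KEY_FORMAT` are hypothetical names, and it assumes daily partition keys formatted like `2023-01-31`. Finding and deleting the matching runs is left out here, since that depends on how your runs are tagged and stored.

```python
from datetime import date, datetime, timedelta

# Assumption: daily partition keys are date strings such as "2023-01-31".
PARTITION_KEY_FORMAT = "%Y-%m-%d"


def expired_partition_keys(partition_keys, today, retention_days=7):
    """Return the partition keys dated more than `retention_days` days before `today`."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for key in partition_keys:
        partition_date = datetime.strptime(key, PARTITION_KEY_FORMAT).date()
        # Strictly older than the cutoff date means outside the retention window.
        if partition_date < cutoff:
            expired.append(key)
    return expired


if __name__ == "__main__":
    keys = ["2023-01-01", "2023-01-02", "2023-01-03", "2023-01-10"]
    # With today = 2023-01-10 the cutoff is 2023-01-03, so only earlier keys are returned.
    print(expired_partition_keys(keys, today=date(2023, 1, 10)))
```

A daily schedule could call this with the current date and then delete the runs (and the stored data) for each returned key.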
a
Good to know. Thank you. I will take a look at the discussion