# announcements
p
We've bumped into the limit of subdirectories in `/opt/dagster/dagster_home/storage/` on our box. Is there a recommended fix?
Follow-up: if I'd like to simply schedule a cleanup script, is there a direct way to get a bash command into a dagster schedule, or do I have to write a solid that executes it in Python?
d
Hi Paul - is using a postgres DB an option here? edit: This wouldn't help with this problem
If you want to delete a run, we conveniently just added a "dagster run delete" CLI command in the release yesterday
The dagster scheduler only executes pipelines, so to schedule it in dagster you'd need to write a solid, yeah.
s
Do you know which subdirectories are taking up the space? If they're the subdirectories that store the intermediate outputs of your solids, then I think your approach of scheduling a cleanup script is a good one
The dagster-shell package tries to make it easy to execute solids that run shell commands: https://docs.dagster.io/_apidocs/libraries/dagster_shell
a
What is stored in `$DAGSTER_HOME/storage` is the compute logs (stderr/stdout copies) and intermediates (the data passed solid to solid). If you don't care about the compute logs or re-executing starting from previous results, you can safely clean the directories entirely. Otherwise the compute logs are probably safest to clean. You can also disable the `ComputeLogManager` on your instance config if you don't want to persist these at all.
m
Can I ask how many directories that is on that box?
p
@max 9160
@alex yeah, we probably don't need all of the logging from more than a couple of months ago, so I'm setting up a regular deletion. We definitely do want to keep `ComputeLogManager` enabled and persist logs for a few weeks.
@sandy thanks! dagster-shell looks like the best path forward for me
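For anyone landing on this thread later, here is a minimal sketch of the kind of cleanup discussed above, written in plain Python so it could be called from a solid or run from cron. It is not from the thread itself: the function name `remove_old_subdirs`, the 60-day threshold, and the idea that age can be judged by each top-level subdirectory's modification time are all assumptions, so check how your own storage directory is laid out before deleting anything.

```python
import shutil
import time
from pathlib import Path


def remove_old_subdirs(root, max_age_days, now=None, dry_run=False):
    """Remove immediate subdirectories of `root` last modified more than
    `max_age_days` ago. Returns the sorted names that were (or, with
    dry_run=True, would be) removed.

    Hypothetical helper: assumes a directory's mtime is a good enough
    proxy for how old its contents are.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    for entry in Path(root).iterdir():
        # Skip plain files and symlinks; only real directories are candidates.
        if entry.is_symlink() or not entry.is_dir():
            continue
        if entry.stat().st_mtime < cutoff:
            if not dry_run:
                shutil.rmtree(entry)
            removed.append(entry.name)
    return sorted(removed)


# Example call (path taken from the question above, threshold is an assumption):
# remove_old_subdirs("/opt/dagster/dagster_home/storage", max_age_days=60, dry_run=True)
```

Running it with `dry_run=True` first lists what would go without touching anything. A directory's modification time only changes when entries are added to or removed from it directly, so a stricter version might look at the newest file inside each subdirectory instead.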