# community-showcase
In case it’s helpful to someone, here’s a quick-and-dirty bash script I created to let me selectively clean the outputs from successful jobs. Let me know if there’s a better approach!

When to use:
• You’re working with large, uncompressed data in a dev or local environment and your disk space just isn’t big enough
• You want to keep the logs from the runs, just not the IO from each op
• You accept that a re-run of the job won’t be able to resume from an intermediate step

When not to use:
• If you’re uncomfortable with a random guy’s bash script running a recursive find-and-remove command 🙂
• If you’re in production or otherwise doing anything critically important
• Please make sure nothing else lives in `$DAGSTER_HOME/storage`! If you’re unsure, don’t run it

How to use it:
• Add the function to your `.bashrc`, or other appropriate file
• Reload your terminal or `source` the file above
• Run `dagprune name_of_your_op`

How it works: for each run directory under `$DAGSTER_HOME/storage`, the script checks whether a compute log file for the op you name exists in the logs (the intended op is the final one in your job, so its log implies the run completed). If that file exists and a corresponding IO directory also exists, it removes everything in the run’s directory except for the `compute_logs` directory. Technically, you could specify any op, but the intended use is for cleaning up outputs from successfully-completed runs (we leave failures alone, since we might actually want to re-run them from their point of failure).
dagprune() {
	if [ -z "$DAGSTER_HOME" ]; then
		echo "Make sure to set env var DAGSTER_HOME before running this command"
		return 1  # return, not exit: exit would kill the shell that sourced this function
	fi
	for rundir in "$DAGSTER_HOME"/storage/*/; do
		# Assumed layout (adjust for your Dagster version): a compute log named
		# after the op marks the run as complete, and the op's IO lives in a
		# directory named after the op.
		FILE="${rundir}compute_logs/${1}.out"
		FINAL_OP_DIR="${rundir}${1}"
		if [ -f "$FILE" ] && [ -d "$FINAL_OP_DIR" ]; then
			echo "Completed run found. Cleaning op outputs in $rundir"
			pushd "$rundir" > /dev/null
			# Remove every top-level directory in the run dir except compute_logs
			find . -mindepth 1 -maxdepth 1 -type d -not -name 'compute_logs' -exec rm -rf {} +
			popd > /dev/null
		fi
	done
}
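If you want to see what the pruning step leaves behind before pointing it at real data, here’s a small sketch against a throwaway mock run directory (all names here are made up for illustration, not a real Dagster layout):

```shell
#!/bin/sh
# Build a mock run directory (hypothetical op names, for illustration only).
mock=$(mktemp -d)
mkdir -p "$mock/compute_logs" "$mock/my_op" "$mock/downstream_op"
touch "$mock/compute_logs/run.log" "$mock/my_op/output.pkl"

cd "$mock"
# Equivalent to the find-and-remove in the script above: delete every
# top-level directory except compute_logs, leaving the logs intact.
find . -mindepth 1 -maxdepth 1 -type d -not -name 'compute_logs' -exec rm -rf {} +

ls  # only compute_logs remains
cd / && rm -rf "$mock"
```

Running this on the mock tree deletes `my_op` and `downstream_op` but keeps `compute_logs/run.log`, which is exactly the trade-off described above: logs survive, op IO does not.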