trimtrees.pl is a useful script for saving space on Unix file systems. It looks at all the files in a list of directories and replaces each duplicate file with a hard link to the first copy of that file. The file still appears at both locations in the file system, but its contents are stored only once on disk. This could lead to problems later if you modify one of the files and aren’t expecting the other to change, but for saving space on my static backup files it’s ideal.
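To make the idea concrete, here is a minimal Python sketch of the same technique. It is not the trimtrees.pl code: the function names `file_digest` and `dedupe` are made up for illustration, and comparing files by SHA-256 digest is my assumption about a reasonable way to detect duplicates. All directories must be on the same file system, since hard links cannot cross file systems.

```python
import hashlib
import os
import tempfile


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def dedupe(directories):
    """Replace duplicate regular files with hard links to the first copy seen.

    Hypothetical helper, not the trimtrees.pl algorithm. Returns the total
    size in bytes of the files that were replaced by links.
    """
    first_seen = {}  # digest -> path of the first copy with that content
    replaced_bytes = 0
    for top in directories:
        for root, _dirs, files in os.walk(top):
            for name in sorted(files):
                path = os.path.join(root, name)
                if os.path.islink(path) or not os.path.isfile(path):
                    continue  # skip symlinks and special files
                original = first_seen.setdefault(file_digest(path), path)
                if original == path or os.path.samefile(original, path):
                    continue  # first copy, or already linked to it
                size = os.path.getsize(path)
                tmp = path + ".dedupe-tmp"
                os.link(original, tmp)  # make the link under a temporary name
                os.replace(tmp, path)   # then swap it in over the duplicate
                replaced_bytes += size
    return replaced_bytes


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        snaps = [os.path.join(base, d) for d in ("BACKUP-JAN15", "BACKUP-MAR15")]
        for snap in snaps:
            os.mkdir(snap)
            with open(os.path.join(snap, "notes.txt"), "w") as f:
                f.write("same content\n")
        a, b = (os.path.join(s, "notes.txt") for s in snaps)
        print("bytes replaced by links:", dedupe(snaps))
        print("same inode:", os.path.samefile(a, b))
        # The caveat: appending to one name changes what the other name shows.
        with open(a, "a") as f:
            f.write("edited\n")
        with open(b) as f:
            print("other copy now reads:", repr(f.read()))
```

The last few lines of the demo show the caveat: once two names point at the same data, an in-place edit through one name is visible through the other.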
Below is an example of saving space. One of my backup drives, which contains multiple snapshots of my work, became 100% full. trimtrees.pl needs a list of directories to trawl through, so here I used * to provide the list. (I could also have listed the directories explicitly, e.g. “BACKUP-JAN15 BACKUP-MAR15” etc.)
prompt> perl trimtrees.pl *
tlds[6]cur[35]uniq[789_669]fils[3_613_975]spcused[528_432_303_246]saved[653_207_312_012]
DONE
I’m really happy with the result: this freed up a whole lot of space on a drive that had been 100% full.
prompt> df -h
Filesystem Size Used Avail Use%
/dev/sdb1 1.4T 494G 812G 38%
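As a rough sanity check, the counters in the trimtrees.pl summary can be converted to the units df uses. This small Python sketch assumes that `spcused` and `saved` are byte counts, which is my reading of the labels rather than something documented here; `to_gib` is a helper name made up for the example.

```python
def to_gib(counter):
    """Convert an underscore-grouped byte counter (as a string) to GiB."""
    return int(counter.replace("_", "")) / 2**30


spcused = to_gib("528_432_303_246")
saved = to_gib("653_207_312_012")
print(f"spcused ~ {spcused:.0f} GiB")  # prints: spcused ~ 492 GiB
print(f"saved   ~ {saved:.0f} GiB")    # prints: saved   ~ 608 GiB
```

If the assumption holds, about 492 GiB in use is close to the 494G that df reports, with the small difference plausibly down to file system overhead.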
Next question
What I would like to know next is how compatible this hard linking is with rsync. My guess is that it’s not very compatible, since linking probably changes the time stamp to that of the oldest copy of each unique file (I didn’t check this).
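One part of this can be checked without rsync. All hard links to a file share a single inode, and the modification time lives in the inode, so after linking both names necessarily report the time stamp of whichever copy was kept. The Python sketch below demonstrates that; `link_and_compare` is a made-up helper, and the sketch says nothing about which copy trimtrees.pl itself keeps beyond "the first copy" mentioned above.

```python
import os
import tempfile


def link_and_compare(kept_mtime, replaced_mtime):
    """Create two identical files with different mtimes, replace the second
    with a hard link to the first, and return both mtimes seen afterwards."""
    with tempfile.TemporaryDirectory() as base:
        kept = os.path.join(base, "jan.txt")
        replaced = os.path.join(base, "mar.txt")
        for path, mtime in ((kept, kept_mtime), (replaced, replaced_mtime)):
            with open(path, "w") as f:
                f.write("same content\n")
            os.utime(path, (mtime, mtime))  # set atime and mtime explicitly
        os.remove(replaced)
        os.link(kept, replaced)  # both names now share one inode
        return os.stat(kept).st_mtime, os.stat(replaced).st_mtime


if __name__ == "__main__":
    # Two arbitrary example time stamps (seconds since the epoch).
    print(link_and_compare(1_420_070_400, 1_425_168_000))
```

Both values come back as the kept file's time stamp, so the replaced file's original time is lost. Since rsync by default decides whether a file needs transferring from its size and modification time, a changed time stamp could cause files to be re-sent. rsync also has a -H (--hard-links) option for preserving hard links, which would be the first thing to try when copying a tree that has been through trimtrees.pl.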