[Dirvish] faster-dupemerge

Jason Boxman jasonb at edseek.com
Tue Mar 28 23:49:15 EST 2006

I've been using faster-dupemerge (fdm) nightly and it's been fun so far.

In case anyone's curious:

sarah:~# cat /usr/local/sbin/fdmcron

# Because you can't even do simple stuff in a crontab
DATE=$(date -d 'yesterday' +%Y%m%d)

/usr/local/sbin/faster-dupemerge --skip-compare \
  --find ! -path '*nebula-shared*' \
  -- /snapshot/*/"$DATE"/tree >"/snapshot/fdm.$DATE" 2>&1

I run it every morning at 8 a.m., by which time backups should be finished
unless something runs over.  A typical run 'recovers' less than 50MB, but
the initial run recovered 3-5GB by linking similar images.  Perhaps I
should've used 'branches' for my images instead.  Ah well.  If you have
several million files, you will need plenty of space wherever your TMPDIR
points, as GNU sort will use a ton of temporary space for sorting.
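
Here is a rough sketch of pointing TMPDIR at a roomier filesystem before the
run, assuming a POSIX shell.  SCRATCH, MIN_KB, free_kb and has_room are
made-up names for illustration, and the 1GB threshold is arbitrary; none of
this comes from faster-dupemerge itself.

```shell
# Hypothetical helper: print free space in KB on the filesystem holding "$1".
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Hypothetical helper: succeed if "$1" has at least "$2" KB free.
has_room() {
    [ "$(free_kb "$1")" -ge "$2" ]
}

# SCRATCH is an assumed name; set it to a directory on a large filesystem.
SCRATCH=${SCRATCH:-${TMPDIR:-/tmp}}
MIN_KB=${MIN_KB:-1048576}   # 1GB, an arbitrary threshold

if has_room "$SCRATCH" "$MIN_KB"; then
    export TMPDIR="$SCRATCH"   # GNU sort writes its temporary files here
else
    echo "warning: less than ${MIN_KB}KB free in $SCRATCH" >&2
fi
```

Something like this at the top of the cron script, with SCRATCH set to a
directory on a big filesystem, should make sort spill there rather than
into /tmp.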

Have fun!


Jason Boxman
http://edseek.com/ - Linux and FOSS stuff

