[Dirvish] Backing up millions of files...

Eric Mountain em-dirvish-1 at nerim.net
Tue Feb 15 14:55:20 PST 2005

On Tuesday 15 Feb 2005 23:07, Brian Johnson spake thus:
> We have a need to back up 50000 directories that contain around 2 million
> files total.  Some small, some large.  The problem we have had with rsync
> in the past is that it uses up a tremendous amount of memory creating the
> file list in memory.  Does dirvish have a workaround for this, or should I
> continue to look for another backup solution?  Does anybody have
> suggestions?  I've been creating a flat spot on my forehead smashing it
> against the proverbial desk for months now.  ANY help would be GREATLY
> appreciated!

Unfortunately Dirvish has no workaround for this kind of task.  It relies 
completely on rsync to do the grunt work, so it inherits rsync's weaknesses.

If you do want to go with an rsync-based backup solution (be it Dirvish or
another package), then you might want to think about splitting the job into
several backups, each starting lower down in the directory hierarchy.  "Divide
and conquer".  That way, each rsync run only has to build a file list for its
own subtree rather than for all 2 million files at once.  If that leaves you
with a great many directories to back up separately (e.g. you want to launch
one rsync per user's home directory), then you might want to think about
generating the configuration files on the fly rather than by hand.
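
To make "on the fly" a bit more concrete, here is a rough Python sketch that
writes one vault config per subdirectory.  It assumes the usual Dirvish layout
of <bank>/<vault>/dirvish/default.conf with "client:" and "tree:" entries; the
function name, the host name "fileserver" and the directory names are made up
for illustration, and a real config would likely need more options, so check
it against the dirvish.conf man page before relying on it.

```python
import os
import posixpath
import tempfile


def write_vault_configs(bank, client, parent, subdirs):
    """Write one minimal vault config per subdirectory of `parent`.

    bank    -- local directory that holds the vaults (assumed layout)
    client  -- host name to back up (hypothetical value in the demo)
    parent  -- directory on the client whose children get one vault each
    subdirs -- names of the child directories

    Returns the list of config file paths written, sorted by vault name.
    """
    written = []
    for name in sorted(subdirs):
        vault_conf_dir = os.path.join(bank, name, "dirvish")
        os.makedirs(vault_conf_dir, exist_ok=True)
        conf = os.path.join(vault_conf_dir, "default.conf")
        with open(conf, "w") as fh:
            fh.write("client: %s\n" % client)
            # The tree is a path on the client, so join it POSIX-style.
            fh.write("tree: %s\n" % posixpath.join(parent, name))
        written.append(conf)
    return written


if __name__ == "__main__":
    # Demo against a throwaway directory instead of a real bank.
    demo_bank = tempfile.mkdtemp()
    for path in write_vault_configs(demo_bank, "fileserver", "/home",
                                    ["alice", "bob"]):
        print(path)
```

Each vault then gets its own rsync run, so no single run has to hold the whole
file list.  The list of subdirectories could come from a directory listing on
the client, regenerated before each backup so that new users are picked up.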

BTW, these are just ideas... nothing I've ever done!  I'm a home user ;-)

[ How much memory is rsync using for the file list?  Is it really very big by 
today's standards? ]

Also, you might be interested in these posts (you will notice the author is 
JW, the author of Dirvish):

Otherwise, if you are willing to look in the non-free direction, then there are SAN
solutions which handle this kind of thing efficiently.  Of course, they cost 
money.  Lots of money.

Eric Mountain
