[Dirvish] Copying banks

Shawn Perry redmopml at comcast.net
Wed Jan 20 18:50:46 UTC 2010


On Wed, Jan 20, 2010 at 9:43 AM, Keith Lofstrom <keithl at kl-ic.com> wrote:
> On Wed, Jan 20, 2010 at 03:31:19PM +0100, Bernd Haug wrote:
>> What do you use when you need to move whole banks to other hosts (or
>> other file systems)?
>>
>> rsync -e ssh -aAHXx /mount-point root@remote:/new-mountpoint is very
>> slow (due to hard link preservation, I presume).
>>
>> Just dd'ing is out of the question. (E.g. because, in my case, the new
>> device is slightly (i.e., a few MiB, but still) smaller.)
>
> I looked for tools for this a few years ago, and did not find
> anything.  I like to keep old images - the expense of expire (CPU
> time and disk "wearout" and chances for error) is not normally
> worth the extra space gained.  However, this results in lots of
> images, and some files with hundreds of hard links.  If I am using
> ext[2,3,4], and run out of inodes ... disaster.  I have a file
> system that is partly full of images and heavily hardlinked.
> Copying the data to another file system built with a proper number
> of inodes involves too much data movement, because the known
> copying processes (and rsync at the time I looked) do not
> efficiently copy hardlinked files.  Perhaps that is better now.
>
> Something that copied the data once, and kept track of hardlinks,
> without huge tables somewhere, might need to be aware of the
> underlying structure of the filesystem to do the job
> efficiently.  It may be necessary to keep track of the hardlinks
> going the other direction, from data inode to directory entry.
>
> Beyond that, a simple copy might not be as efficient as keeping
> track of the actual file data, and merging hardlink trees where the
> data permits it.  That would make the filesystem copies much more
> compact than the original source filesystem, and help with keeping
> evolving branches compact.  This would be helpful for rsync-based
> backup, but not a generally useful tool for active file systems, because
> in some cases you might want the two hardlink trees to evolve
> separately.  Evolution does not need to happen with backups.
>
> If you can invent an efficient way to (with finite time and finite
> RAM) copy large hardlinked data trees, and especially if you can
> (optionally) merge the hardlinks of identical files, you would be a
> hero to me.  I just don't know whether there is a good way to do it.
>
> Keith
>
> --
> Keith Lofstrom          keithl at keithl.com         Voice (503)-520-1993
> KLIC --- Keith Lofstrom Integrated Circuits --- "Your Ideas in Silicon"
> Design Contracting in Bipolar and CMOS - Analog, Digital, and Scan ICs
> _______________________________________________
> Dirvish mailing list
> Dirvish at dirvish.org
> http://www.dirvish.org/mailman/listinfo/dirvish
>

Use tar to copy the files.  There are several disk de-duplication
utilities out there as well.

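As for merging the hard links of identical files, the core idea behind
the de-duplication utilities can be sketched roughly as follows. This
is only an illustration, not any particular tool: merge_identical is a
made-up name, and it assumes GNU coreutils (sha256sum) and file names
without newlines, backslashes, or leading/trailing whitespace.

```shell
#!/bin/sh
# merge_identical DIR (hypothetical helper): replace byte-identical
# regular files under DIR with hard links to a single copy.
# Caution: owner, mode, and mtime are ignored here. A backup normally
# needs those preserved, so a real tool would merge only files whose
# metadata matches as well.
merge_identical() {
    prev_sum=
    prev_path=
    # Hash every file, then sort so equal hashes end up on adjacent lines.
    find "$1" -type f -exec sha256sum {} + | sort |
    while read -r sum path; do
        if [ "$sum" = "$prev_sum" ] && cmp -s "$prev_path" "$path"; then
            # Same content: make this name another link to the first copy,
            # unless the two names already share an inode.
            [ "$prev_path" -ef "$path" ] || ln -f "$prev_path" "$path"
        else
            prev_sum=$sum
            prev_path=$path
        fi
    done
}
```

This reads every file once to hash it, which is slow on a large bank,
and the cmp call is there so that a hash collision cannot merge files
that actually differ.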
