[Dirvish] Copying banks
redmopml at comcast.net
Wed Jan 20 18:50:46 UTC 2010
On Wed, Jan 20, 2010 at 9:43 AM, Keith Lofstrom <keithl at kl-ic.com> wrote:
> On Wed, Jan 20, 2010 at 03:31:19PM +0100, Bernd Haug wrote:
>> What do you use when you need to move whole banks to other hosts (or
>> other file systems)?
>> rsync -e ssh -aAHXx /mount-point root at remote:/new-mountpoint is very
>> slow (due to hard link preservation, I presume).
>> Just dd'ing is out of the question. (E.g. because, in my case, the new
>> device is slightly (i.e., a few MiB, but still) smaller.)
> I looked for tools for this a few years ago, and did not find
> anything. I like to keep old images - the expense of expire (CPU
> time and disk "wearout" and chances for error) is not normally
> worth the extra space gained. However, this results in lots of
> images, and some files with hundreds of hard links. If I am using
> ext[2,3,4], and run out of inodes ... disaster. I have a file
> system that is partly full of images and heavily hardlinked.
> Copying the data to another file system built with a proper number
> of inodes involves too much data movement, because the known
> copying processes (and rsync at the time I looked) do not
> efficiently copy hardlinked files. Perhaps that is better now.
> Something that copied the data once, and kept track of hardlinks,
> without huge tables somewhere, might need to be aware of the
> underlying structure of the filesystem to do the job
> efficiently. It may be necessary to keep track of the hardlinks
> going the other direction, from data inode to directory entry.
> Beyond that, a simple copy might not be as efficient as keeping
> track of the actual file data, and merging hardlink trees where the
> data permits it. That would make the filesystem copies much more
> compact than the original source filesystem, and help with keeping
> evolving branches compact. This would be helpful for rsync-based
> backup, but also a generally useful tool for active file systems, because
> in some cases you might want the two hardlink trees to evolve
> separately. Evolution does not need to happen with backups.
> If you can invent an efficient way to (with finite time and finite
> RAM) copy large hardlinked data trees, and especially if you can
> (optionally) merge the hardlinks of identical files, you would be a
> hero to me. I just don't know whether there is a good way to do it.
> Keith Lofstrom keithl at keithl.com Voice (503)-520-1993
> KLIC --- Keith Lofstrom Integrated Circuits --- "Your Ideas in Silicon"
> Design Contracting in Bipolar and CMOS - Analog, Digital, and Scan ICs
> Dirvish mailing list
> Dirvish at dirvish.org
Use tar to copy the files. There are several disk de-duplication
utilities out there as well.
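For example, a tar pipe copies each multiply-linked inode's data only once: GNU tar keeps a table of inodes with link count > 1 and writes later links as link entries rather than duplicate data. A minimal sketch, with /bank and /new-bank as placeholder mount points:

SRC=/bank          # existing Dirvish bank (placeholder path)
DST=/new-bank      # freshly built filesystem (placeholder path)
mkdir -p "$DST"
# Archive on the fly and extract on the destination; -p preserves
# permissions, and hard links are recreated from tar's link entries.
(cd "$SRC" && tar -cf - .) | (cd "$DST" && tar -xpf -)

This still needs RAM for tar's inode table on very large banks, but it avoids rsync's per-file hard-link matching.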
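For merging identical files into hard links, dedicated tools such as fdupes or util-linux's hardlink(1) exist; a crude sketch with standard tools could look like the following. It assumes GNU md5sum/stat and filenames without embedded newlines or trailing spaces, and it trusts the checksum alone (no byte-for-byte verify), so treat it as illustrative only:

# Group files by MD5; replace each duplicate with a hard link to
# the first file seen with that checksum.
find /bank -xdev -type f -print0 \
  | xargs -0 md5sum \
  | sort \
  | while IFS=' ' read -r hash file; do
      if [ "$hash" = "$prev_hash" ]; then
        ln -f "$prev_file" "$file"   # merge into one inode
      else
        prev_hash=$hash
        prev_file=$file
      fi
    done

Note this merges unconditionally; as Keith says, you may sometimes want two link trees to evolve separately, so a real tool would need this to be optional.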