[Dirvish] Dirvish backup controlled by client?

Keith Lofstrom keithl at kl-ic.com
Wed Dec 19 16:08:55 UTC 2007


On Wed, Dec 19, 2007 at 02:44:26PM +0000, Dave Howorth wrote:
> Dave Howorth wrote:
> > Frank Lüken wrote:
> >> the main Problem is, we have only a few free ip's and many laptops.
> > 
> > So perhaps another way to solve your dirvish question is to make this
> > problem go away. Just switch to using private ip addresses (i.e. RFC 1918 :)

The 10.xxx.xxx.xxx address space should allow for about 16 million 
laptops.  The 192.168.xxx.xxx address space cuts that down to a mere
64 thousand.  If the laptops are anonymized, though, you will not be
able to properly back them up - it is logically impossible when you
think about it.  Dirvish is very useful, but it cannot help you
transcend logical impossibilities!
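
As a quick check of those address counts, here is a minimal sketch using
Python's standard ipaddress module.  The variable names are only
illustrative, and the counts include the network and broadcast addresses,
so the number of usable hosts is slightly smaller.

```python
import ipaddress

# The two private ranges mentioned above (both defined in RFC 1918).
ten_net = ipaddress.ip_network("10.0.0.0/8")
home_net = ipaddress.ip_network("192.168.0.0/16")

# num_addresses counts every address in the block, including the
# network and broadcast addresses.
ten_count = ten_net.num_addresses      # 2**24
home_count = home_net.num_addresses    # 2**16

print(ten_count, home_count)
```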

DHCP can assign specific names to devices via the MAC address (my DNS
server is set up this way).  So you can assign a specific name to a
particular device with DHCP and DNS, even out of a limited pool of IP
addresses.  If you want to back up specific machines and associate
them with specific backup datasets, you will have to have a robust
way to name them.  MAC addresses can be spoofed, but if you use keyed
SSH as the transport layer, then the whole process cannot be spoofed.
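
The naming idea can be sketched as a simple lookup from MAC address to a
stable name that also serves as the backup dataset name.  This is only an
illustration: the table, the MAC addresses, the host names, and the helper
functions below are all made up, and a real setup would keep this mapping
in the DHCP and DNS server configuration rather than in a script.

```python
# Hypothetical table: MAC address -> stable host name.  The addresses
# and names are invented for illustration only.
KNOWN_LAPTOPS = {
    "00:11:22:33:44:55": "laptop-a",
    "66:77:88:99:aa:bb": "laptop-b",
}


def normalize_mac(mac):
    """Lower-case a MAC address and use ':' as the separator."""
    return mac.strip().lower().replace("-", ":")


def dataset_for(mac):
    """Return the backup dataset name for a MAC, or None if unknown."""
    return KNOWN_LAPTOPS.get(normalize_mac(mac))


print(dataset_for("00-11-22-33-44-55"))
print(dataset_for("de:ad:be:ef:00:01"))
```

An unknown MAC maps to None, so an unrecognized machine is never silently
attached to somebody else's dataset.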

The remaining problem is scheduling.  If your users are connecting
only transiently, it is difficult for a scheduler to predict when 
there will be enough time to complete a backup, so it knows whether
to start one or not.  It is even difficult to estimate the amount of
time an rsync backup will take, until you actually run rsync. 
Especially if multiple users show up at once.  You certainly can't expect
the users to make an intelligent estimate - the least intelligent
users are the ones most in need of frequent backups, yes? 

So backups are best initiated when there is a long time period available
(overnight is good) and the backup server can sequence the clients one
after another, so the server isn't thrashing disks.  That is the way
dirvish and rsync do it now.
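
Sequencing clients one after another amounts to a plain loop.  The sketch
below is only an illustration of that idea, not how dirvish is implemented
(dirvish ships its own dirvish-runall for this).  It assumes a dirvish
command on the PATH and already configured vaults; the function names and
the vault names in the usage note are hypothetical.

```python
import subprocess


def run_dirvish(vault):
    """Run one backup and return its exit status.

    Assumes a 'dirvish' command on PATH and that 'vault' names a
    vault that has already been configured and initialized.
    """
    return subprocess.run(["dirvish", "--vault", vault]).returncode


def backup_in_sequence(vaults, run_one=run_dirvish):
    """Back up each vault in turn, never two at once.

    Returns a dict of vault name -> exit status so that failures
    are visible afterwards.
    """
    results = {}
    for vault in vaults:
        # Each call blocks until that backup finishes, so the server's
        # disks only ever serve one client at a time.
        results[vault] = run_one(vault)
    return results
```

Calling backup_in_sequence(["laptop-a", "laptop-b"]) from an overnight
cron job would run the two backups back to back.  The run_one parameter
exists only so the loop can be tried out with a stand-in function before
pointing it at real vaults.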

It would be nice if clients could do backups instantaneously, at any
time, and could be trusted to do so correctly, but you can see the
logical reasons why this is very difficult.  HOWEVER,  with USB backup
drives getting cheap, the best way to accomplish the asynchronous
backup goal might be a two step process:

 1) back up a specific laptop to a specific USB backup device,
    asynchronously.  A local version of dirvish on the client can
    do this.

 2) move the USB backup device to a stationary machine, and let
    the backup to the main server occur on schedule.

I don't know how to do this specifically, but perhaps somebody here
can figure out how to do this robustly and securely.  A small pile
of USB drives has got to be cheaper than the network and disk
bandwidth to get everyone backed up over a coffee break.

Keith

-- 
Keith Lofstrom          keithl at keithl.com         Voice (503)-520-1993
KLIC --- Keith Lofstrom Integrated Circuits --- "Your Ideas in Silicon"
Design Contracting in Bipolar and CMOS - Analog, Digital, and Scan ICs

