[Dirvish] checksum=0 may be causing inaccurate backups!!!

Keith Lofstrom keithl at kl-ic.com
Thu Jun 21 06:21:42 UTC 2007


This may be a weird security problem - or it may be an artifact of
some updated libraries that rsync is poor at catching.  It is
late, and I am too tired to work on this tonight, but I thought I
would give our European colleagues some time to think about this
before the weekend.

Today, I ran some experiments to see how much slower dirvish would 
get if I ran it with the checksum=1 option, and how many files it 
would pick up that weren't picked up by directory metadata and size
with the default checksum=0 option.

The slowdown using checksum=1 is 3x to 6x.  A lot slower.

I have 5 machines recently converted to Scientific Linux 5 (like
CentOS5, a Red Hat Enterprise 5 clone).  They are set up for 
automated yum upgrades every night, and change fairly actively
because this is a new distro.

The checksummed dirvish run showed 475 files with identical
dates/ownership/size and different checksums.  They are mostly
in /usr/lib and /usr/bin .  I did some hex dumps on some
backed-up versions of the files and compared them to hex dumps
of the originals, and many bytes are different.  I don't understand
ELF format, but most of the changes were up front or in back,
about 3 out of every 12 bytes - this may be the ELF symbol table
for the binary or library.  If these files were unchanged, but
re-linked to updated libraries, it would make sense that they
would stay the same length and have bytes that change like this.
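
For anyone who wants to repeat the hex dump comparison without
reading dumps by eye, here is a minimal Python sketch.  It is not
what I ran; the function names are made up for illustration.  It
lists the offsets where two same-length files differ and reports the
longest run of consecutive changed bytes.

```python
def diff_offsets(a: bytes, b: bytes) -> list[int]:
    """Return the offsets at which two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs must be the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def longest_run(offsets: list[int]) -> int:
    """Return the length of the longest run of consecutive offsets."""
    best = run = 0
    prev = None
    for off in offsets:
        # Extend the current run only if this offset directly follows the last.
        run = run + 1 if prev is not None and off == prev + 1 else 1
        best = max(best, run)
        prev = off
    return best


def compare_files(backup_path: str, live_path: str) -> tuple[int, int]:
    """Return (number of differing bytes, longest run) for two files."""
    with open(backup_path, "rb") as f, open(live_path, "rb") as g:
        offsets = diff_offsets(f.read(), g.read())
    return len(offsets), longest_run(offsets)
```

Many scattered differences with only short runs would fit the
re-linked explanation; one long changed block would look more like
replaced code.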

There were no long runs of changed bytes.  chkrootkit 0.47 gave
my systems a clean bill of health.  This is probably not enemy
action.

I will look at the yum update logs and see if these binaries and
libraries correspond to recent updates.  Some look familiar.  It
may be that somehow the updates got made with older dates or 
something, or that yum is not changing the dates properly for the
updates.  I may have it figured out tomorrow morning.  

Whatever the reason for the files appearing to stay the same,
this means that dirvish backups may not be getting done properly
without the "checksum" flag set.  If we have to checksum every
backup using the rsync checksum, the backups will be much, much
slower.
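
To make the difference concrete: by default rsync skips a file when
its size and modification time match the copy on the other side, and
only compares contents when run with --checksum (which, as far as I
understand it, is what the dirvish checksum option turns on).  The
Python sketch below is only an illustration of those two tests, not
rsync's code.  The function names are hypothetical, SHA-256 stands
in for whatever checksum rsync really uses, and comparing times at
whole-second granularity is my assumption.

```python
import hashlib
import os


def quick_check_same(src: str, dst: str) -> bool:
    """Metadata-only test: same size and same mtime (whole seconds)."""
    s, d = os.stat(src), os.stat(dst)
    return s.st_size == d.st_size and int(s.st_mtime) == int(d.st_mtime)


def file_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the whole file in chunks so large files do not fill memory."""
    h = hashlib.sha256()  # stand-in hash, not rsync's own algorithm
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def checksum_same(src: str, dst: str) -> bool:
    """Content test: every byte of both files must be read."""
    return file_digest(src) == file_digest(dst)
```

The content test reads every byte on both sides, which is where the
slowdown comes from.  A file that passes quick_check_same() but
fails checksum_same() is exactly the kind the default run misses.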

----------------------------------------------------------------
So, a request!   Could some of you try adding the line:

checksum: 1

to your dirvish master.conf file, so that dirvish runs this way
over the weekend?  Again, backups may take up to 6 times longer
with that flag set!  

Then, look at the log file for the images.  Are there hundreds of
files changed?  Many are temporary files, log files, spool files,
and can be expected to change.

However, use 'ls -l' and see if some of those files have
unexpectedly old dates, especially files in /usr (which should
not change much).  Use diff to find out if there are differences
between those files in your pre-checksum backup images and the
ones in your active filesystem.  Look at your file update
history.
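
If checking files one at a time with ls and diff gets tedious, the
comparison can be scripted.  The Python sketch below is a rough
helper, not part of dirvish; the function name and the example paths
are hypothetical.  It walks an image tree and reports files whose
size and modification time match the live filesystem but whose
contents differ.  Symlinks and files missing on the live side are
skipped.

```python
import filecmp
import os


def silently_changed(live_root: str, image_root: str):
    """Yield relative paths with matching size/mtime but different bytes."""
    for dirpath, _dirs, files in os.walk(image_root):
        for name in files:
            image_path = os.path.join(dirpath, name)
            rel = os.path.relpath(image_path, image_root)
            live_path = os.path.join(live_root, rel)
            # Only compare regular files that exist on both sides.
            if os.path.islink(image_path) or os.path.islink(live_path):
                continue
            if not os.path.isfile(live_path):
                continue
            a, b = os.stat(image_path), os.stat(live_path)
            same_meta = (a.st_size == b.st_size
                         and int(a.st_mtime) == int(b.st_mtime))
            # shallow=False forces a byte-by-byte comparison.
            if same_meta and not filecmp.cmp(image_path, live_path,
                                             shallow=False):
                yield rel


if __name__ == "__main__":
    # Hypothetical paths; substitute your own image tree and live root.
    for path in silently_changed("/usr", "/backup/vault/image/tree/usr"):
        print(path)
```

Run it against an image made before you turned the checksum option
on, since those are the images that may hold stale copies.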

Let us know what you find out.

If a lot of you see files getting missed, we may have some work
ahead of us!

-----------------------------------------------------------------

BTW ...  There are some clever ways to speed checksumming up using the
output of the osiris file integrity monitor program and the 
upgrade logs, but I will save those ideas for another time.
Backup, file updates, and file integrity are three aspects of
what is really one problem.

Keith

-- 
Keith Lofstrom          keithl at keithl.com         Voice (503)-520-1993
KLIC --- Keith Lofstrom Integrated Circuits --- "Your Ideas in Silicon"
Design Contracting in Bipolar and CMOS - Analog, Digital, and Scan ICs

