[Dirvish] Excessively Long backups on a Postfix Maildir vault with Dovecot as the IMAP server

Damian Cunniff dcunniff_lists at inforefinery.com
Wed Mar 19 14:30:00 UTC 2008

Hey Guys,

I've been working on a dirvish backup system now for several months and
we are just at the tail end of everything.  The system is running really
well and it is also gracefully managing the backup of both Windows and
Linux servers.  Dirvish is a very impressive tool.

I am, however, running into a relatively serious problem with one of my
vaults.  I have a mail server that is running postfix for SMTP and
dovecot for IMAP.  The server has a (software) RAID array of three
10,000 RPM SATA disks that houses the /var/mail partition.  I have
configured dirvish on a separate backup server (still part of the same
LAN) with a vault assigned only to back up this /var/mail partition.  We
opted to set our mail up in Maildir format to leverage dirvish's
capabilities for backing up the mail store.

When we first implemented the system everything seemed to be running
correctly.  The vault took somewhere between 20 minutes and 1 hour to do
its backups.  Gradually we added new users to the new mail system and
this time window increased but then subsided, as would be expected.
Recently, however, the time the backup takes has kept increasing.  It
has reached the point of taking as much as 8-10 hours to complete.

Initially I thought we might be collecting enough mail on a daily basis
to create this situation, however that doesn't really make sense when
you look at how much mail actually comes into the server.  Additionally
I took a look at the log.gz files that are generated and each day we are
backing up around 100 - 300 MB of mail.  This number still seems high to
me but I can live with it.  Where it gets very strange is that the
amount of mail backed up does not seem in any way proportional to the
time the backup of this vault takes.  A couple of days ago it actually backed
up 400MB in 8 hours.  The next day it was 120MB in 10 hours.  I have
other servers in the same network producing the same amount of changed
data and taking 10 - 15 minutes to back it up.
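
To put those two runs in perspective, here is a quick back-of-the-envelope
calculation of the average transfer rate they imply.  This is only a rough
sketch: the sizes and durations are the ones quoted above, the function
name throughput_kb_per_s is just something made up for illustration, and
it assumes 1 MB = 1024 KB.

```python
# Average transfer rate implied by the two backup runs quoted in this post.
# Assumption: 1 MB = 1024 KB; sizes and durations are the approximate
# figures read from the log.gz files, not exact measurements.

def throughput_kb_per_s(megabytes, hours):
    """Return the average rate in KB/s for `megabytes` moved in `hours`."""
    return megabytes * 1024 / (hours * 3600)

runs = [
    ("400 MB in 8 hours", 400, 8),
    ("120 MB in 10 hours", 120, 10),
]

for label, mb, hours in runs:
    print(f"{label}: {throughput_kb_per_s(mb, hours):.1f} KB/s")
```

That works out to roughly 14 KB/s for the first run and about 3 KB/s for
the second, so the run time clearly is not tracking the amount of changed
data.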

I guessed the problem might be postfix and dovecot running while the
backups were occurring, so last night I set the pre-client command to
shut down both services and the post-client command to bring them back
up.  When I got to the office this morning neither service was running.
Upon further examination I realized this was because the backup still
hadn't finished.

I'm stumped.  What could possibly be causing this problem?  Why would an
otherwise seemingly healthy setup be malfunctioning here?  Should I fsck
the RAID array in case that is the source of the trouble?


PS.  If this is a repost I apologize; I'm not sure that my first one
went through.

