[Dirvish] Excessively Long backups on a Postfix Maildir vault with Dovecot as the IMAP server
dcunniff_lists at inforefinery.com
Wed Mar 19 17:48:23 UTC 2008
Hey Paul and all,
Thank you so much for your reply. Let me see if I can answer a couple
of questions and pose a few more.
I am using Linux for both the backup server and the mail server. My
apologies for not clarifying that. The file system is ext3 and the
operating system is Ubuntu 6.06 Dapper Drake. We are using that because
it is 'Long Term Support' and we want to take advantage of that support.
Unfortunately it means that the most current software is often
unavailable. I did upgrade to rsync 2.6.9 (protocol 29) so that I could
preserve the ACLs on both my Windows and Linux boxes. I was able to
find a backport of that version. I'll have to search for a backport of
the 3.0.x version or build one myself. I'd rather have the package though :D
Here is the tail of the log.gz file that was produced during the backup
of this vault last night:
Number of files: 836800
Number of files transferred: 2570
Total file size: 25932451041 bytes
Total transferred file size: 133324583 bytes
Literal data: 31600151 bytes
Matched data: 101725725 bytes
File list size: 42784018
File list generation time: 43.928 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 468751
Total bytes received: 74776511
sent 468751 bytes received 74776511 bytes 191220.49 bytes/sec
total size is 25932451041 speedup is 344.64
At first I thought you were saying indexing the files is the problem,
but when I looked at this output it only took 44 seconds to build that
index. On second thought I think that you're saying the actual
comparison of the files to see if they need to be transferred is what is
taking so long. Is this correct?
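As a rough aid for reading the summary above, here is a minimal Python sketch that parses the log tail and derives a few ratios from it. The helper names (parse_stats, summarize) are made up for illustration. The "implied elapsed" figure rests on an assumption: that the bytes/sec value rsync prints is simply total bytes on the wire divided by the run time of that rsync process. Treat it as an estimate to compare against the wall-clock time of the whole dirvish run, not as a measurement.

```python
import re

# Tail of the rsync summary, copied from the log quoted above.
LOG_TAIL = """\
Number of files: 836800
Number of files transferred: 2570
Total file size: 25932451041 bytes
Total transferred file size: 133324583 bytes
Literal data: 31600151 bytes
Matched data: 101725725 bytes
File list size: 42784018
File list generation time: 43.928 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 468751
Total bytes received: 74776511
sent 468751 bytes received 74776511 bytes 191220.49 bytes/sec
total size is 25932451041 speedup is 344.64
"""


def parse_stats(text):
    """Collect every 'Label: number' line into a dict of floats."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"^([A-Za-z ]+):\s+([\d.]+)", line)
        if match:
            stats[match.group(1)] = float(match.group(2))
    return stats


def summarize(text):
    """Derive a few ratios from the summary (hypothetical helper)."""
    stats = parse_stats(text)
    # Rate as printed by rsync on the 'sent ... received ...' line.
    rate = float(re.search(r"([\d.]+) bytes/sec", text).group(1))
    wire_bytes = stats["Total bytes sent"] + stats["Total bytes received"]
    return {
        "files_total": int(stats["Number of files"]),
        "files_transferred": int(stats["Number of files transferred"]),
        "fraction_transferred": (
            stats["Number of files transferred"] / stats["Number of files"]
        ),
        "wire_bytes": int(wire_bytes),
        # Total size of the tree divided by bytes actually exchanged.
        "speedup": stats["Total file size"] / wire_bytes,
        "file_list_seconds": stats["File list generation time"],
        # ASSUMPTION: printed rate == wire bytes / elapsed seconds.
        "implied_elapsed_seconds": wire_bytes / rate,
    }


if __name__ == "__main__":
    for key, value in summarize(LOG_TAIL).items():
        print(f"{key}: {value}")
```

On these figures the computed speedup agrees with the 344.64 that rsync reports, and well under one percent of the 836800 files were transferred. Under the stated assumption, the rate line implies an elapsed time of a few hundred seconds for the rsync process itself.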
And on the last note... I disabled the footer signatures... that was
simply my laziness during configuration.
As for the footer, that's the default that MailScanner is throwing in.
I actually don't like it at all and haven't had the chance to shut it
off. Now that you've embarrassed me I guess I'll have to take it down ;-)
Paul Slootman wrote:
> On Wed 19 Mar 2008, Damian Cunniff wrote:
>> When we first implemented the system everything seemed to be running
>> correctly. The vault took somewhere between 20 minutes and 1 hour to do
>> its backups. Gradually we added new users to the new mail system and
>> this time window increased but then subsided as would be expected.
>> Recently, however, the amount of time it takes to back up has
>> continued to increase. It has reached the point of taking as much as
>> 8-10 hours to complete.
>> Initially I thought we might be collecting enough mail on a daily basis
>> to create this situation, however that doesn't really make sense when
>> you look at how much mail actually comes into the server. Additionally
> The important thing is how much mail is _stored_ on the system, not how
> much is added every day. A very large number of files will cause a long
> backup time, even though not much is actually transferred.
>> I'm stumped. What could possibly be causing this problem? Why would an
>> otherwise seemingly healthy setup be malfunctioning here? Should I fsck
>> the RAID array in hopes that that may be the trouble?
> fsck is probably useless if there aren't actual errors.
> What's more important is what sort of filesystem you're using.
> Assuming you're using linux (you don't say...):
> reiser is bad at sequentially scanning very large directories, as it's
> optimized for fast access to a random file (i.e. where you know the name
> of the file, so it can use the index to find the file).
> jfs is also not so good at random access.
> I usually use either xfs or ext3 with the dir_index option.
> Most importantly, if you haven't upgraded rsync to 3.0.0 yet, I'd advise
> that, as the incremental transfer protocol can speed things up a lot.
> (You need to upgrade both sides to make use of that protocol.)
>> This message has been scanned for viruses and
>> dangerous content by MailScanner, and is
>> believed to be clean.
> Such a footer is usually worse than useless, as a virus will usually
> claim something similar...
> Paul Slootman