[Dirvish] Weird problems

Matthew Whittaker-Williams m.whittaker-Williams at iu.nl
Mon May 8 06:09:44 UTC 2006


Good morning (CEST),

I've got a weird problem running dirvish from cron.
It runs fine most of the time, but for the last couple of months the
actual backup seems to happen 2 hours after the dirvish-runall command
is run.
This wouldn't be much of a problem by itself, but it will become one
once there are a lot more backups to do, because then they won't all
finish before the end of the morning. Of course, that also depends on
other factors such as bandwidth, CPU and disk drives.
Have any of you seen this behaviour with dirvish before?

For example, I run the following command from cron:

10      2       *       *       *       /home/tools/dirvish-init

dirvish-init:
#!/bin/sh
export PATH=/usr/bin:/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin

/usr/local/sbin/dirvish-expire && /usr/local/sbin/dirvish-runall

exit 0

So this should run at 02:10 each day, and it does.
I have a lot of pre-client commands doing SQL dumps, and it seems that
the first vault actually starts at around 03:40.
So it seems that dirvish-expire is taking quite some time to finish
before the actual backup is run.
Is this behaviour normal with about 50 machines to back up?
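
To see where the time between 02:10 and 03:40 actually goes, each step
could be timed separately. This is only a minimal sketch: the helper
name run_logged and the log path are made up for illustration and are
not part of dirvish.

```shell
#!/bin/sh
# Log file location is an assumption; pick any path root can write to.
LOG=${DIRVISH_INIT_LOG:-/tmp/dirvish-init.log}

# run_logged LABEL COMMAND [ARGS...]
# Hypothetical helper: records start and end timestamps plus the exit
# status of COMMAND in $LOG, then returns that exit status.
run_logged() {
    label=$1
    shift
    echo "$(date '+%Y-%m-%d %H:%M:%S') start $label" >> "$LOG"
    "$@"
    status=$?
    echo "$(date '+%Y-%m-%d %H:%M:%S') end $label (exit $status)" >> "$LOG"
    return "$status"
}
```

In dirvish-init the two calls would then read
"run_logged expire /usr/local/sbin/dirvish-expire" and
"run_logged runall /usr/local/sbin/dirvish-runall", and the log would
show how long dirvish-expire really takes.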

Another strange thing is the following:
May  8 02:10:00 thoth /usr/sbin/cron[42721]: (root) CMD (/home/tools/dirvish-init)

Cron did run, but no backup was made *shrug*. This has actually happened
before, but I didn't manage to reproduce it.
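
One thing that may be worth ruling out (an assumption, not something
confirmed here): the script chains the two commands with &&, so
dirvish-runall is skipped whenever dirvish-expire exits non-zero, and
the final "exit 0" hides that from cron. The sketch below uses stand-in
functions, not the real dirvish commands, to show the effect and one
possible alternative.

```shell
#!/bin/sh
# Stand-ins for the real commands. expire_step simulates a failing
# dirvish-expire; whether the real one ever fails is an assumption.
expire_step() { return 1; }
runall_step() { echo "runall ran"; }

# Same shape as the original script: runall is skipped when expire
# fails, and the failure is masked by the unconditional success status.
chained() {
    expire_step && runall_step
    return 0
}

# Alternative: run both steps regardless and return the last non-zero
# exit status, so cron (or a wrapper) can notice the failure.
independent() {
    rc=0
    expire_step || rc=$?
    runall_step || rc=$?
    return "$rc"
}

chained_rc=0
chained_out=$(chained) || chained_rc=$?
independent_rc=0
independent_out=$(independent) || independent_rc=$?

echo "chained:     out='$chained_out' rc=$chained_rc"
echo "independent: out='$independent_out' rc=$independent_rc"
```

With the chained form nothing is printed by the second step and the
status is still 0, which would look exactly like "cron ran, but no
backup was made".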

If I take a close look at the vault of the first image to be made, there
is no new image:


2 drwx------   3 root  tunnel  512 Mar 30 03:47 2006-03-30/
2 drwx------   3 root  tunnel  512 Apr 23 03:08 2006-04-23/
2 drwx------   3 root  tunnel  512 Apr 30 03:06 2006-04-30/
2 drwx------   3 root  tunnel  512 May  1 08:57 2006-05-01/
2 drwx------   3 root  tunnel  512 May  4 05:03 2006-05-04/
2 drwx------   3 root  tunnel  512 May  5 04:24 2006-05-05/
2 drwx------   3 root  tunnel  512 May  6 03:35 2006-05-06/
2 drwx------   3 root  tunnel  512 May  7 03:46 2006-05-07/
2 drwxr-xr-x   2 root  tunnel  512 Mar  5 01:32 dirvish/

Today is 2006-05-08, so there should be an image, but it hasn't been
made :(
I didn't get any output from the dirvish-runall command either, so I am
kind of stuck on where to look.
When I looked at the running processes I didn't see any dirvish
commands, so I think something is broken somewhere.
So again the question is: have any of you seen this behaviour before?
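
The manual check above (is there an image directory for today?) could
be scripted so that a missing image gets noticed right away. A minimal
sketch, where check_image is a hypothetical helper and the vault
directory passed to it has to be the real one:

```shell
#!/bin/sh
# check_image VAULT_DIR [IMAGE_NAME]
# Hypothetical post-run check: succeeds if VAULT_DIR contains a
# directory named IMAGE_NAME. IMAGE_NAME defaults to today's date in
# the image-default format %Y-%m-%d used in the configs below.
check_image() {
    vault=$1
    image=${2:-$(date +%Y-%m-%d)}
    if [ -d "$vault/$image" ]; then
        echo "OK: $vault/$image exists"
    else
        # stderr output from a cron job is normally mailed to the owner
        echo "MISSING: no image $image in $vault" >&2
        return 1
    fi
}
```

Called at the end of dirvish-init with the vault directory as its
argument, this prints a MISSING line on days like today instead of
failing silently.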

Configs:

master.conf:

rsh: ssh -C
image-default: %Y-%m-%d
# Compress log files
log: gzip
index: gzip
# Follow devices
xdev: 1

# Security
image-perm: 700
meta-perm: 600

exclude:
        /dev
        /proc
        /compat/linux/proc
        /etc/mtab
        /etc/fstab
        /usr/obj
        lost+found/
        /tmp

expire-default: +7 days

# Banks

bank:
        # Customer banks Volume 1
         /vol01/backup/customers/xxxx

# Vaults

Runall:        
        # Customer vaults
        xxxx-all       03:00
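
A note on the 03:00 in the Runall entry, based on a reading of
dirvish.conf(5) that should be double-checked: the second field is
passed on as the image-time, and a time of day without a date is
described there as being "forced into the past". If that reading is
right, a run that starts before 03:00 would name its image after the
previous day. The sketch below only illustrates that assumed rule with
a hypothetical helper; it does not call dirvish and is not confirmed as
the cause of the missing image.

```shell
#!/bin/sh
# image_day START IMAGE_TIME   (both HH:MM, 24-hour clock)
# Prints which day the image would be named after, under the ASSUMED
# rule that an image time still in the future at START is moved back
# one day.
image_day() {
    start=$(echo "$1" | tr -d ':')
    image=$(echo "$2" | tr -d ':')
    if [ "$image" -gt "$start" ]; then
        echo yesterday
    else
        echo today
    fi
}

image_day 02:10 03:00   # run starts before the configured image time
image_day 03:40 03:00   # run starts after it
```

Under that assumption, a run starting at 02:10 would aim at yesterday's
image name, while one starting at 03:40 would aim at today's.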


Vault default.conf

client: toor at xxxxx
tree: /
xdev: 0
index: gzip
image-default: %Y-%m-%d
exclude:
        /proc/
        /usr/compat/linux/proc
        /dev
        /mnt/
        *.iso
        /var/tmp/
        /tmp/
        *.log
        *.socket
        *.pid
        *.mpeg
        *.mp3
        */jails/*
pre-client: /home/tools/mysqlbackup.pl
post-server: /usr/local/sbin/dirvish-post
expire-rule:
        wday { sun } +15 days # Keep snapshots created every Sunday for 15 days
        mday { 30 } +2 months # Keep snapshots created every 30th for 2 months


Well, I hope to find and fix this problem once and for all.

Kind regards

Matthew Whittaker-Williams



