[Dirvish] dirvish not making hard links but copying files, consuming much space

Team Bloombox dirvish at bloombox.de
Thu Jun 15 09:25:08 UTC 2006


Hi Eric,


> -----Original Message-----
> From: dirvish-bounces at dirvish.org 
> [mailto:dirvish-bounces at dirvish.org] On Behalf Of Eric Mountain
> Sent: Sunday, June 11, 2006 7:04 PM
> To: Dirvish user and developer mailing list
> Subject: Re: [Dirvish] dirvish not making hard links but 
> copying files,consuming much space
> 
> On Sunday 11 June 2006 16:55, Team Bloombox spake thus:
> > Hi Eric,
> >
> > The script runs as root. It's a Debian package installation. Debian 
> > puts the following cron script into cron.d:
> >
> > # run every night
> > 4 23 * * *     root     /etc/dirvish/dirvish-cronjob
> >
> > I just found out another thing which needs to be mentioned, I guess.
> > On another system, which also runs dirvish and which also backs up
> > the nemesis41 webserver (the one which is causing the problem), the
> > backup runs just fine. It only consumes about 100MB per run, as
> > promised by dirvish. And as far as I can see, no files are missing and
> > only changed files are backed up ...
> > And I think the issue is specific to that one machine .... :-(
> 
> Looks like it may be a good idea to make dirvish run rsync in 
> verbose mode on both machines and compare output for 2 runs 
> that should be similar.  A tedious task unfortunately...

I will do that this weekend ... no time left before Saturday ... ;-)

 
> I would still like to know the following:  What is the output 
> of stat for /etc/apache2/apache2.conf?

nemesis41:~# stat /etc/apache2/apache2.conf
  File: "/etc/apache2/apache2.conf"
  Size: 12895           Blocks: 32         IO Block: 4096   regular file
Device: 302h/770d       Inode: 2686979     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2006-06-13 09:38:24.000000000 +0200
Modify: 2005-08-08 10:47:00.000000000 +0200
Change: 2006-01-29 12:09:34.000000000 +0100
 
> In fact, I would also like to know the output of stat for 
> backups of the above file on the backup server that works OK 
> too please.

And here they are (most recent first):

nemesis40:/# stat /backup-dirvish/n41/20060614-2318/tree/etc/apache2/apache2.conf
  File: "/backup-dirvish/n41/20060614-2318/tree/etc/apache2/apache2.conf"
  Size: 12895           Blocks: 32         IO Block: 4096   regular file
Device: 302h/770d       Inode: 8456438     Links: 20
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2006-05-27 15:35:09.000000000 +0200
Modify: 2005-08-08 10:47:00.000000000 +0200
Change: 2006-06-14 23:20:05.000000000 +0200

nemesis40:/# stat /backup-dirvish/n41/20060610-2318/tree/etc/apache2/apache2.conf
  File: "/backup-dirvish/n41/20060610-2318/tree/etc/apache2/apache2.conf"
  Size: 12895           Blocks: 32         IO Block: 4096   regular file
Device: 302h/770d       Inode: 8456438     Links: 20
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2006-05-27 15:35:09.000000000 +0200
Modify: 2005-08-08 10:47:00.000000000 +0200
Change: 2006-06-14 23:20:05.000000000 +0200

In fact, they are all the same, as the file hasn't changed lately:

nemesis40:/# dirvish-locate n41 apache2.conf
2 matches in 20 images
/etc/apache2/apache2.conf
    Aug  8  2005 20060614-2318, 20060613-2318, 20060612-2326, 20060611-2317
                 20060610-2318, 20060609-2318, 20060608-2317, 20060607-2315
                 20060606-2316, 20060605-2317, 20060604-2319, 20060603-2321
                 20060602-2318, 20060601-2318, 20060531-2319, 20060530-2316
                 20060529-2317, 20060528-2316, 20060527-2312, 20060527-1533
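
For anyone who wants to repeat this check on more than one file, the stat
comparison above can be scripted. The following is only a minimal Python
sketch and not part of dirvish; the function names is_hard_linked and
describe are made up for illustration. It treats two paths as the same file
when device and inode number match, which is what the identical Inode value
and "Links: 20" in the stat output indicate.

```python
import os


def is_hard_linked(path_a, path_b):
    """Return True if both paths point at the same inode on the same device."""
    st_a = os.lstat(path_a)
    st_b = os.lstat(path_b)
    return (st_a.st_dev, st_a.st_ino) == (st_b.st_dev, st_b.st_ino)


def describe(path):
    """Collect the fields of interest from stat: inode, link count, owner."""
    st = os.lstat(path)
    return {
        "inode": st.st_ino,
        "links": st.st_nlink,
        "uid": st.st_uid,
        "gid": st.st_gid,
    }
```

Calling is_hard_linked() on the apache2.conf copies in two images of a
healthy vault should return True as long as the file has not changed between
the runs. On a vault with the problem described in this thread it would
presumably return False, and describe() would show a link count of 1.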


> My current guess is that the file is owned by root on the 
> master server, and that backups of the file are owned by 
> tapeback on the server that doesn't work which is causing 
> rsync to consider it can't hardlink the files because the 
> owner doesn't match.  Presumably, the file ownership on the 
> backup server that works is correct which is why the 
> hardlinking works as expected.  Just a guess...
Hmm. I must admit, I hadn't seen this at all. Or rather, I hadn't noticed
it before. In fact, it doesn't make any sense to change the owner to
tapeback.

I think I'm starting to realize what's happening. On server nemesis41
there is an additional partition where all backups are stored (I prefer
it this way, as the server won't stop running if the backups consume too
much space). Besides dirvish, there is another script running which backs up
all log files, MySQL dumps and so on. Somewhere buried in the back of my
brain I remember that someone who looked after these servers before me
ran all the backups as root but then changed the ownership of the backed-up
files to tapeback:root, so that a "tapeback" user could fetch those files
from that machine without needing the root account.
Then dirvish came along and I simply set up the vaults below the "normal"
backup path. I changed the previous backup script to back up only log files,
MySQL dumps and so on, but forgot that this script was still changing
ownership to tapeback:root.
And I think you're right in guessing that this ownership issue makes
dirvish go "mad".
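
The theory can be checked before wiping anything. As far as I understand,
rsync's --link-dest only hard links a file against the previous image when
all preserved attributes match, and that includes the owner when ownership
is preserved. Below is a rough Python sketch that lists files whose owner in
an image differs from the owner on the source. It assumes source and image
are on the same machine (as on nemesis41) and compares numeric uid/gid only;
the names owner_differs and ownership_drift are hypothetical.

```python
import os


def owner_differs(st_a, st_b):
    """Compare numeric uid and gid of two stat results."""
    return (st_a.st_uid, st_a.st_gid) != (st_b.st_uid, st_b.st_gid)


def ownership_drift(source_root, image_tree):
    """Yield relative paths whose owner in image_tree differs from source_root."""
    for dirpath, _dirnames, filenames in os.walk(image_tree):
        for name in filenames:
            image_path = os.path.join(dirpath, name)
            rel = os.path.relpath(image_path, image_tree)
            source_path = os.path.join(source_root, rel)
            try:
                source_stat = os.lstat(source_path)
            except FileNotFoundError:
                # File was removed on the source after the image was taken.
                continue
            if owner_differs(source_stat, os.lstat(image_path)):
                yield rel
```

If something like ownership_drift("/", "<vault>/<image>/tree") returns a
large part of the tree, the chown to tapeback:root is the likely culprit.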

I will test this by removing all backups, re-initializing the vaults and
then seeing what happens. I should be able to tell you whether that's the
problem, as the excessive space consumption mostly showed up after three
runs.
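
To judge the result after those runs without relying on df alone, one could
estimate how much data an image adds on top of the previous one. This is
only a sketch under assumptions: it counts the apparent size (st_size) of
regular files whose inode is not shared with the previous image, ignores
directories and filesystem overhead, and the names inode_sizes and new_bytes
are made up for illustration.

```python
import os
import stat


def inode_sizes(tree_root):
    """Map (device, inode) to size in bytes for each regular file in the tree."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(tree_root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if stat.S_ISREG(st.st_mode):
                # Several names for one inode are counted only once.
                sizes[(st.st_dev, st.st_ino)] = st.st_size
    return sizes


def new_bytes(previous_tree, current_tree):
    """Sum the sizes of files in current_tree not hard linked into previous_tree."""
    old = inode_sizes(previous_tree)
    current = inode_sizes(current_tree)
    return sum(size for key, size in current.items() if key not in old)
```

With working hard links, new_bytes() for two consecutive images should stay
near the size of the files that really changed. If it comes out close to the
size of the whole tree, the files were copied rather than linked.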

Best regards,
Jens


