Date:      Wed, 03 Dec 2008 14:59:03 +0200
From:      Danny Braniss <danny@cs.huji.ac.il>
To:        David Wolfskill <david@catwhisker.org>
Cc:        hackers@freebsd.org
Subject:   Re: NFS (& amd?) dysfunction descending a hierarchy 
Message-ID:  <E1L7rJn-0004KA-OQ@kabab.cs.huji.ac.il>
In-Reply-To: <20081203124507.GE96383@bunrab.catwhisker.org> 
References:  <20081203001538.GC96383@bunrab.catwhisker.org>  <E1L7qiW-0003np-NF@kabab.cs.huji.ac.il> <20081203124507.GE96383@bunrab.catwhisker.org>

> On Wed, Dec 03, 2008 at 02:20:32PM +0200, Danny Braniss wrote:
> > ...
> > i'll try to check it here soon, but in the meantime, could you try the same
> > but mounting directly, not via amd, to remove one item from the equation?
> > (I don't know how much amd is involved here, but if you are running on a
> > 64bit host, amd could be swapped out, in which case it tends to really screw
> > things up, which is not your case, but ...)
> 
> Sorry; I should have mentioned that the NFS client was running
> RELENG_7_1 as of Monday morning, i386 arch.  The amd.conf file specifies
> "plock" for amd(8).
> 
> Note that merely telling amd(8) to kick the interval of attempted
> unmounts from 2 minutes to 12 hours appears to avoid the observed
> symptoms, so I'm fairly confident that bypassing amd(8) altogether would
> do so as well.
> 
> In looking at the output from ktrace against amd(8), I recall having
> seen that shortly before an observed failure, the (master) amd
> process forks a child to attempt the unmount; the child issues an
> unmount, the return for which is EBUSY (IIRC -- I'm not in a good
> position to check just at the moment), so the child terminates with an
> "interrupted system call".
> 
> I'd have thought that since the attempted unmount failed, it wouldn't
> make any difference, but it's right around that point that rm(1) is told
> that a directory entry it found earlier doesn't exist, which rather
> snowballs into the previously-described symptoms.

so it does point to amd - or something innocent it does - which triggers the
error.
btw, there are some patches (5 I think) that try to fix some of amd's problems.
I've installed them, and things are quiet/ok most of the time, but I get a
glitch once in a while. would love to iron those out though.
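
for reference, a minimal sketch of what the workaround David describes might
look like in amd.conf, assuming the stock [global] options plock and
dismount_interval (in seconds) are the ones involved; the values are
illustrative and not copied from his config:

```
[global]
# assumption: keep amd locked in memory, as David says his amd.conf does
plock = yes
# assumption: 12 hours (43200 s) instead of the default 120 s (2 minutes)
dismount_interval = 43200
```

this only makes the unmount attempts rarer; it does not address whatever goes
wrong around a failed unmount.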

cheers,
	danny




