Date:      Wed, 10 Dec 2008 19:06:20 +0200
From:      Kostik Belousov <kostikbel@gmail.com>
To:        David Wolfskill <david@catwhisker.org>, Rick Macklem <rmacklem@uoguelph.ca>, hackers@freebsd.org, current@freebsd.org
Subject:   Re: NFS (& amd?) dysfunction descending a hierarchy
Message-ID:  <20081210170620.GS2038@deviant.kiev.zoral.com.ua>
In-Reply-To: <20081210165022.GJ60731@albert.catwhisker.org>
References:  <20081203001538.GC96383@bunrab.catwhisker.org> <20081209190110.GW60731@albert.catwhisker.org> <Pine.GSO.4.63.0812101124430.24743@muncher.cs.uoguelph.ca> <20081210165022.GJ60731@albert.catwhisker.org>


On Wed, Dec 10, 2008 at 08:50:22AM -0800, David Wolfskill wrote:
> On Wed, Dec 10, 2008 at 11:30:26AM -0500, Rick Macklem wrote:
> >...
> > The different behaviour for -CURRENT could be the newer RPC layer that
> > was recently introduced, but that doesn't explain the basic problem.
>
> OK.
>
> > All I can think of is to ask the obvious question. "Are you using
> > interruptible or soft mounts?" If so, switch to hard mounts and see
> > if the problem goes away. (imho, neither interruptible nor soft mounts
> > are a good idea. You can use a forced dismount if there is a crashed
> > NFS server that isn't coming back anytime soon.)
>
> From examination of /etc/amd* -- I don't see how to get mount(8) or
> amq(8) to report it -- it appears that we are using interruptible
> mounts, as we always have.
>
> The point is that the behavior has changed in an unexpected way.  And
> I'm not so sure that the use of a forced dismount is generally
> available, as it would require logging in to the NFS client first, which
> may be difficult if the NFS server hosting non-root home directories is
> failing to respond and direct root login via ssh(1) is not permitted (as
> is the default).
>
> > If you are getting this with hard mounts, I'm afraid I have no idea
> > what the problem is, rick.
>
> What concerns me is that even if the attempted unmount gets EBUSY, the
> user-level process descending the directory hierarchy is getting ENOENT
> trying to issue fstatfs() against an open file descriptor.
>
> I'm having trouble figuring out any way that makes any sense.

Basically, the problem is that NFS uses shared lookup, and this allows
for a bug where several negative namecache entries are created for the
same non-existent node. When this node is later created, only the first
negative namecache entry is removed. Then, for some reason, the vnode is
reclaimed; amd's attempted unmount is a good reason for the vnode to be
reclaimed.

Now you have an existing path and a leftover negative cache entry for
it, which is how a lookup of that path can return ENOENT. This was
reported by Peter Holm first; I listed the relevant revisions that
should fix this in my previous mail.
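
To make the sequence above concrete, here is a toy model of it in
Python. It is only a sketch under stated assumptions, not the FreeBSD
namecache code: the class ToyNameCache, its method names, the
head-insertion order and the "vnode-1" label are all hypothetical and
exist only for illustration.

```python
class ToyNameCache:
    """Toy model only; NOT the FreeBSD namecache implementation."""

    def __init__(self):
        # Each entry is (name, vnode); vnode is None for a negative entry.
        # Assumption: new entries go to the head and lookup returns the
        # first match it finds.
        self.entries = []

    def lookup(self, name):
        for entry_name, vnode in self.entries:
            if entry_name == name:
                return ("hit", vnode)
        return ("miss", None)

    def enter(self, name, vnode):
        self.entries.insert(0, (name, vnode))

    def purge_first_negative(self, name):
        # Models the bug: only ONE negative entry is removed on create.
        for i, (entry_name, vnode) in enumerate(self.entries):
            if entry_name == name and vnode is None:
                del self.entries[i]
                return

    def purge_vnode(self, vnode):
        # Vnode reclaim drops the positive entries that point at the vnode.
        self.entries = [(n, v) for n, v in self.entries if v != vnode]


def resolve(cache, name):
    """Return the cached vnode, or "ENOENT" on a negative hit or a miss."""
    status, vnode = cache.lookup(name)
    if status == "hit" and vnode is not None:
        return vnode
    return "ENOENT"


def demo():
    cache = ToyNameCache()

    # Two shared lookups race: both miss before either inserts anything,
    # so both add a negative entry for the same name.
    first = cache.lookup("f")
    second = cache.lookup("f")
    assert first[0] == "miss" and second[0] == "miss"
    cache.enter("f", None)
    cache.enter("f", None)

    # The node is created: one negative entry is removed, one is left.
    cache.purge_first_negative("f")
    cache.enter("f", "vnode-1")
    before_reclaim = resolve(cache, "f")

    # The vnode is reclaimed (for example after an attempted unmount).
    cache.purge_vnode("vnode-1")
    after_reclaim = resolve(cache, "f")
    return before_reclaim, after_reclaim
```

In this model the path resolves while the positive entry is cached, and
the leftover negative entry takes over once the vnode has been
reclaimed, even though the node still exists.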
