Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 27 Oct 2010 18:34:02 +0300
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Benjamin Kaduk <kaduk@MIT.EDU>
Cc:        freebsd-hackers@freebsd.org, Sergey Kandaurov <pluknet@gmail.com>, Lars Hartmann <lars@chaotika.org>
Subject:   Re: [PATCH] Fix 'implicit declaration' warning and update vgone(9)
Message-ID:  <20101027153402.GS2392@deviant.kiev.zoral.com.ua>
In-Reply-To: <alpine.GSO.1.10.1010271043550.19200@multics.mit.edu>
References:  <1288160610.4280.18.camel@apollon> <AANLkTimCN%2BY4OsJRNvdmXidLu8XPHz5RydAhe0phvgnH@mail.gmail.com> <201010270912.47076.jhb@freebsd.org> <alpine.GSO.1.10.1010271043550.19200@multics.mit.edu>

next in thread | previous in thread | raw e-mail | index | archive | help

--37ZONDoqvyhZS3Cx
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Oct 27, 2010 at 10:59:56AM -0400, Benjamin Kaduk wrote:
> On Wed, 27 Oct 2010, John Baldwin wrote:
>=20
> >On Wednesday, October 27, 2010 7:33:13 am Sergey Kandaurov wrote:
> >>On 27 October 2010 10:23, Lars Hartmann <lars@chaotika.org> wrote:
> >>>The vgonel function isnt declarated in any header, the vgonel prototype
> >>>in vgone(9) isnt correct - found by Ben Kaduk <kaduk@mit.edu>
> >>
> >>Hi.
> >>
> >>I'm afraid it's just an overlooked man page after many VFS changes in 5=
.x.
> >>As vgonel() is a static (i.e. private and not visible from outside)=20
> >>function
> >>IMO it should be removed from vgone(9) man page.
> >
> >Agreed.  It certainly should not be added to vnode.h.  I'm curious how t=
he
> >reporter is getting a warning since there is a static prototype for=20
> >vgonel()
> >in vfs_subr.c.
>=20
> It's for a third-party kernel module, for the OpenAFS filesystem (which=
=20
> has been unloved for FreeBSD since the 4.X days).  The AFS code is=20
> currently using unlocked accesses to v_usecount (which, unsurprisingly,=
=20
> led to a race condition that caused an invariant check to fail), and I wa=
s=20
> going through and adding the interlock around them.  At the place that I=
=20
> suspect to be the main cause of this race [1], the usecount was checked t=
o=20
> be nonpositive along with a couple other conditions, and a little later=
=20
> vgone() was called.  Holding the interlock across both of these calls (an=
d=20
> therefore using vgonel()) seems to have closed the race condition I was=
=20
> seeing.  (Other checks of v_usecount were changed to grab the interlock,=
=20
> but drop it before doing anything else.)
> However, looking at the commit message for vfs_subr.c:1.631, I guess this=
=20
> is not the locking strategy I'm supposed to be using?
>=20
> I saw a warning of implicit declaration when compiling the kernel module,=
=20
> but the kernel linker was happy to load it.  I forget whether it matters=
=20
> that vgonel is only declared static at its declaration but not its=20
> definition.
>=20
> -Ben Kaduk
>=20
>=20
> [1] The old (racy) function is osi_TryEvictVCache, here:
> http://git.openafs.org/?p=3Dopenafs.git;a=3Dblob;f=3Dsrc/afs/FBSD/osi_vca=
che.c;h=3Dc2060c74f0155a610d2ea94f3c7f508e8ca4373a;hb=3DHEAD
The function looks very strange for much more serious reasons. Why do you
try to manage the vnode revocation in the filesystem module at all ?
VFS will (assumedly in a right way) revoke and destroy non-referenced
vnodes when needed.

Anyway, you need to hold vnode lock before checking for the vnode refcounte=
r.
See the vfs_subr.c:vlrureclaim() for the correct dance with vholdl()/vn_loc=
k()
sequence that does the revocation I mentioned in the race-free way.

--37ZONDoqvyhZS3Cx
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (FreeBSD)

iEYEARECAAYFAkzIRmoACgkQC3+MBN1Mb4hBfgCgj/TfCiP37gkAWduqkwKUcVUu
znIAnAjXynCj/fREjD+isFv8JUSwMxko
=dK3o
-----END PGP SIGNATURE-----

--37ZONDoqvyhZS3Cx--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20101027153402.GS2392>