Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 25 Sep 2013 22:40:21 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Patrick Lamaiziere <patfbsd@davenulle.org>, freebsd-stable@freebsd.org
Subject:   Re: Possible kqueue related issue on STABLE/RC.
Message-ID:  <20130925194021.GB41229@kib.kiev.ua>
In-Reply-To: <20130925161954.GO14220@funkthat.com>
References:  <20130920151705.33aae120@mr129166> <20130923153708.45c3be3d@mr129166> <20130923203141.GV41229@kib.kiev.ua> <20130924094427.0f4b902a@mr129166> <20130924082909.GH41229@kib.kiev.ua> <20130924114738.60c700c9@mr129166> <20130924121434.GI41229@kib.kiev.ua> <20130924174517.GB14220@funkthat.com> <20130924212127.GQ41229@kib.kiev.ua> <20130925161954.GO14220@funkthat.com>

next in thread | previous in thread | raw e-mail | index | archive | help

--Lt3+jajHwVbO5afw
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Sep 25, 2013 at 09:19:54AM -0700, John-Mark Gurney wrote:
> Konstantin Belousov wrote this message on Wed, Sep 25, 2013 at 00:21 +030=
0:
> > On Tue, Sep 24, 2013 at 10:45:17AM -0700, John-Mark Gurney wrote:
> > > I'd like to understand why you think protecting these functions w/
> > > the _DETACHED check is correct...  In kern_event.c, all calls to
> > > f_detach are followed by knote_drop which will ensure that the knote
> > > is removed and free, so no more f_event calls will be called on that
> > > knote..
> >=20
> > My current belief is that what happens is a glitch in the
> > kqueue_register(). After a new knote is created and attached, the kq
> > lock is dropped and then f_event() is called. If the vnode is reclaimed
> > or possible freed meantime, f_event() seems to dereference freed memory,
> > since kn_hook points to freed vnode.
>=20
> Well, if that happens, then the vnode isn't properly clearing up the
> knote before it gets reclaimed...  It is the vnode's responsibility to
> make sure any knotes that are associated w/ it get cleaned up properly..
See below.

>=20
> > The issue as I see it is that vnode lifecycle is detached from the knote
> > lifecycle.  Might be, only the second patch, which acquires a hold refe=
rence
> > on the vnode for each knote, is really needed.  But before going into a=
ny
> > conclusions, I want to see the testing results.
>=20
> The vnode lifecycle can't/shouldn't be detached from the knote lifecycle
> since the knote contains a pointer to the vnode...  There is the function
> knlist_clear that can be used to clean up knotes when the object goes
> away..
This is done from the vdropl() (null hold count) -> destroy_vpollinfo().
But this is too late, IMO. vdropl() is only executing with the vnode
interlock locked, and knote lock is vnode lock.  This way, you might
get far enough into vdropl in other thread, while trying to operate on
a vnode with zero hold count in some kqueue code path.

We do not drain the vnode lock holders when destroying vnode, because
VFS interface require that anybody locking the vnode own a hold reference
on it.  My short patch should fix exactly this issue, hopefully we will see.

>=20
> I was looking at the code, is there a good reason why you do
> VI_LOCK/VI_UNLOCK to protect the knote fields instead of getting it in
> the vfs_knllock/vfs_knlunlock functions?  Because kq code will modify
> the knote fields w/ only running the vfs_knllock/vfs_knlunlock functions,
> so either the VI_LOCK/VI_UNLOCK are unnecessary, or should be moved to
> vfs_knllock/vfs_knlunlock...

vfs_knllock() is fine. The problematic case if the
VOP_{PRE,POST}->VFS_KNOTE->VN_KNOTE->KNOTE calls from the VOPs. If you
look at the vfs_knl_assert_locked(), you would note that the function
only asserts that vnode is locked, not that it is locked exclusively.
This is because some filesystems started require from VFS to do e.g.
VOP_WRITE() with the vnode only shared-locked, and KNOTE() is called
with shared-locked vnode lock.

The vfs_knllock() obtain the exclusive lock on the vnode, so kqueue
callers are fine. Taking vnode interlock inside the filters provides
enough exclusion for the VOP callers.

--Lt3+jajHwVbO5afw
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.21 (FreeBSD)

iQIcBAEBAgAGBQJSQzwkAAoJEJDCuSvBvK1BDekP/RgJ2MlujaNwLRt65C26SHpn
zo6uV1spTzobO4gxD8CQbFiwr0etmxET5YyixmIaI84m4X3qd+PWlTdcLFvshqSu
QxAzuQddPingKEfVrp917NFmMloz7LAIdE72WtUG5n4Un//QP0o4xQRpVOcSLQkr
8ciahh400WTyc5uj6dygqndl9/DdbkA2TE9HT5RCHZc2av3bkPHmw+HvymqeHejC
dspDcwgjDiiPkjEWlXmSiQV2aZz/g0gcDKFb/0t3tZBR56DQOzScdrsU92GBnz5K
IHt1gBYGIXSnN/6im9N+dDYyDdDkJ8Ac2trcWPcpZNdbevC3V1Pffem6SFThnkvt
+b+Gq/7LEnQhDlVgn1LsTMmu/RrCJrVacvJukr5t++F3UpFv+XFft9QO4v/SUxPP
+3cm7qQAyV2gQHkbQpbkk+t8IB/L6gSP41OeqSrLPWIxFa0qRkJnBCG5T5RgOwbn
InO0alLwFTnqqc8tmnRxn0hMYsS/OhIknp6x3/s14zjyUcHalhzRBlaiZbsEVyW2
srxTB7rggkPzowzOOrXwhCirsXwDZGE2eGjEcdt6bO/S7cn6k9+OPrzMiV+tffKT
JL9j61YGsMwKJR5fZz3Sv9CLJq99aPSEIJV0zWX2crPP0fjicKKm3frCt1I9l1Lw
b+rtvdTDtFH8GmdZRAcd
=cfCM
-----END PGP SIGNATURE-----

--Lt3+jajHwVbO5afw--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20130925194021.GB41229>