Date: Wed, 15 Jun 2016 13:50:00 +0200
From: Peter Holm <peter@holm.cc>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: Eric Badger <eric@badgerio.us>, FreeBSD Current <freebsd-current@freebsd.org>
Subject: Re: Kqueue races causing crashes
Message-ID: <20160615115000.GA23198@x2.osted.lan>
In-Reply-To: <20160615081143.GS38613@kib.kiev.ua>
References: <34035bf6-8b3c-d15c-765b-94bcc919ea2e@badgerio.us> <20160615081143.GS38613@kib.kiev.ua>
On Wed, Jun 15, 2016 at 11:11:43AM +0300, Konstantin Belousov wrote:
> On Tue, Jun 14, 2016 at 10:26:14PM -0500, Eric Badger wrote:
> > I believe they all have more or less the same cause. The crashes occur
> > because we acquire a knlist lock via the KN_LIST_LOCK macro, but when we
> > call KN_LIST_UNLOCK, the knote's knlist reference (kn->kn_knlist) has
> > been cleared by another thread. Thus we are unable to unlock the
> > previously acquired lock and hold it until something causes us to crash
> > (such as the witness code noticing that we're returning to userland with
> > the lock still held).
> ...
> > I believe there's also a small window where the KN_LIST_LOCK macro
> > checks kn->kn_knlist and finds it to be non-NULL, but by the time it
> > actually dereferences it, it has become NULL. This would produce the
> > "page fault while in kernel mode" crash.
> >
> > If someone familiar with this code sees an obvious fix, I'll be happy to
> > test it. Otherwise, I'd appreciate any advice on fixing this. My first
> > thought is that a "struct knote" ought to have its own mutex for
> > controlling access to the flag fields and ideally the "kn_knlist" field.
> > I.e., you would first acquire a knote's lock and then the knlist lock,
> > thus ensuring that no one could clear the kn_knlist variable while you
> > hold the knlist lock. The knlist lock, however, usually comes from
> > whichever event-producing entity the knote tracks, so getting lock
> > ordering right between the per-knote mutex and this other lock seems
> > potentially hard. (Sometimes we call into functions in kern_event.c with
> > the knlist lock already held, having been acquired in code outside of
> > kern_event.c. Consider, for example, calling KNOTE_LOCKED from
> > kern_exit.c; the PROC_LOCK macro has already been used to acquire the
> > process lock, also serving as the knlist lock.)
> This sounds like a good and correct analysis. I tried your test program
> for around an hour on an 8-thread machine, but was not able to trigger
> the issue. Maybe Peter will have better luck reproducing it. Still, I
> think the problem is there.
>
> IMO we should simply avoid clearing kn_knlist in knlist_remove(). The
> member is only used to get the locking function pointers; otherwise the
> code relies on the KN_DETACHED flag to detect the on-knlist condition.
> See the patch below.
>
> > Apropos of the knlist lock and its provenance: why is a lock from the
> > event-producing entity used to control access to the knlist and knote?
> > Is it generally desirable to, for example, hold the process lock while
> > operating on a knlist attached to that process? It's not obvious to me
> > that this is required or even desirable. This might suggest that a
> > knlist should have its own lock rather than using a lock from the
> > event-producing entity, which might make addressing this problem more
> > straightforward.
>
> Consider the purpose of knlist. It serves as a container for all knotes
> registered on a given subsystem object (all knotes of a socket, process,
> etc.) which must be fired on an event. See the knote() code. The
> consequence is that the subsystem which fires knote() typically already
> holds a lock protecting its own state. As a result, it is natural to
> protect the list of knotes to activate on a subsystem event with the
> subsystem lock.
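For readers without the source handy, the pattern being discussed looks roughly
like this. It is a paraphrase of the KN_LIST_LOCK/KN_LIST_UNLOCK macros in
sys/kern/kern_event.c as I remember them, not the verbatim source; the
kl_lock/kl_unlock/kl_lockarg indirection is how a knlist carries its owner's
locking functions:

/*
 * Rough sketch, not verbatim kernel source.  The knlist stores the owning
 * subsystem's lock callbacks; both macros reach them through kn->kn_knlist.
 */
#define KN_LIST_LOCK(kn) do {						\
	if ((kn)->kn_knlist != NULL)					\
		(kn)->kn_knlist->kl_lock((kn)->kn_knlist->kl_lockarg);	\
} while (0)

#define KN_LIST_UNLOCK(kn) do {						\
	if ((kn)->kn_knlist != NULL)					\
		(kn)->kn_knlist->kl_unlock((kn)->kn_knlist->kl_lockarg); \
} while (0)

/*
 * The race: thread A passes the NULL check in KN_LIST_LOCK and takes the
 * subsystem lock; thread B then clears kn->kn_knlist (e.g. via
 * knlist_remove()); thread A's KN_LIST_UNLOCK now sees NULL, silently skips
 * the unlock, and returns to userland with the lock held, which is what
 * witness complains about.  If B instead clears the pointer between A's NULL
 * check and the dereference, A takes a NULL-pointer page fault.
 */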
>
> diff --git a/sys/kern/kern_event.c b/sys/kern/kern_event.c
> index 0614903..3f45dca 100644
> --- a/sys/kern/kern_event.c

There is not much gdb info here; I'll try to rebuild kgdb.
https://people.freebsd.org/~pho/stress/log/kostik900.txt

The number of CPUs seems important to this test. Four works for me.

- Peter
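To make the suggested direction concrete, here is a minimal sketch of what
"don't clear kn_knlist on removal" could look like. It is only an illustration
under the assumptions stated in its comments, not the patch quoted above;
kn_knlist and KN_DETACHED are the field and flag the thread refers to, while
the function name and the list field names are placeholders of mine:

/*
 * Illustration only, not the patch from this thread.  Removal takes the
 * knote off the knlist and marks it KN_DETACHED, but leaves kn->kn_knlist
 * pointing at the knlist so that a concurrent KN_LIST_UNLOCK can still
 * find the unlock callback.  On-list state is then judged by KN_DETACHED
 * alone, not by kn_knlist being non-NULL.
 */
static void
knlist_remove_sketch(struct knlist *knl, struct knote *kn)
{
	/* Caller is assumed to hold the knlist (subsystem) lock. */
	SLIST_REMOVE(&knl->kl_list, kn, knote, kn_selnext);
	kn->kn_status |= KN_DETACHED;
	/* Deliberately do not clear kn->kn_knlist here. */
}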