Date: Wed, 15 Jun 2016 11:11:43 +0300 From: Konstantin Belousov <kostikbel@gmail.com> To: Eric Badger <eric@badgerio.us> Cc: FreeBSD Current <freebsd-current@freebsd.org>, peter@holm.cc Subject: Re: Kqueue races causing crashes Message-ID: <20160615081143.GS38613@kib.kiev.ua> In-Reply-To: <34035bf6-8b3c-d15c-765b-94bcc919ea2e@badgerio.us> References: <34035bf6-8b3c-d15c-765b-94bcc919ea2e@badgerio.us>
next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, Jun 14, 2016 at 10:26:14PM -0500, Eric Badger wrote: > I believe they all have more or less the same cause. The crashes occur > because we acquire a knlist lock via the KN_LIST_LOCK macro, but when we > call KN_LIST_UNLOCK, the knote???s knlist reference (kn->kn_knlist) has > been cleared by another thread. Thus we are unable to unlock the > previously acquired lock and hold it until something causes us to crash > (such as the witness code noticing that we???re returning to userland with > the lock still held). ... > I believe there???s also a small window where the KN_LIST_LOCK macro > checks kn->kn_knlist and finds it to be non-NULL, but by the time it > actually dereferences it, it has become NULL. This would produce the > ???page fault while in kernel mode??? crash. > > If someone familiar with this code sees an obvious fix, I???ll be happy to > test it. Otherwise, I???d appreciate any advice on fixing this. My first > thought is that a ???struct knote??? ought to have its own mutex for > controlling access to the flag fields and ideally the ???kn_knlist??? field. > I.e., you would first acquire a knote???s lock and then the knlist lock, > thus ensuring that no one could clear the kn_knlist variable while you > hold the knlist lock. The knlist lock, however, usually comes from > whichever event producing entity the knote tracks, so getting lock > ordering right between the per-knote mutex and this other lock seems > potentially hard. (Sometimes we call into functions in kern_event.c with > the knlist lock already held, having been acquired in code outside of > kern_event.c. Consider, for example, calling KNOTE_LOCKED from > kern_exit.c; the PROC_LOCK macro has already been used to acquire the > process lock, also serving as the knlist lock). This sounds as a good and correct analysis. I tried your test program for around a hour on 8-threads machine, but was not able to trigger the issue. Might be Peter have better luck reproducing them. Still, I think that the problem is there. IMO we should simply avoid clearing kn_knlist in knlist_remove(). The member is only used to get the locking function pointers, otherwise code relies on KN_DETACHED flag to detect on-knlist condition. See the patch below. > > Apropos of the knlist lock and its provenance: why is a lock from the > event producing entity used to control access to the knlist and knote? > Is it generally desirable to, for example, hold the process lock while > operating on a knlist attached to that process? It???s not obvious to me > that this is required or even desirable. This might suggest that a > knlist should have its own lock rather than using a lock from the event > producing entity, which might make addressing this problem more > straightforward. Consider the purpose of knlist. It serves as a container for all knotes registered on the given subsystem object, like all knotes of the socket, process etc which must be fired on event. See the knote() code. The consequence is that the subsystem which fires knote() typically already holds a lock protecting its own state. As result, it is natural to protect the list of the knotes to activate on subsystem event, by the subsystem lock. diff --git a/sys/kern/kern_event.c b/sys/kern/kern_event.c index 0614903..3f45dca 100644 --- a/sys/kern/kern_event.c +++ b/sys/kern/kern_event.c @@ -1341,11 +1341,12 @@ findkn: } /* - * We can get here with kn->kn_knlist == NULL. This can happen when - * the initial attach event decides that the event is "completed" - * already. i.e. filt_procattach is called on a zombie process. It - * will call filt_proc which will remove it from the list, and NULL - * kn_knlist. + * We can get here with kn->kn_knlist == NULL or the knote + * detached. This can happen when the initial attach event + * decides that the event is "completed" already, + * i.e. filt_procattach is called on a zombie process. It + * will call filt_proc which will not add it to the list, and + * leave NULL kn_knlist. */ done_ev_add: if ((kev->flags & EV_ENABLE) != 0) @@ -2073,7 +2075,6 @@ knlist_remove_kq(struct knlist *knl, struct knote *kn, int knlislocked, int kqis if (!knlislocked) knl->kl_lock(knl->kl_lockarg); SLIST_REMOVE(&knl->kl_list, kn, knote, kn_selnext); - kn->kn_knlist = NULL; if (!knlislocked) knl->kl_unlock(knl->kl_lockarg); if (!kqislocked)
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20160615081143.GS38613>