From owner-freebsd-current@FreeBSD.ORG Wed Apr 2 07:06:26 2014 Return-Path: <owner-freebsd-current@FreeBSD.ORG> Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 49C0D8AE for <freebsd-current@freebsd.org>; Wed, 2 Apr 2014 07:06:26 +0000 (UTC) Received: from smtp.mei.co.jp (smtp.mei.co.jp [133.183.100.20]) by mx1.freebsd.org (Postfix) with ESMTP id DA4801D9 for <freebsd-current@freebsd.org>; Wed, 2 Apr 2014 07:06:25 +0000 (UTC) Received: from mail-gw.jp.panasonic.com ([157.8.1.157]) by smtp.mei.co.jp (8.12.11.20060614/3.7W/kc-maile13) with ESMTP id s3276IfO005620; Wed, 2 Apr 2014 16:06:19 +0900 (JST) Received: from epochmail.jp.panasonic.com ([157.8.1.130]) by mail.jp.panasonic.com (8.11.6p2/3.7W/kc-maili11) with ESMTP id s3276IR25968; Wed, 2 Apr 2014 16:06:18 +0900 Received: by epochmail.jp.panasonic.com (8.12.11.20060308/3.7W/lomi15) id s3276ITb012274; Wed, 2 Apr 2014 16:06:18 +0900 Received: from localhost by lomi15.jp.panasonic.com (8.12.11.20060308/3.7W) with ESMTP id s3276Itp012254; Wed, 2 Apr 2014 16:06:18 +0900 Date: Wed, 02 Apr 2014 16:06:16 +0900 (JST) Message-Id: <20140402.160616.1211219746022675269.okuno.kohji@jp.panasonic.com> To: jmg@funkthat.com Subject: Re: kevent has bug? From: Kohji Okuno <okuno.kohji@jp.panasonic.com> In-Reply-To: <20140402061551.GB3270@funkthat.com> References: <20140402.114516.1300054841784626892.okuno.kohji@jp.panasonic.com> <20140402061551.GB3270@funkthat.com> Organization: Panasonic Corporation X-Mailer: Mew version 6.5 on Emacs 24.3 / Mule 6.0 (HANACHIRUSATO) Mime-Version: 1.0 Content-Type: Multipart/Mixed; boundary="--Next_Part(Wed_Apr__2_16_06_16_2014_475)--" Content-Transfer-Encoding: 7bit Cc: freebsd-current@freebsd.org, okuno.kohji@jp.panasonic.com X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussions about the use of FreeBSD-current <freebsd-current.freebsd.org> List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-current>, <mailto:freebsd-current-request@freebsd.org?subject=unsubscribe> List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current/> List-Post: <mailto:freebsd-current@freebsd.org> List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help> List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, <mailto:freebsd-current-request@freebsd.org?subject=subscribe> X-List-Received-Date: Wed, 02 Apr 2014 07:06:26 -0000 ----Next_Part(Wed_Apr__2_16_06_16_2014_475)-- Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit From: John-Mark Gurney <jmg@funkthat.com> Date: Tue, 1 Apr 2014 23:15:51 -0700 > Kohji Okuno wrote this message on Wed, Apr 02, 2014 at 11:45 +0900: >> I think, kevent() has a bug. >> I tested sample programs by attached sources. >> This sample tests about EVFILT_SIGNAL. >> >> I build sample programs by the following commands. >> % gcc -O2 -o child child.c >> % gcc -O2 -o parent parent.c >> >> The expected result is the following. >> % ./parent >> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 >> OK >> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 >> OK >> >> But, sometimes the result was the following. >> % ./parent >> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 >> >> This result means the number of times the signal has occured was >> incorrect. > > I was able to reproduce this... > >> In case of EVFILT_SIGNAL, according to `man kevent', `data' retuns the >> number of times the signal has occurred since the last call to >> kevent(). This `data' is recorded by filt_signal() (This is f_event in >> struct filterops). >> >> The system call kevent()'s events are processed by kqueue_scan() in >> kern_event.c. In kqueue_scan(), kn->kn_fop->f_event() is allways >> called after KN_INFLUX is set to kn->kn_status. >> >> On the other hand, kernel events are occured by knote() in >> kern_event.c. (In EVFILT_SIGNAL, knote() is called from tdsendsignal() >> in kern_sig.c.) In knote(), kn->kn_fop->f_event() is called only when >> KN_INFLUX is not set in kn->kn_status. >> >> In race condition between kqueue_scan() and knote(), >> kn->kn_fop->f_event() from knote() may not be called, I think. > > Considering that both are called w/ a lock, that cannot happen.. > KN_LIST_LOCK(kn) locks the same lock that is asserted that is held > by knote... > >> In knote(), because the context holds knlist's lock, the context can >> not sleep. So, KN_INFLUX should not be set on calling >> kn->kn_fop->f_event() in kqueue_scan(), I think. > > No, it needs to be set: > * Setting the KN_INFLUX flag enables you to unlock the kq that this knote > * is on, and modify kn_status as if you had the KQ lock. > > As this comment says, _INFLUX allows you to unlock the KQ w/o fear that > the knote will disappear out from under you causing you to dereference > possibly free'd memory.. > > If you just tried to lock the list lock w/o unlocking the KQ lock, you > could end up w/ a dead lock, as you aren't maintaining lock order > properly.. The correct lock order if knlist -> kq... > >> What do you think about this issue? > > This is a real issue, but not due to the race you described > above... I beleave it's the result of the race. Could you try to add printf() in knote()? Please refer to attached patch. > I have verified on my machine that it isn't because there is a knote > waiting that isn't getting woken up, and the knote on my hung process > has data == 0, so it definately lost one of the signals: > (kgdb) print $14.kq_knhash[20].slh_first[0] > $20 = {kn_link = {sle_next = 0x0}, kn_selnext = {sle_next = 0x0}, > kn_knlist = 0xfffff8005a9c5840, kn_tqe = {tqe_next = 0xfffff801fdab4500, > tqe_prev = 0xfffff8004bb10038}, kn_kq = 0xfffff8004bb10000, kn_kevent = { > ident = 20, filter = -6, flags = 32, fflags = 0, data = 0, udata = 0x0}, > kn_status = 0, kn_sfflags = 0, kn_sdata = 0, kn_ptr = { > p_fp = 0xfffff8005a9c54b8, p_proc = 0xfffff8005a9c54b8, > p_aio = 0xfffff8005a9c54b8, p_lio = 0xfffff8005a9c54b8, > p_v = 0xfffff8005a9c54b8}, kn_fop = 0xffffffff81405ef0, kn_hook = 0x0, > kn_hookid = 0} > > If you want to find this yourself, you can run kgdb on a live system, > switch to the thread of the parent (info threads, thread XXX), and > do: > frame 7 > > (or a frame that has td, which is struct thread *), then: > print *(struct kqueue *)td->td_proc[0].p_fd[0].fd_ofiles[3].fde_file[0].f_data > > This will give you the struct kqueue * of the parent, and then: > print $XX.kq_knhash[0]@63 > > to figure out where the knote is in the hash, and then you can print > it out yourself... > > I'm going to take a look at this a bit more later... I'm thinking of > using dtrace to collect the stacks where filt_signal is called, and > match them up... dtrace might even be able to get us the note's data > upon return helping to make sure things got tracked properly... > > Thanks for finding this bug! Hopefully we can find a solution to it.. > > -- > John-Mark Gurney Voice: +1 415 225 5579 > > "All that I will do, has been done, All that I have, has not." ----Next_Part(Wed_Apr__2_16_06_16_2014_475)-- Content-Type: Text/X-Patch; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="debug.patch" diff --git a/sys/kern/kern_event.c b/sys/kern/kern_event.c index b3fb23d..7791447 100644 --- a/sys/kern/kern_event.c +++ b/sys/kern/kern_event.c @@ -1868,6 +1868,8 @@ knote(struct knlist *list, long hint, int lockflags) if ((kn->kn_status & KN_INFLUX) != KN_INFLUX) { KQ_LOCK(kq); if ((kn->kn_status & KN_INFLUX) == KN_INFLUX) { + if (hint & NOTE_SIGNAL) + printf("Aee2\n"); KQ_UNLOCK(kq); } else if ((lockflags & KNF_NOKQLOCK) != 0) { kn->kn_status |= KN_INFLUX; @@ -1886,6 +1888,10 @@ knote(struct knlist *list, long hint, int lockflags) KQ_UNLOCK(kq); } } + else { + if (hint & NOTE_SIGNAL) + printf("Aee1\n"); + } kq = NULL; } if ((lockflags & KNF_LISTLOCKED) == 0) ----Next_Part(Wed_Apr__2_16_06_16_2014_475)----