From: John-Mark Gurney
To: Kohji Okuno
Date: Tue, 1 Apr 2014 23:15:51 -0700
Subject: Re: kevent has bug?
Message-ID: <20140402061551.GB3270@funkthat.com>
In-Reply-To: <20140402.114516.1300054841784626892.okuno.kohji@jp.panasonic.com>
Mail-Followup-To: Kohji Okuno, freebsd-current@freebsd.org
Cc: freebsd-current@freebsd.org

Kohji Okuno wrote this message on Wed, Apr 02, 2014 at 11:45 +0900:
> I think, kevent() has a bug.
> I tested sample programs by attached sources.
> This sample tests about EVFILT_SIGNAL.
>
> I build sample programs by the following commands.
> % gcc -O2 -o child child.c
> % gcc -O2 -o parent parent.c
>
> The expected result is the following.
> % ./parent
> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
> OK
> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
> OK
>
> But, sometimes the result was the following.
> % ./parent
> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
>
> This result means the number of times the signal has occurred was
> incorrect.

I was able to reproduce this...

> In case of EVFILT_SIGNAL, according to `man kevent', `data' returns the
> number of times the signal has occurred since the last call to
> kevent(). This `data' is recorded by filt_signal() (This is f_event in
> struct filterops).
>
> The system call kevent()'s events are processed by kqueue_scan() in
> kern_event.c. In kqueue_scan(), kn->kn_fop->f_event() is always
> called after KN_INFLUX is set to kn->kn_status.
>
> On the other hand, kernel events are occurred by knote() in
> kern_event.c. (In EVFILT_SIGNAL, knote() is called from tdsendsignal()
> in kern_sig.c.) In knote(), kn->kn_fop->f_event() is called only when
> KN_INFLUX is not set in kn->kn_status.
>
> In race condition between kqueue_scan() and knote(),
> kn->kn_fop->f_event() from knote() may not be called, I think.

Considering that both are called w/ a lock, that cannot happen.
KN_LIST_LOCK(kn) takes the same lock that knote() asserts is held...

> In knote(), because the context holds knlist's lock, the context can
> not sleep. So, KN_INFLUX should not be set on calling
> kn->kn_fop->f_event() in kqueue_scan(), I think.

No, it needs to be set:

 * Setting the KN_INFLUX flag enables you to unlock the kq that this knote
 * is on, and modify kn_status as if you had the KQ lock.

As this comment says, KN_INFLUX allows you to unlock the KQ w/o fear that
the knote will disappear out from under you, causing you to dereference
possibly freed memory.

If you just tried to lock the list lock w/o unlocking the KQ lock, you
could end up w/ a deadlock, as you wouldn't be maintaining lock order
properly. The correct lock order is knlist -> kq...

> What do you think about this issue?

This is a real issue, but not due to the race you described above...
I have verified on my machine that it isn't because there is a knote
waiting that isn't getting woken up, and the knote on my hung process
has data == 0, so it definitely lost one of the signals:

(kgdb) print $14.kq_knhash[20].slh_first[0]
$20 = {kn_link = {sle_next = 0x0}, kn_selnext = {sle_next = 0x0},
  kn_knlist = 0xfffff8005a9c5840, kn_tqe = {tqe_next = 0xfffff801fdab4500,
    tqe_prev = 0xfffff8004bb10038}, kn_kq = 0xfffff8004bb10000,
  kn_kevent = {ident = 20, filter = -6, flags = 32, fflags = 0, data = 0,
    udata = 0x0}, kn_status = 0, kn_sfflags = 0, kn_sdata = 0, kn_ptr = {
    p_fp = 0xfffff8005a9c54b8, p_proc = 0xfffff8005a9c54b8,
    p_aio = 0xfffff8005a9c54b8, p_lio = 0xfffff8005a9c54b8,
    p_v = 0xfffff8005a9c54b8}, kn_fop = 0xffffffff81405ef0, kn_hook = 0x0,
  kn_hookid = 0}

If you want to find this yourself, you can run kgdb on a live system,
switch to the thread of the parent (info threads, thread XXX), and do:
frame 7

(or a frame that has td, which is struct thread *), then:
print *(struct kqueue *)td->td_proc[0].p_fd[0].fd_ofiles[3].fde_file[0].f_data

This will give you the struct kqueue * of the parent,
and then:
print $XX.kq_knhash[0]@63

to figure out where the knote is in the hash, and then you can print it
out yourself...

I'm going to take a look at this a bit more later... I'm thinking of
using dtrace to collect the stacks where filt_signal is called, and
match them up... dtrace might even be able to get us the knote's data
upon return, helping to make sure things got tracked properly...

Thanks for finding this bug! Hopefully we can find a solution to it.

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
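
The lock ordering described above (knlist -> kq, with an in-flux marker
letting the scanner drop the KQ lock before taking the list lock) can be
sketched in userland roughly as follows. This is only a toy model under
stated assumptions: two pthread mutexes stand in for the knlist and kq
locks, and toy_knote, notifier, scan_once and run_demo are hypothetical
names, not kernel code. The notifier here always records the event, so
the sketch does not model how knote() treats an in-flux knote and does
not reproduce the lost signal.

```c
#include <assert.h>
#include <pthread.h>

/* Toy stand-in for a knote; NOT the kernel's struct knote. */
struct toy_knote {
	int  in_flux;	/* analogue of KN_INFLUX, protected by kq_lock */
	long data;	/* pending event count, protected by list_lock */
};

/* Hypothetical userland analogues of the knlist lock and the KQ lock. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t kq_lock = PTHREAD_MUTEX_INITIALIZER;
static struct toy_knote note;

enum { EVENTS = 100000 };

/* Event source: takes the list lock first, then the kq lock. */
static void *
notifier(void *arg)
{
	(void)arg;
	for (int i = 0; i < EVENTS; i++) {
		pthread_mutex_lock(&list_lock);
		pthread_mutex_lock(&kq_lock);
		note.data++;		/* always recorded in this toy */
		pthread_mutex_unlock(&kq_lock);
		pthread_mutex_unlock(&list_lock);
	}
	return (NULL);
}

/* Scanner: starts under the kq lock and must drop it before list_lock. */
static long
scan_once(void)
{
	long seen;

	pthread_mutex_lock(&kq_lock);
	assert(!note.in_flux);		/* only one scanner in this toy */
	note.in_flux = 1;		/* pin the note */
	pthread_mutex_unlock(&kq_lock);	/* never hold kq while taking list */

	pthread_mutex_lock(&list_lock);
	seen = note.data;		/* harvest the count, then reset it */
	note.data = 0;
	pthread_mutex_lock(&kq_lock);	/* list -> kq: same order as notifier */
	note.in_flux = 0;
	pthread_mutex_unlock(&kq_lock);
	pthread_mutex_unlock(&list_lock);
	return (seen);
}

/* Runs the notifier against repeated scans; returns the total harvested. */
long
run_demo(void)
{
	pthread_t t;
	long total = 0;

	if (pthread_create(&t, NULL, notifier, NULL) != 0)
		return (-1);
	for (int i = 0; i < 1000; i++)
		total += scan_once();
	pthread_join(t, NULL);
	total += scan_once();		/* pick up whatever is left */
	return (total);
}
```

Calling run_demo() from a small main() and comparing the result against
EVENTS checks that nothing is dropped in this model. Because both sides
only ever take the locks in list -> kq order, neither can block the
other forever.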