Message-Id: <201606292225.u5TMPTvS086000@gw.catspoiler.org>
Date: Wed, 29 Jun 2016 15:25:29 -0700 (PDT)
From: Don Lewis
Subject: Re: FreeBSD 11.0-ALPHA5 r302256 kernel panic in filt_proc()
To: kostikbel@gmail.com
cc: freebsd-current@FreeBSD.org
In-Reply-To: <20160629222112.GM38613@kib.kiev.ua>

On 30 Jun, Konstantin Belousov wrote:
> On Wed, Jun 29, 2016 at 03:03:54PM -0700, Don Lewis wrote:
>> On 30 Jun, Konstantin Belousov wrote:
>> > On Wed, Jun 29, 2016 at 02:44:08PM -0700, Don Lewis wrote:
>> >> #10 0xffffffff80a02ddc in filt_proc (kn=0xfffff803c5679a80,
>> >>     hint=) at /usr/src/sys/kern/kern_event.c:473
>> >> #11 0xffffffff80a0173b in knote (list=, hint=2147483648,
>> >>     lockflags=) at /usr/src/sys/kern/kern_event.c:2045
>> >> #12 0xffffffff80a0710e in exit1 (td=,
>> >>     rval=, signo=)
>> >>     at /usr/src/sys/kern/kern_exit.c:515
>> >> #13 0xffffffff80a0677d in sys_sys_exit (td=0xfffff803c5679a80,
>> >>     uap=) at /usr/src/sys/kern/kern_exit.c:178
>> >> #14 0xffffffff80eb8b2b in amd64_syscall (td=0xfffff80096b49500, traced=0)
>> >>     at subr_syscall.c:135
>> >> #15 0xffffffff80e98d9b in Xfast_syscall ()
>> >>     at /usr/src/sys/amd64/amd64/exception.S:396
>> >> #16 0x00000008009298ca in ?? ()
>> >> Previous frame inner to this frame (corrupt stack?)
>> >> Current language: auto; currently minimal
>> >> (kgdb)
>> >>
>> >>
>> >> The line numbers above seem to be off. With kgdb from ports I see:
>> >>
>> >> (kgdb) up
>> >> #12 filt_proc (kn=0xfffff803c5679a80, hint=)
>> >>     at /usr/src/sys/kern/kern_event.c:466
>> >> 466             kn->kn_data = KW_EXITCODE(p->p_xexit, p->p_xsig);
>> >> (kgdb) print kn
>> >> $1 = (struct knote *) 0xfffff803c5679a80
>> >> (kgdb) print p
>> >> $2 = (struct proc *) 0x0
>> >>
>> > Please print out the knote, do 'p *kn'. I am esp. interested in the
>> > kn->kn_status value. It seems that the knote was already detached,
>>
>> (kgdb) print *kn
>> $1 = {kn_link = {sle_next = 0x0}, kn_selnext = {sle_next = 0x0},
>>   kn_knlist = 0xfffff804a4770d40, kn_tqe = {tqe_next = 0x0,
>>     tqe_prev = 0xfffff801b1581638}, kn_kq = 0xfffff801b1581600, kn_kevent = {
>>     ident = 70248, filter = -5, flags = 32816, fflags = 2147483648, data = 0,
>>     udata = 0x0}, kn_status = 131, kn_sfflags = -2147483648, kn_sdata = 0,
>>   kn_ptr = {p_fp = 0x0, p_proc = 0x0, p_aio = 0x0, p_lio = 0x0,
>>     p_nexttime = 0x0, p_v = 0x0}, kn_fop = 0xffffffff818ed600 ,
>>   kn_hook = 0x0, kn_hookid = 0}
>
> I probably have a plausible explanation. The knote is on knlist, it is
> registered for NOTE_EXIT (kn_sfflags == NOTE_EXIT), and most likely, it
> was registered when the corresponding process was already in exit1(), so
> that P_WEXIT flag was set.
> Then, the attach filter activates the knote
> immediately, it cannot know how far the exit1() progressed, it might
> have already run past the KNOTE_LOCKED() call. Failure occurred because
> filt_proc clears p_proc for the note of exiting process.
>
> The note was activated for sure: EV_EOF | EV_ONESHOT are set in kn_flags,
> KN_ACTIVE | KN_QUEUED are set in kn_status. I believe that the check
> for p_proc == NULL in filter is all that is needed to correct the issue,
> it would avoid double-activation.
>
> Sorry for the trouble.
>
> diff --git a/sys/kern/kern_event.c b/sys/kern/kern_event.c
> index 84bef45..575a330 100644
> --- a/sys/kern/kern_event.c
> +++ b/sys/kern/kern_event.c
> @@ -451,6 +451,9 @@ filt_proc(struct knote *kn, long hint)
>  	u_int event;
>  
>  	p = kn->kn_ptr.p_proc;
> +	if (p == NULL) /* already activated, from attach filter */
> +		return (0);
> +
>  	/* Mask off extra data. */
>  	event = (u_int)hint & NOTE_PCTRLMASK;
>  

I'll give this a try. It seems to be a difficult bug to trigger. The
machine was up and building ports for about 24 hours before it crashed.
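
For anyone who wants to check the flag reading against the dump, here is
a minimal userland sketch that turns the decimal values kgdb printed
(flags = 32816, kn_status = 131) back into symbolic names. The EV_*
values are the ones from <sys/event.h>. The KN_* values are my assumption
of what the kernel-internal part of that header defines at this revision,
so verify them against your tree. decode_flags(), struct flag_name and
the two tables are hypothetical helpers, not anything in the kernel.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper type: one bit value and its symbolic name. */
struct flag_name {
	unsigned int bit;
	const char *name;
};

/* EV_* bits as defined in <sys/event.h>. */
static const struct flag_name ev_names[] = {
	{ 0x0001, "EV_ADD" },     { 0x0002, "EV_DELETE" },
	{ 0x0004, "EV_ENABLE" },  { 0x0008, "EV_DISABLE" },
	{ 0x0010, "EV_ONESHOT" }, { 0x0020, "EV_CLEAR" },
	{ 0x0040, "EV_RECEIPT" }, { 0x0080, "EV_DISPATCH" },
	{ 0x4000, "EV_ERROR" },   { 0x8000, "EV_EOF" },
};

/* KN_* bits: ASSUMED values for this revision, check sys/event.h. */
static const struct flag_name kn_names[] = {
	{ 0x01, "KN_ACTIVE" },   { 0x02, "KN_QUEUED" },
	{ 0x04, "KN_DISABLED" }, { 0x08, "KN_DETACHED" },
	{ 0x10, "KN_INFLUX" },   { 0x20, "KN_MARKER" },
	{ 0x40, "KN_KQUEUE" },   { 0x80, "KN_HASKQLOCK" },
};

#define NELEM(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Write the names of all bits set in 'value' into 'buf', separated by
 * " | ".  Returns whatever bits were not found in the table, so a
 * non-zero result means the table is incomplete for this value.
 */
static unsigned int
decode_flags(unsigned int value, const struct flag_name *tbl, size_t n,
    char *buf, size_t len)
{
	size_t i, used = 0;

	if (len > 0)
		buf[0] = '\0';
	for (i = 0; i < n; i++) {
		if ((value & tbl[i].bit) == 0)
			continue;
		if (used < len) {
			int w = snprintf(buf + used, len - used, "%s%s",
			    used > 0 ? " | " : "", tbl[i].name);
			if (w > 0)
				used += (size_t)w;
		}
		value &= ~tbl[i].bit;
	}
	return (value);
}
```

Calling decode_flags(32816, ev_names, NELEM(ev_names), buf, sizeof(buf))
and the same with 131 and kn_names should, if the assumed constants are
right, give EV_ONESHOT | EV_CLEAR | EV_EOF and KN_ACTIVE | KN_QUEUED |
KN_HASKQLOCK with no leftover bits, which is consistent with the knote
having already been activated.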