Date: Thu, 27 Aug 2015 11:09:45 -0700
From: John-Mark Gurney
To: Andriy Gapon
Cc: FreeBSD Current, Lawrence Stewart, Pawel Pekala, "K. Macy"
Subject: Re: Instant panic while trying run ports-mgmt/poudriere
Message-ID: <20150827180945.GW33167@funkthat.com>
In-Reply-To: <55DEBA8B.5060009@FreeBSD.org>
List-Id: Discussions about the use of FreeBSD-current

Andriy Gapon wrote this message on Thu, Aug 27, 2015 at 10:21 +0300:
> On 27/08/2015 02:36, John-Mark Gurney wrote:
> > We should/cannot get here w/ an empty list.  If we do, then there is
> > something seriously wrong...  The current kn (which we must have as we
> > are here) MUST be on the list, but as you just showed, there are no
> > knotes on the list.
> >
> > Can you get me a print of the knote?  That way I can see what flags
> > are on it?
>
> Apologies if the following might sound a little bit patronizing, but it
> seems that you have got all the facts correctly, but somehow the
> connection between them did not become clear.
>
> So:
> 1. The list originally is NOT empty.  I guess that it has one entry, but
> that's an unimportant detail.
> 2. This is why the loop is entered.  It's a fact that it is entered.
> 3. The list becomes empty precisely because the entry is removed during
> the iteration in the loop (as kib has explained).  It's a fact that the
> list became empty at least in the panic that I reported.

In your latest dump, you said:
Here is another +1 with r286922.
I can add a couple of bits of debugging data:
(kgdb) fr 8
#8  0xffffffff80639d60 in knote (list=0xfffff8019a733ea0, hint=2147483648, lockflags=) at /usr/src/sys/kern/kern_event.c:1964
1964            } else if ((lockflags & KNF_NOKQLOCK) != 0) {

First off, that can't be r286922, per:
https://svnweb.freebsd.org/base/stable/10/sys/kern/kern_event.c?annotate=286922

Line 1964 is blank...  The line of code above should be at line 1884, so
I'm not sure what is wrong here...

Assuming that the pc really is at that line, f_event has not yet been
called, which is why I said that the list cannot be empty yet: nothing
has run yet that could remove the knote...

It could be that optimization moved stuff around, but if that is the
case, then the above wasn't useful...

> 4. The element is not only unlinked from the list, but its memory is
> also freed.

Where is the memory freed?  A knote MUST NOT be freed in an f_event
handler.  The only location where a list element is allowed to be freed
is knote_drop, which must happen after f_detach is called, but that
can't/won't happen from knote (I believe the timer handles this
specially, but we are talking about normal knlist type filters)...

The rest of your explanation is invalid due to the invalid assumption of
this point...
If you can show me where the knote is freed in knote, w/ a function/line
number stack trace (it does not have to be from a dump; a sample call
path is enough), then I'll reconsider, and fix that bug...

> 5. That's why we have the use after free: SLIST_FOREACH is trying to get
> a pointer to a next element from the freed memory.
> 6. This is why the commit for trashing the freed memory made all the
> difference: previously the freed memory was unlikely to be re-used /
> modified, so the use-after-free had a high chance of succeeding.  It's a
> fact that in my panic there was an attempt to dereference a trashed pointer.
> 7. Finally, this is why SLIST_FOREACH_SAFE helps here: we stash the
> pointer to the next element beforehand and, thus, we do not access the
> freed memory.
>
> Please let me know if you see any fault in above reasoning or if
> something is still no clear.

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
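
For reference, the general pattern that points 5-7 describe can be
sketched in userland.  This is only an illustration of why
SLIST_FOREACH_SAFE tolerates removal of the current element; it is not
the code in kern_event.c and it makes no claim about whether a knote is
actually freed from f_event.  struct node, remove_and_free() and
drain_with_foreach_safe() are hypothetical names made up for the sketch,
and the #ifndef fallback is an assumption for libcs whose <sys/queue.h>
lacks the _SAFE variants.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/*
 * FreeBSD's <sys/queue.h> provides SLIST_FOREACH_SAFE.  Some other libcs
 * do not, so fall back to an equivalent definition (assumption: the
 * remaining SLIST_* macros are available).
 */
#ifndef SLIST_FOREACH_SAFE
#define SLIST_FOREACH_SAFE(var, head, field, tvar)			\
	for ((var) = SLIST_FIRST((head));				\
	    (var) != NULL && ((tvar) = SLIST_NEXT((var), field), 1);	\
	    (var) = (tvar))
#endif

/* Hypothetical stand-in for a list element; NOT the kernel's knote. */
struct node {
	int id;
	SLIST_ENTRY(node) link;
};
SLIST_HEAD(nodelist, node);

/* Hypothetical callback that unlinks and frees the element it is given. */
static void
remove_and_free(struct nodelist *head, struct node *n)
{
	SLIST_REMOVE(head, n, node, link);
	free(n);
}

/*
 * Build a list of `count` nodes, then visit every node with a callback
 * that frees it.  Returns the number of nodes visited, or -1 if an
 * allocation fails.
 */
int
drain_with_foreach_safe(int count)
{
	struct nodelist head = SLIST_HEAD_INITIALIZER(head);
	struct node *n, *tmp;
	int visited = 0;

	for (int i = 0; i < count; i++) {
		n = malloc(sizeof(*n));
		if (n == NULL) {
			/* Release whatever was already allocated. */
			while ((n = SLIST_FIRST(&head)) != NULL) {
				SLIST_REMOVE_HEAD(&head, link);
				free(n);
			}
			return (-1);
		}
		n->id = i;
		SLIST_INSERT_HEAD(&head, n, link);
	}

	/*
	 * tmp is loaded with the next pointer BEFORE the body runs, so the
	 * loop never reads n->link after n has been freed.  With plain
	 * SLIST_FOREACH the step expression would read n->link here, which
	 * is the use after free described in point 5.
	 */
	SLIST_FOREACH_SAFE(n, &head, link, tmp) {
		remove_and_free(&head, n);
		visited++;
	}

	assert(SLIST_EMPTY(&head));
	return (visited);
}
```

Calling drain_with_foreach_safe(3) walks and frees all three nodes
without touching freed memory.  The _SAFE variant only protects against
removal of the current element; if the callback could also free the
next element, the stashed pointer would be stale as well.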