From: Frank McConnell
Date: Sat, 06 Aug 2005 13:51:43 -0700
To: dpk
Cc: Ted Wisniewski, freebsd-stable@freebsd.org, Chris Gabe
Subject: Re: RELENG_5 PAE panic
Message-Id: <200508062051.j76KphGm063013@lots.reanimators.org>
In-Reply-To: <20050806105211.I15658@shared10.hosting.flyingcroc.net>

dpk wrote:
> On Thu, 4 Aug 2005, Frank McConnell wrote:
>> Further debugging led me to the conclusion that the problem is in
>> pmap_protect(), in src/sys/i386/i386/pmap.c; and has to do with a
[...]
>> Then I checked the cvs logs, and saw rev 1.524, which looks like what
>> I was thinking about as a fix, so I'm giving it a spin on top of
> FWIW, on a server we have which was panicking quite frequently, performing
> the above mentioned modification seems to have resolved the issue. The
> server has been repeatedly building kernels while having another process
> run the server out of RAM. Before, this would cause it to panic with one
> of 2 (maybe 3) messages in well under an hour. Now it's been going for 24
> hours straight without even a stray bus error.

Great! I'd looked at the stack trace you mentioned in your initial
report and really was not sure that you were seeing the same problem.

I have two ways to provoke the failure: starting named (a modified
BIND 8 which loads blackhole lists for a total memory footprint of
somewhere in excess of 900MB), which has provoked the panic in all but
one attempt; and "make buildkernel", which will usually provoke the
panic some ways in.

So I applied this fix to one system that was running RELENG_5 from
early this week, and it was able to do both, running "make buildkernel"
repeatedly (for kicks, alternating between building a kernel based on
GENERIC and building one based on PAE) for a couple of hours before the
sysadmin took it back to the co-lo. I have also applied it to another
system that was running 5.4-RELEASE (and missing 2GB of its RAM without
PAE). They're both running named without error now.

> This appears to resolve i386/84563, and I believe it should resolve
> related bugs kern/82846 (identical panic) and i386/84306.
>
> The specific fix Frank has mentioned is this:
>
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/i386/i386/pmap.c.diff?r1=1.523&r2=1.524&f=h
>
> committed by jhb and submitted by Greg Taleck.
>
> Even though this pmap.c change was applied to a later version than
> distributed with FreeBSD 5.4, the modifications still apply.

Correct. I applied exactly that two-line change to pmap_remove() by
hand.

I'd like to see this fixed in RELENG_5, and if possible and appropriate
in RELENG_5_4, because as things stand it will break on i386 systems
with RAM above 4GB that need PAE to see all that RAM. What do I need to
do to get this to happen: send a PR, and/or write to re@ and/or
security-officer@?

I may be able to set some computers up for testing if that would be
helpful, but will have to check with the sysadmin to see what his
deployment schedule looks like.

-Frank McConnell