From: Mark Atkinson
To: freebsd-current@freebsd.org
Date: Tue, 10 Nov 2009 11:29:00 -0800
Subject: Re: 8.0RC2 amd64 - kernel panic running make buildworld
In-Reply-To: <20091110184821.4f58a0bf@orwell.free.de>

Kai Gallasch wrote:
> On Tue, 10 Nov 2009 19:05:23 +0200
> Andriy Gapon wrote:
>
>> on 10/11/2009 17:22 gary.jennejohn@freenet.de said the following:
>>> Well, OK, I may have misinterpreted what you wrote or have chosen
>>> bad wording myself to convey the same message. Nonetheless it
>>> looks like a hardware problem to me.
>> [Trying to make up for my previous mistake.]
>>
>> The symptom certainly looks like misbehaving hardware, but other
>> information from the reports seems to suggest that it is possible
>> that this misbehavior might be caused by software misconfiguring the
>> hardware.
>
> Hi.
>
> This thread was started by me. In the meantime I filed a PR:
> http://www.freebsd.org/cgi/query-pr.cgi?pr=140338
>
>> I would re-test vm.pmap.pg_ps_enabled=0 just to be sure that it was
>> correctly the first time.
>
> I toggled vm.pmap.pg_ps_enabled three times between reboots and the
> result is always the same. superpages enabled: reboot, superpages not
> enabled: server stable
>
>> I would try to see how 8.0-RC1 kernel behaves and in general try to
>> find last working, first non-working version.
> 8.0RC1, 8.0BETA4 already showed the same behaviour
>
>> It would be useful to know any (if any) non-default loader.conf and
>> rc.conf settings or kernel config (if not GENERIC).
>
> loader.conf untouched, rc.conf had just settings for networking active
> when testing. In the end I enabled some other stuff to have it ready for
> 8.0 RELEASE, *after* I found out that disabling superpages helped
> against the crashes.
>
> Ah yes. I also ran memtest86 on the server for about half a day - no
> problems.
>
> But read for yourself in the PR.
>
> I don't rule out that this behaviour with vm.pmap.pg_ps_enabled may be
> hardware related, but why then is the server running stable
> with RELENG_7 and memtest and server diagnostics don't report any
> problem?

See the following, where I first noticed this problem a long time ago on my
HPDL385g5. It also passed memtest86 for days, and swapping out memory
modules gave the same result.

http://article.gmane.org/gmane.os.freebsd.current/111307

I suspect this is actually a machine check exception you're seeing, which
you'll notice if you set hw.mca.enabled="1", enable superpages, and then do
a buildworld. Using -j doesn't matter, it just takes longer to throw an
exception.

I'm hoping this is the rev E lfence problem, even though my chips are not
targeted. When and if a patch goes into -current, I'll try it out to see
if the problem with superpages goes away.

-Mark