Date: Thu, 30 Jul 2009 10:05:54 +0100
From: Anton Shterenlikht
To: Alexandre Sunny Kovalenko
Cc: freebsd-current@freebsd.org, freebsd-ia64@freebsd.org
Subject: Re: FreeBSD 8.0-BETA2/amd64 crashes on SMP under load
Message-ID: <20090730090554.GA64840@mech-cluster241.men.bris.ac.uk>
References: <4A6DB30B.20705@zedat.fu-berlin.de>
 <4A6DB9F1.7050404@haruhiism.net>
 <4A6E0620.6070200@mail.zedat.fu-berlin.de>
 <20090727210428.GA30253@mech-cluster241.men.bris.ac.uk>
 <20090728103545.GA22380@mech-cluster241.men.bris.ac.uk>
 <4A6F09BA.2020703@zedat.fu-berlin.de>
 <20090728144555.GD75439@mech-cluster241.men.bris.ac.uk>
 <1248906855.1459.8.camel@RabbitsDen>
In-Reply-To: <1248906855.1459.8.camel@RabbitsDen>
List-Id: Porting FreeBSD to the IA-64

On Wed, Jul 29, 2009 at 06:34:15PM -0400, Alexandre Sunny Kovalenko wrote:
> On Tue, 2009-07-28 at 15:45 +0100, Anton Shterenlikht wrote:
> > On Tue, Jul 28, 2009 at 02:22:50PM +0000, O. Hartmann wrote:
> > > Anton Shterenlikht wrote:
> > > > On Mon, Jul 27, 2009 at 10:04:28PM +0100, Anton Shterenlikht wrote:
> > > >> On Mon, Jul 27, 2009 at 09:55:12PM +0200, O. Hartmann wrote:
> > > >>> Kamigishi Rei wrote:
> > > >>>> O. Hartmann wrote:
> > > >>>>> I have the problem of crashing FreeBSD 8.0-BETA2/amd64 under load on
> > > >>>>> all of our SMP boxes. Is there an issue known at the moment? If not, I
> > > >>>>> will prepare the kernel for witnessing and provide more information,
> > > >>>>> if you wish.
> > > >>>> A quick question: what is in the crash message, i.e. the backtrace?
> > > >>>> And what kind of crash is it - a panic() or a fatal trap?
> > > >>> On the 8-core server box, I sometimes see:
> > > >>>
> > > >>> Fatal trap 12: page fault while in kernel mode
> > > >>> fault code = supervisor read, page not present
> > > >> Not sure if it's related, but on ia64 SMP (2 cpus) with 8.0-current and
> > > >> later with 8.0-beta1 (I haven't built beta2 yet) I'm getting crashes
> > > >> under load every so often. E.g. buildworld -j8 is likely to crash the
> > > >> box. No messages, just a sudden freeze, no backtrace or panic, and then reboot.
> > > >>
> > > >> If load is less heavy, e.g. fewer processes and some idle time, the
> > > >> problem doesn't seem to appear.
> > > >>
> > > >> I'm happy to do any further testing, if suggested.
> > > >
> > > > my ia64 8.0-beta1 SMP box died again on
> > > > make -j8 buildworld
> > > > with no panic or log entries.
> > > >
> > > > Is it possible that some kernel variable needs to
> > > > be increased? E.g. kern.maxproc, kern.maxfiles, etc.
> > > > Or perhaps I'm talking complete rubbish..
> > > >
> > > >
> > >
> > > I suggest you try again with a UP kernel - a suggestion from a
> > > kernel-noob, sorry. My SMP boxes work now with UP-kernel, but they are
> > > really slowish although they have modern Intel C2D/Penryn cores.
> >
> > I need SMP for OpenMP codes. It's a shame if SMP is buggy, but
> > I guess all is down to small user base..
> >
> Before you go down that path, which, IMHO, is as counterproductive as it
> is incorrect, could you, please, show the output of
>
> sysctl debug | grep panic

> sysctl debug|grep panic
debug.ddb.textdump.do_panic: 1
debug.trace_on_panic: 1
debug.debugger_on_panic: 1
debug.kdb.panic: 0

>
> and check whether output of
>
> savecore -vC

# savecore -vC
unable to open bounds file, using 0
checking for kernel dump on device /dev/mirror/swap
mediasize = 2147483136
sectorsize = 512
magic mismatch on last dump header on /dev/mirror/swap
No dump exists
#

dumpdev wasn't configured.. I've configured it now,
will try a crash dump next time.

By the way, are these two FreeBSD docs up to date:

http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/advanced.html#KERNEL-PANIC-TROUBLESHOOTING
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html

In particular, is it still true that minidump is the default
dump type?

many thanks

-- 
Anton Shterenlikht
Room 2.6, Queen's Building
Mech Eng Dept
Bristol University
University Walk, Bristol BS8 1TR, UK
Tel: +44 (0)117 928 8233
Fax: +44 (0)117 929 4423
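
For reference, the dump setup mentioned in this message could look roughly like the sketch below. It is not the exact configuration used on this box. It assumes the standard rc.conf knobs dumpdev and dumpdir, and it borrows the device name from the savecore output in this thread purely as an example. Whether dumping to that device works on this particular setup is an assumption.

```shell
# /etc/rc.conf fragment (assumed values, adjust to the local setup)
dumpdev="AUTO"          # let rc pick the first suitable swap device at boot
dumpdir="/var/crash"    # directory where savecore(8) writes recovered dumps

# To enable dumps without rebooting (run as root). The device name is the
# one reported by savecore in this thread, used here only as an example:
#   dumpon -v /dev/mirror/swap
#
# After the next crash and reboot, check for and extract a dump:
#   savecore -vC                          # only check whether a dump exists
#   savecore /var/crash /dev/mirror/swap  # extract it into the dump directory
```

With dumpdev set, "savecore -vC" after a crash should report a dump instead of a magic mismatch, provided the kernel actually managed to write one before the reboot.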