From owner-freebsd-current@FreeBSD.ORG  Thu Apr 30 20:56:58 2009
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@FreeBSD.org
Received: from [127.0.0.1] (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
	by hub.freebsd.org (Postfix) with ESMTP id EDDFF1065670;
	Thu, 30 Apr 2009 20:56:57 +0000 (UTC)
	(envelope-from jkim@FreeBSD.org)
From: Jung-uk Kim <jkim@FreeBSD.org>
To: freebsd-current@FreeBSD.org
Date: Thu, 30 Apr 2009 16:56:38 -0400
User-Agent: KMail/1.6.2
References: <20090429161626.GQ1387@albert.catwhisker.org>
	<200904301552.03118.jkim@FreeBSD.org>
	<a31046fc0904301325p6218e7ccxf68d4087484cd569@mail.gmail.com>
In-Reply-To: <a31046fc0904301325p6218e7ccxf68d4087484cd569@mail.gmail.com>
MIME-Version: 1.0
Content-Disposition: inline
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <200904301656.51003.jkim@FreeBSD.org>
Cc: pluknet <pluknet@gmail.com>, Andriy Gapon <avg@icyb.net.ua>,
	Scott Ullrich <sullrich@gmail.com>
Subject: Re: Panic "Fatal trap 18: integer divide fault while in kernel mode"
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 30 Apr 2009 20:56:58 -0000

On Thursday 30 April 2009 04:25 pm, pluknet wrote:
> 2009/4/30 Jung-uk Kim <jkim@freebsd.org>:
> > On Thursday 30 April 2009 12:37 pm, pluknet wrote:
> >> 2009/4/30 Andriy Gapon <avg@icyb.net.ua>:
> >> > on 30/04/2009 18:58 David Wolfskill said the following:
> >> >> On Thu, Apr 30, 2009 at 06:35:32PM +0300, Andriy Gapon wrote:
> >> >>> on 30/04/2009 18:18 David Wolfskill said the following:
> >> >>>> On Wed, Apr 29, 2009 at 09:16:26AM -0700, David Wolfskill wrote:
> >> >>>>> Is there anything of use I might get from DDB?
> >> >>>>
> >> >>>> I can still poke around there for a bit, if that would be
> >> >>>> useful.
> >> >>>
> >> >>> In general the stack trace[*] should be provided at the very
> >> >>> least, otherwise people have a hard time figuring out where
> >> >>> the problem occurred, so the right people may just not notice a
> >> >>> report.
> >> >>
> >> >> Sorry; it happened so quickly, I wasn't at all certain there
> >> >> would be enough to show:
> >> >>
> >> >> db> bt
> >> >> Tracing pid 0 tid 100000 td 0xc0d43610
> >> >> cpu_topo(2,c1420d34,c081ff07,c1420d58,c0820042,...) at cpu_topo+0x43
> >> >> smp_topo(c0804378,2,c4145a5c,fffffff,0,...) at smp_topo+0x10b
> >> >> sched_setup(0,141ec00,141ec00,141e000,1425000,...) at sched_setup+0x1a
> >> >> mi_startup() at mi_startup+0x96
> >> >> begin() at begin+0x2c
> >> >
> >> > My guess is that (cpu_cores * cpu_logical) somehow equals
> >> > zero.
> >>
> >> That was masked earlier by additional checks for zero,
> >> and now that routine has moved to a separate function
> >> (and to a separate call path from subr_smp.c:mp_start(),
> >> which seems not to be called).
> >>
> >> > Have you by any chance saved this crash dump?
> >> > I think that it would be interesting to look at it in kgdb.
> >
> > Please try the attached patch.
> >
> > Jung-uk Kim
>
> The strange thing is why cpu_mp_start() is called at all when
> there is only one CPU in the system. It should return early in
> mp_start(). (I saw two reports and both of them were UP systems).

I don't think cpu_mp_start() is the culprit.  When an SMP kernel is used
on a UP system, the scheduler still tries to probe the topology, although
it should simply use smp_topo_none() instead of calling the MD cpu_topo().
In fact, I had a simple band-aid in cpu_topo() in my local tree to
shut up the annoying:

	WARNING: Non-uniform processors.
	WARNING: Using suboptimal topology.

messages when SMP is forced off, a core is disabled on a multi-core
system, etc.  That band-aid wasn't critical before, but it is now,
unfortunately.
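
To illustrate the kind of guard discussed above, here is a minimal
user-space sketch in C.  It is neither the attached patch nor the
kernel's cpu_topo(); pick_topology(), enum topo_kind and the parameter
names are hypothetical and only model the idea: return the flat
topology early on a single-CPU system, and never take a modulus by
(cores * logical) when that product may be zero.

```c
#include <stdio.h>

/* Hypothetical result codes for this sketch; not kernel types. */
enum topo_kind { TOPO_NONE, TOPO_GROUPED };

/*
 * Model of the topology decision.  All names are illustrative.
 *   ncpus:   number of CPUs actually running
 *   cores:   detected cores per package (0 if probing failed)
 *   logical: detected logical CPUs per core (0 if probing failed)
 */
enum topo_kind
pick_topology(int ncpus, int cores, int logical)
{
	int per_package;

	/* UP system, or SMP forced off: nothing to group, no warning. */
	if (ncpus <= 1)
		return (TOPO_NONE);

	per_package = cores * logical;

	/* Guard the divisor before taking the modulus. */
	if (per_package == 0)
		return (TOPO_NONE);

	/* CPU count is not a multiple of the package size. */
	if (ncpus % per_package != 0) {
		printf("WARNING: Non-uniform processors.\n");
		printf("WARNING: Using suboptimal topology.\n");
		return (TOPO_NONE);
	}
	return (TOPO_GROUPED);
}
```

With this sketch, pick_topology(1, 0, 0) returns TOPO_NONE without
printing anything, while pick_topology(3, 2, 1) prints the two
warnings and also falls back to TOPO_NONE.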

Jung-uk Kim