From: pluknet
To: Jung-uk Kim, David Wolfskill
Cc: freebsd-current@freebsd.org, Andriy Gapon, Scott Ullrich
Date: Fri, 1 May 2009 01:09:45 +0400
In-Reply-To: <200904301656.51003.jkim@FreeBSD.org>
References: <20090429161626.GQ1387@albert.catwhisker.org> <200904301552.03118.jkim@FreeBSD.org> <200904301656.51003.jkim@FreeBSD.org>
Subject: Re: Panic "Fatal trap 18: integer divide fault while in kernel mode"
List-Id: Discussions about the use of FreeBSD-current

2009/5/1 Jung-uk Kim :
> On Thursday 30 April 2009 04:25 pm, pluknet wrote:
>> 2009/4/30 Jung-uk Kim :
>> > On Thursday 30 April 2009 12:37 pm, pluknet wrote:
>> >> 2009/4/30 Andriy Gapon :
>> >> > on 30/04/2009 18:58 David Wolfskill said the following:
>> >> >> On Thu, Apr 30, 2009 at 06:35:32PM +0300, Andriy Gapon wrote:
>> >> >>> on 30/04/2009 18:18 David Wolfskill said the following:
>> >> >>>> On Wed, Apr 29, 2009 at 09:16:26AM -0700, David Wolfskill wrote:
>> >> >>>>> Is there anything of use I might get from DDB?
>> >> >>>>
>> >> >>>> I can still poke around there for a bit, if that would be
>> >> >>>> useful.
>> >> >>>
>> >> >>> In general the stack trace[*] should be provided at the very
>> >> >>> least, otherwise people have hard figuring out where the
>> >> >>> problem occurred, so right people may just not notice a
>> >> >>> report.
>> >> >>
>> >> >> Sorry; it happened so quickly, I wasn't at all certain there
>> >> >> would be enough to show:
>> >> >>
>> >> >> db> bt
>> >> >> Tracing pid 0 tid 100000 td 0xc0d43610
>> >> >> cpu_topo(2,c1420d34,c081ff07,c1420d58,c0820042,...) at cpu_topo+0x43
>> >> >> smp_topo(c0804378,2,c4145a5c,fffffff,0,...) at smp_topo+0x10b
>> >> >> sched_setup(0,141ec00,141ec00,141e000,1425000,...) at sched_setup+0x1a
>> >> >> mi_startup() at mi_startup+0x96
>> >> >> begin() at begin+0x2c
>> >> >
>> >> > My guess is that (cpu_cores * cpu_logical) somehow equals to
>> >> > zero.
>> >>
>> >> That was masked earlier by additional checks on zero,
>> >> and now that routine moved to the separate function
>> >> (and to separate call path from subr_smp.c:mp_start()
>> >> which seems not to be called).
>> >>
>> >> > Have you by a chance saved this crash dump?
>> >> > I think that it would be interesting to look at it in kgdb.
>> >
>> > Please try the attached patch.
>> >
>> > Jung-uk Kim
>>
>> The strange thing is why cpu_mp_start() is called at all in case
>> when there is only one CPU in system. It should early return in
>> mp_start(). (I saw two reports and both of them were UP systems).
>
> I don't think cpu_mp_start() is the culprit.

Actually, you are right. I was wrong: cpu_mp_start() is not called
here on UP.

> When SMP kernel is used
> on UP system, scheduler still tries to probe topology although it
> should be simply smp_topo_none() instead of calling MD cpu_topo().
> In fact, I had a simple band-aid in cpu_topo() in my local tree to
> shut up annoying:
>
> WARNING: Non-uniform processors.
> WARNING: Using suboptimal topology.
>
> messages when SMP is forced off or a core is disabled on multi-core
> systems, etc. It wasn't critical before but it is now,
> unfortunately.
>
> Jung-uk Kim
>

I decided to go another way. Before the latest changes in
mp_machdep.c, cpu_topo() itself contained the piece of code that now
lives in topo_probe(). What if we simply bring that part back into
cpu_topo() by calling topo_probe() from it?

David, can you try this? It works for me now, at least.

$ diff -urp sys/amd64/amd64/mp_machdep.c.orig sys/amd64/amd64/mp_machdep.c
--- sys/amd64/amd64/mp_machdep.c.orig	2009-05-01 00:59:55.000000000 +0400
+++ sys/amd64/amd64/mp_machdep.c	2009-05-01 01:00:20.000000000 +0400
@@ -309,6 +309,8 @@ cpu_topo(void)
 {
 	int cg_flags;
 
+	topo_probe();
+
 	/*
 	 * Determine whether any threading flags are
 	 * necessry.

$ diff -urp sys/i386/i386/mp_machdep.c.orig sys/i386/i386/mp_machdep.c
--- sys/i386/i386/mp_machdep.c.orig	2009-05-01 01:01:53.000000000 +0400
+++ sys/i386/i386/mp_machdep.c	2009-05-01 01:01:41.000000000 +0400
@@ -362,6 +362,8 @@ cpu_topo(void)
 {
 	int cg_flags;
 
+	topo_probe();
+
 	/*
 	 * Determine whether any threading flags are
 	 * necessry.

-- 
wbr,
pluknet
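
P.S. For anyone following along, here is a rough userland sketch of the failure mode discussed in this thread, not the actual kernel code. It assumes the fault comes from dividing by (cpu_cores * cpu_logical) while both are still zero because the probe has not run yet, which is only Andriy's guess above. The names topo_probe_sketch() and packages_sketch() are made up for illustration, and the hard-coded UP values stand in for whatever the real probe reads from the hardware.

```c
/* Hypothetical stand-ins for the topology globals; they stay zero until probed. */
static int cpu_cores;
static int cpu_logical;
static int topo_probed;

/*
 * Hypothetical probe. A real kernel would query the hardware; this sketch
 * simply hard-codes a uniprocessor (one core, one logical CPU).
 */
static void
topo_probe_sketch(void)
{
	if (topo_probed)
		return;		/* probe only once */
	cpu_cores = 1;
	cpu_logical = 1;
	topo_probed = 1;
}

/*
 * Compute the number of packages for ncpus CPUs.
 * With probe_first != 0 the probe is run before the values are used,
 * which mirrors the idea of calling the probe at the top of cpu_topo().
 * Returns -1 instead of dividing when the divisor would be zero.
 */
static int
packages_sketch(int ncpus, int probe_first)
{
	int per_package;

	if (probe_first)
		topo_probe_sketch();
	per_package = cpu_cores * cpu_logical;
	if (per_package == 0)
		return (-1);	/* an unchecked division here would trap */
	return (ncpus / per_package);
}
```

Calling packages_sketch(1, 0) before any probe takes the zero-divisor branch, while packages_sketch(1, 1) probes first and divides safely. The explicit zero check corresponds to the "additional checks on zero" that used to mask the problem; probing first corresponds to the diff above.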