Date: Thu, 30 Apr 2009 17:27:53 -0400
From: Jung-uk Kim <jkim@FreeBSD.org>
To: pluknet <pluknet@gmail.com>
Cc: Scott Ullrich <sullrich@gmail.com>, freebsd-current@freebsd.org, Andriy Gapon <avg@icyb.net.ua>
Subject: Re: Panic "Fatal trap 18: integer divide fault while in kernel mode"
Message-ID: <200904301727.55099.jkim@FreeBSD.org>
In-Reply-To: <a31046fc0904301409y6db1b591nb6bc4887ab8bff0f@mail.gmail.com>
References: <20090429161626.GQ1387@albert.catwhisker.org> <200904301656.51003.jkim@FreeBSD.org> <a31046fc0904301409y6db1b591nb6bc4887ab8bff0f@mail.gmail.com>
--Boundary-00=_bfh+JouM/gBbBo9
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

On Thursday 30 April 2009 05:09 pm, pluknet wrote:
> 2009/5/1 Jung-uk Kim <jkim@freebsd.org>:
> > On Thursday 30 April 2009 04:25 pm, pluknet wrote:
> >> 2009/4/30 Jung-uk Kim <jkim@freebsd.org>:
> >> > On Thursday 30 April 2009 12:37 pm, pluknet wrote:
> >> >> 2009/4/30 Andriy Gapon <avg@icyb.net.ua>:
> >> >> > on 30/04/2009 18:58 David Wolfskill said the following:
> >> >> >> On Thu, Apr 30, 2009 at 06:35:32PM +0300, Andriy Gapon wrote:
> >> >> >>> on 30/04/2009 18:18 David Wolfskill said the following:
> >> >> >>>> On Wed, Apr 29, 2009 at 09:16:26AM -0700, David
> >> >> >>>> Wolfskill wrote:
> >> >> >>>>> Is there anything of use I might get from DDB?
> >> >> >>>>
> >> >> >>>> I can still poke around there for a bit, if that would
> >> >> >>>> be useful.
> >> >> >>>
> >> >> >>> In general the stack trace[*] should be provided at the
> >> >> >>> very least, otherwise people have a hard time figuring out
> >> >> >>> where the problem occurred, so right people may just not
> >> >> >>> notice a report.
> >> >> >>
> >> >> >> Sorry; it happened so quickly, I wasn't at all certain
> >> >> >> there would be enough to show:
> >> >> >>
> >> >> >> db> bt
> >> >> >> Tracing pid 0 tid 100000 td 0xc0d43610
> >> >> >> cpu_topo(2,c1420d34,c081ff07,c1420d58,c0820042,...) at cpu_topo+0x43
> >> >> >> smp_topo(c0804378,2,c4145a5c,fffffff,0,...) at smp_topo+0x10b
> >> >> >> sched_setup(0,141ec00,141ec00,141e000,1425000,...) at sched_setup+0x1a
> >> >> >> mi_startup() at mi_startup+0x96
> >> >> >> begin() at begin+0x2c
> >> >> >
> >> >> > My guess is that (cpu_cores * cpu_logical) somehow equals
> >> >> > to zero.
> >> >>
> >> >> That was masked earlier by additional checks on zero,
> >> >> and now that routine moved to the separate function
> >> >> (and to separate call path from subr_smp.c:mp_start()
> >> >> which seems not to be called).
> >> >>
> >> >> > Have you by a chance saved this crash dump?
> >> >> > I think that it would be interesting to look at it in kgdb.
> >> >
> >> > Please try the attached patch.
> >> >
> >> > Jung-uk Kim
> >>
> >> The strange thing is why cpu_mp_start() is called at all in case
> >> when there is only one CPU in system. It should early return in
> >> mp_start(). (I saw two reports and both of them were UP
> >> systems).
> >
> > I don't think cpu_mp_start() is the culprit.
>
> Actually you are right. I was wrong and cpu_mp_start() is not
> called here on UP.
>
> > When SMP kernel is used
> > on UP system, scheduler still tries to probe topology although it
> > should be simply smp_topo_none() instead of calling MD
> > cpu_topo(). In fact, I had a simple band-aid in cpu_topo() in my
> > local tree to shut up annoying:
> >
> > WARNING: Non-uniform processors.
> > WARNING: Using suboptimal topology.
> >
> > messages when SMP is forced off or a core is disabled on
> > multi-core systems, etc. It wasn't critical before but it is
> > now, unfortunately.
> >
> > Jung-uk Kim
>
> I decided to go another way. Before last changes in mp_machdep.c
> cpu_topo() included
> previously that piece of code which now is in topo_probe().
>
> What if just return that part back to cpu_topo() ?
>
> David, can you try this? It works for me now at least.
>
> $ diff -urp sys/amd64/amd64/mp_machdep.c.orig sys/amd64/amd64/mp_machdep.c
> --- sys/amd64/amd64/mp_machdep.c.orig 2009-05-01 00:59:55.000000000 +0400
> +++ sys/amd64/amd64/mp_machdep.c 2009-05-01 01:00:20.000000000 +0400
> @@ -309,6 +309,8 @@ cpu_topo(void)
>  {
>         int cg_flags;
>
> +       topo_probe();
> +
>         /*
>          * Determine whether any threading flags are
>          * necessry.
> $ diff -urp sys/i386/i386/mp_machdep.c.orig sys/i386/i386/mp_machdep.c
> --- sys/i386/i386/mp_machdep.c.orig 2009-05-01 01:01:53.000000000 +0400
> +++ sys/i386/i386/mp_machdep.c 2009-05-01 01:01:41.000000000 +0400
> @@ -362,6 +362,8 @@ cpu_topo(void)
>  {
>         int cg_flags;
>
> +       topo_probe();
> +
>         /*
>          * Determine whether any threading flags are
>          * necessry.

Ah, you're right.  More complete patch is attached.

Jung-uk Kim

--Boundary-00=_bfh+JouM/gBbBo9
Content-Type: text/x-diff; charset="iso-8859-1"; name="mp_machdep.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="mp_machdep.diff"

--- sys/i386/i386/mp_machdep.c	(revision 191699)
+++ sys/i386/i386/mp_machdep.c	(working copy)
@@ -267,6 +267,8 @@
 		else if (type == CPUID_TYPE_CORE)
 			cpu_cores = cnt;
 	}
+	if (cpu_cores == 0)
+		cpu_cores = 1;
 	if (cpu_logical == 0)
 		cpu_logical = 1;
 	cpu_cores /= cpu_logical;
@@ -345,16 +347,21 @@
 static void
 topo_probe(void)
 {
+	static int cpu_topo_probed = 0;
 
+	if (cpu_topo_probed)
+		return;
+
 	logical_cpus = logical_cpus_mask = 0;
 	if (cpu_high >= 0xb)
 		topo_probe_0xb();
 	else if (cpu_high)
 		topo_probe_0x4();
 	if (cpu_cores == 0)
-		cpu_cores = mp_ncpus;
+		cpu_cores = mp_ncpus > 0 ? mp_ncpus : 1;
 	if (cpu_logical == 0)
 		cpu_logical = 1;
+	cpu_topo_probed = 1;
 }
 
 struct cpu_group *
@@ -366,6 +373,7 @@
 	 * Determine whether any threading flags are
 	 * necessry.
 	 */
+	topo_probe();
 	if (cpu_logical > 1 && hyperthreading_cpus)
 		cg_flags = CG_FLAG_HTT;
 	else if (cpu_logical > 1)
--- sys/amd64/amd64/mp_machdep.c	(revision 191699)
+++ sys/amd64/amd64/mp_machdep.c	(working copy)
@@ -214,6 +214,8 @@
 		else if (type == CPUID_TYPE_CORE)
 			cpu_cores = cnt;
 	}
+	if (cpu_cores == 0)
+		cpu_cores = 1;
 	if (cpu_logical == 0)
 		cpu_logical = 1;
 	cpu_cores /= cpu_logical;
@@ -292,16 +294,21 @@
 static void
 topo_probe(void)
 {
+	static int cpu_topo_probed = 0;
 
+	if (cpu_topo_probed)
+		return;
+
 	logical_cpus = logical_cpus_mask = 0;
 	if (cpu_high >= 0xb)
 		topo_probe_0xb();
 	else if (cpu_high)
 		topo_probe_0x4();
 	if (cpu_cores == 0)
-		cpu_cores = mp_ncpus;
+		cpu_cores = mp_ncpus > 0 ? mp_ncpus : 1;
 	if (cpu_logical == 0)
 		cpu_logical = 1;
+	cpu_topo_probed = 1;
 }
 
 struct cpu_group *
@@ -313,6 +320,7 @@
 	 * Determine whether any threading flags are
 	 * necessry.
 	 */
+	topo_probe();
 	if (cpu_logical > 1 && hyperthreading_cpus)
 		cg_flags = CG_FLAG_HTT;
 	else if (cpu_logical > 1)

--Boundary-00=_bfh+JouM/gBbBo9--
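
The attached patch combines two guards: cpu_topo() calls topo_probe() itself, and topo_probe() runs at most once and never leaves cpu_cores or cpu_logical at zero. What follows is a minimal user-space sketch of that pattern, not the kernel code. The globals reuse names from the patch, but topo_probe_sketch(), packages_sketch() and probe_calls are hypothetical, CPUID detection is left out, and the division in packages_sketch() is only assumed to resemble what cpu_topo() does with (cpu_cores * cpu_logical).

```c
#include <assert.h>

/* Stand-ins for the kernel globals named in the patch (all start at 0). */
static int cpu_cores;
static int cpu_logical;
static int mp_ncpus;	/* 0 models "not set yet", as on the UP boot path */

/* Hypothetical counter, only here to show the probe body runs once. */
static int probe_calls;

/*
 * Sketch of the guarded probe: run once, and clamp both counters to at
 * least 1 so a later divide or modulo by their product cannot trap.
 */
static void
topo_probe_sketch(void)
{
	static int cpu_topo_probed = 0;

	if (cpu_topo_probed)
		return;

	probe_calls++;
	/* CPUID-based detection omitted; counters may still be zero here. */
	if (cpu_cores == 0)
		cpu_cores = mp_ncpus > 0 ? mp_ncpus : 1;
	if (cpu_logical == 0)
		cpu_logical = 1;
	cpu_topo_probed = 1;
}

/*
 * Hypothetical consumer: probes first, then divides. Without the probe
 * call and the clamps above, the divisor could be zero.
 */
static int
packages_sketch(int ncpus)
{
	topo_probe_sketch();
	assert(cpu_cores * cpu_logical != 0);
	return (ncpus / (cpu_cores * cpu_logical));
}
```

Calling packages_sketch() before anything else has set the counters still yields a non-zero divisor, and repeated calls do not re-run the probe body.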