Date:      Thu, 29 Nov 2007 00:50:42 +0100
From:      Juergen Lock <nox@jelal.kn-bremen.de>
To:        John Baldwin <jhb@freebsd.org>
Cc:        freebsd-hackers@freebsd.org, freebsd-emulation@freebsd.org
Subject:   Re: double panic, and whats apic_cmd? (kqemu crash...)
Message-ID:  <20071128235042.GA40147@saturn.kn-bremen.de>
In-Reply-To: <200711270824.55839.jhb@freebsd.org>
References:  <20071118020533.GA57425@saturn.kn-bremen.de> <20071118224345.GA81339@saturn.kn-bremen.de> <200711270824.55839.jhb@freebsd.org>

On Tue, Nov 27, 2007 at 08:24:55AM -0500, John Baldwin wrote:
> On Sunday 18 November 2007 05:43:45 pm Juergen Lock wrote:
> > On Sun, Nov 18, 2007 at 03:05:33AM +0100, Juergen Lock wrote:
> > > Ok I finally have an amd64 smp box here that i can play with, and tried
> > > to reproduce http://www.freebsd.org/cgi/query-pr.cgi?pr=113430 - and I got
> > > the following crash:
> > >[...]
> > 
> > Ok, the crashes seem to be pretty random, I got a few more:
> > (btw I disabled -DSMP in the kqemu build since it doesn't seem to help,
> > and it doesn't seem to be used anywhere else.  Also I forgot to say
> > I also have KDB_TRACE and KDB_UNATTENDED in the kernel config.  Oh and
> > I had a few hangs too, and never could get into ddb in those cases...)
> > 
> > [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> > GNU gdb 6.1.1 [FreeBSD]
> > Copyright 2004 Free Software Foundation, Inc.
> > GDB is free software, covered by the GNU General Public License, and you are
> > welcome to change it and/or distribute copies of it under certain conditions.
> > Type "show copying" to see the conditions.
> > There is absolutely no warranty for GDB.  Type "show warranty" for details.
> > This GDB was configured as "amd64-marcel-freebsd".
> > 
> > Unread portion of the kernel message buffer:
> > kernel trap 12 with interrupts disabled
> > 
> > 
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 1; apic id = 01
> > fault virtual address	= 0x246
> > fault code		= supervisor read instruction, page not present
> > instruction pointer	= 0x8:0x246
> > stack pointer	        = 0x10:0xffffffff9fae4b50
> > frame pointer	        = 0x10:0xffffffff9fae4b80
> > code segment		= base 0x0, limit 0xfffff, type 0x1b
> > 			= DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags	= resume, IOPL = 0
> > current process		= 11 (idle: cpu1)
> > trap number		= 12
> > <0>
> > 
> > 
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 1; apic id = 01
> > fault virtual address	= 0xc011dbfb
> > fault code		= supervisor read instruction, page not present
> > instruction pointer	= 0x8:0xc011dbfb
> > stack pointer	        = 0x10:0xffffffff9fae47d0
> > frame pointer	        = 0x10:0x801de4000
> > code segment		= base 0x0, limit 0xfffff, type 0x1b
> > 			= DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags	= trace trap, interrupt enabled, nested task, IOPL = 3
> > current process		= 11 (idle: cpu1)
> > trap number		= 12
> > panic: page fault
> > cpuid = 1
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> > panic() at panic+0x17a
> > trap_fatal() at trap_fatal+0x29f
> > trap_pfault() at trap_pfault+0x294
> > trap() at trap+0x2ea
> > sendsig() at sendsig+0x2aa
> > sched_choose() at sched_choose+0x8c
> > choosethread() at choosethread+0x2b
> > sched_switch() at sched_switch+0x184
> > mi_switch() at mi_switch+0x189
> > ast() at ast+0x1e8
> > doreti_ast() at doreti_ast+0x1f
> > Uptime: 37m8s
> > Physical memory: 986 MB
> > Dumping 152 MB: 137 121 105 89 73 57 41 25 9
> > 
> > #0  doadump () at pcpu.h:194
> > 194		__asm __volatile("movq %%gs:0,%0" : "=r" (td));
> > (kgdb) bt
> > #0  doadump () at pcpu.h:194
> > #1  0xffffffff80484b18 in boot (howto=260) at ../../../kern/kern_shutdown.c:409
> > #2  0xffffffff80484f77 in panic (fmt=Variable "fmt" is not available.
> > ) at ../../../kern/kern_shutdown.c:563
> > #3  0xffffffff8070de6f in trap_fatal (frame=0xc, eva=Variable "eva" is not available.
> > )
> >     at ../../../amd64/amd64/trap.c:697
> > #4  0xffffffff8070e254 in trap_pfault (frame=0xffffffff9fae4720, usermode=0)
> >     at ../../../amd64/amd64/trap.c:614
> > #5  0xffffffff8070ec0a in trap (frame=0xffffffff9fae4720)
> >     at ../../../amd64/amd64/trap.c:383
> > #6  0xffffffff806fcd4a in sendsig (catcher=0x405460, ksi=Variable "ksi" is not available.
> > )
> >     at ../../../amd64/amd64/machdep.c:326
> > #7  0xffffffff804a16ec in sched_choose () at ../../../kern/sched_4bsd.c:1256
> > #8  0xffffffff804a174b in choosethread () at kern_switch.c:137
> > #9  0xffffffff804a2984 in sched_switch (td=0xffffff000209b680, 
> >     newtd=0xffffff00021a18c0, flags=13) at ../../../kern/sched_4bsd.c:907
> > #10 0xffffffff8048cc99 in mi_switch (flags=2, newtd=0x0)
> >     at ../../../kern/kern_synch.c:442
> > #11 0xffffffff804b7068 in ast (framep=0xffffffff9fae4c70)
> >     at ../../../kern/subr_trap.c:239
> > #12 0xffffffff806f4999 in doreti_ast () at ../../../amd64/amd64/exception.S:468
> > #13 0x0000000811d87d74 in ?? ()
> > #14 0x0000000000000005 in ?? ()
> > #15 0x00000000000010e0 in ?? ()
> > ---Type <return> to continue, or q <return> to quit---
> > #16 0x0000000811d87d8c in ?? ()
> > #17 0x0000000801de4000 in ?? ()
> > #18 0x0000000741e00000 in ?? ()
> > #19 0x000000000215dd30 in ?? ()
> > #20 0x0000000000d49160 in ?? ()
> > #21 0x00000000c016fdf0 in ?? ()
> > #22 0x0000000000000000 in ?? ()
> > #23 0x0000000801de84d0 in ?? ()
> > #24 0xffffffffbfffffff in ?? ()
> > #25 0x0000000000063fff in ?? ()
> > #26 0x0000000801de4000 in ?? ()
> > #27 0x0000000000063fff in ?? ()
> > #28 0x0000000000000016 in ?? ()
> > #29 0x0000000000000000 in ?? ()
> > #30 0x0000000000000000 in ?? ()
> > #31 0x0000000000000000 in ?? ()
> > #32 0x000000000215dd0c in ?? ()
> > #33 0x000000000000002b in ?? ()
> > #34 0x0000000000000286 in ?? ()
> > #35 0x00007fffffffb608 in ?? ()
> > #36 0x0000000000000023 in ?? ()
> > #37 0x0000000000000000 in ?? ()
> > #38 0x0000000000000000 in ?? ()
> > ---Type <return> to continue, or q <return> to quit---
> > #39 0x0000000000c9f000 in ?? ()
> > #40 0x00000000fffffffd in ?? ()
> > #41 0xffffff0001080460 in ?? ()
> > #42 0xffffff000209b680 in ?? ()
> > #43 0x0000000000000001 in ?? ()
> > #44 0xffffffff9fae4bb0 in ?? ()
> > #45 0xffffffff9fae4b68 in ?? ()
> > #46 0xffffff00010819c0 in ?? ()
> > #47 0xffffffff804a2984 in sched_switch (td=0xd49160, newtd=0x63fff, 
> >     flags=409599) at ../../../kern/sched_4bsd.c:907
> > Previous frame inner to this frame (corrupt stack?)
> > (kgdb) q
> > iapetus# exit
> > 
> >  and
> > 
> > [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> > GNU gdb 6.1.1 [FreeBSD]
> > Copyright 2004 Free Software Foundation, Inc.
> > GDB is free software, covered by the GNU General Public License, and you are
> > welcome to change it and/or distribute copies of it under certain conditions.
> > Type "show copying" to see the conditions.
> > There is absolutely no warranty for GDB.  Type "show warranty" for details.
> > This GDB was configured as "amd64-marcel-freebsd".
> > 
> > Unread portion of the kernel message buffer:
> > kernel trap 12 with interrupts disabled
> > 
> > 
> > Fatal trap 0:  while in kernel mode
> > cpuid = 0; apic id = 00
> > instruction pointer	= 0x4300:0xffffffff9fae41c0
> > stack pointer	        = 0x10:0xffffffff9fae4190
> > frame pointer	        = 0x10:0x5
> > code segment		= base 0x0, limit 0x0, type 0x0
> > 			= DPL 0, pres 0, long 0, def32 0, gran 0
> > processor eflags	= resume, IOPL = 0
> > current process		= 904 (qemu-system-x86_64)
> > trap number		= kernel trap 12 with interrupts disabled
> > 
> > 
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 0; apic id = 00
> > fault virtual address	= 0x46
> > fault code		= supervisor read data, page not present
> > instruction pointer	= 0x8:0xffffffff804aff9d
> > stack pointer	        = 0x10:0xffffffff9fae3d20
> > frame pointer	        = 0x10:0xffffffff9fae3e80
> > code segment		= base 0x0, limit 0xfffff, type 0x1b
> > 			= DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags	= resume, IOPL = 0
> > current process		= 904 (qemu-system-x86_64)
> > trap number		= 12
> > panic: page fault
> > cpuid = 0
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> > panic() at panic+0x17a
> > trap_fatal() at trap_fatal+0x29f
> > trap() at trap+0x242
> > calltrap() at calltrap+0x8
> > --- trap 0xc, rip = 0xffffffff804aff9d, rsp = 0xffffffff9fae3d20, rbp = 0xffffffff9fae3e80 ---
> > kvprintf() at kvprintf+0x11ed
> > printf() at printf+0xa4
> > uart_z8530_class() at 0x3386
> > swapb.6687() at swapb.6687+0x13f
> > Uptime: 19m14s
> > Physical memory: 986 MB
> > Dumping 113 MB: (CTRL-C to abort)  98 82 66 (CTRL-C to abort)  50 34 18 2
> > 
> > #0  doadump () at pcpu.h:194
> > 194		__asm __volatile("movq %%gs:0,%0" : "=r" (td));
> > (kgdb) bt
> > #0  doadump () at pcpu.h:194
> > #1  0xffffffff80484b18 in boot (howto=260) at ../../../kern/kern_shutdown.c:409
> > #2  0xffffffff80484f77 in panic (fmt=Variable "fmt" is not available.
> > ) at ../../../kern/kern_shutdown.c:563
> > #3  0xffffffff8070de6f in trap_fatal (frame=0xc, eva=Variable "eva" is not available.
> > )
> >     at ../../../amd64/amd64/trap.c:697
> > #4  0xffffffff8070eb62 in trap (frame=0xffffffff9fae3c70)
> >     at ../../../amd64/amd64/trap.c:248
> > #5  0xffffffff806f3e0e in calltrap () at ../../../amd64/amd64/exception.S:169
> > #6  0xffffffff804aff9d in kvprintf (fmt=0xffffffff807febff "\n", 
> >     func=0xffffffff804b07d0 <putchar>, arg=0xffffffff9fae3e90, radix=10, 
> >     ap=0xffffffff9fae3ec0) at ../../../kern/subr_prf.c:819
> > #7  0xffffffff804b0284 in printf (fmt=Variable "fmt" is not available.
> > ) at ../../../kern/subr_prf.c:314
> > #8  0x0000000000003386 in ?? ()
> > #9  0xffffffff9fae4090 in ?? ()
> > #10 0xffffffff806f4667 in Xtimerint () at apic_vector.S:103
> > Previous frame identical to this frame (corrupt stack?)
> > (kgdb) q
> > iapetus# exit
> > 
> > Script done on Sun Nov 18 19:11:41 2007
> > 
> >  and:
> > 
> > [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> > GNU gdb 6.1.1 [FreeBSD]
> > Copyright 2004 Free Software Foundation, Inc.
> > GDB is free software, covered by the GNU General Public License, and you are
> > welcome to change it and/or distribute copies of it under certain conditions.
> > Type "show copying" to see the conditions.
> > There is absolutely no warranty for GDB.  Type "show warranty" for details.
> > This GDB was configured as "amd64-marcel-freebsd".
> > 
> > Unread portion of the kernel message buffer:
> > kernel trap 12 with interrupts disabled
> > 
> > 
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 0; apic id = 00
> > fault virtual address	= 0xd
> > fault code		= supervisor read data, page not present
> > instruction pointer	= 0x8:0xffffffff8073d743
> > stack pointer	        = 0x10:0xffffffff9fae4610
> > frame pointer	        = 0x10:0x0
> > code segment		= base 0x0, limit 0xfffff, type 0x1b
> > 			= DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags	= resume, IOPL = 0
> > current process		= 948 (qemu-system-x86_64)
> > trap number		= 12
> > panic: page fault
> > cpuid = 0
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> > panic() at panic+0x17a
> > trap_fatal() at trap_fatal+0x29f
> > dmapbase() at 0xffffff0001080460
> > dmapbase() at 0xffffff00010819c0
> > Uptime: 23m57s
> > Physical memory: 986 MB
> > Dumping 152 MB: 137 121 105 89 73 57 41 25 9
> > 
> > #0  doadump () at pcpu.h:194
> > 194		__asm __volatile("movq %%gs:0,%0" : "=r" (td));
> > (kgdb) bt
> > #0  doadump () at pcpu.h:194
> > #1  0xffffffff80484b18 in boot (howto=260) at ../../../kern/kern_shutdown.c:409
> > #2  0xffffffff80484f77 in panic (fmt=Variable "fmt" is not available.
> > ) at ../../../kern/kern_shutdown.c:563
> > #3  0xffffffff8070de6f in trap_fatal (frame=0xc, eva=Variable "eva" is not available.
> > )
> >     at ../../../amd64/amd64/trap.c:697
> > #4  0xffffff0001080460 in ?? ()
> > #5  0xffffffff80a4d8a0 in lapics ()
> > #6  0xffffff00010819c0 in ?? ()
> > #7  0x0000000000000000 in ?? ()
> > #8  0xffffff0001055600 in ?? ()
> > #9  0xffffffff9fae44e0 in ?? ()
> > #10 0xffffffff8044ffed in hardclock_cpu (usermode=Variable "usermode" is not available.
> > )
> >     at ../../../kern/kern_clock.c:224
> > #11 0xffffff00010819c0 in ?? ()
> > #12 0x0000000000000000 in ?? ()
> > #13 0xffffff000215b000 in ?? ()
> > #14 0xffffffff9fae4610 in ?? ()
> > #15 0xffffff000215b000 in ?? ()
> > #16 0x0000000000000000 in ?? ()
> > #17 0xffffffff80a26430 in main_console ()
> > #18 0x00000000000213bf in ?? ()
> > #19 0xffffff00010819c0 in ?? ()
> > #20 0x0000000000000000 in ?? ()
> > ---Type <return> to continue, or q <return> to quit---
> > #21 0x0000000000000000 in ?? ()
> > #22 0xffffffff80a2fd78 in runq ()
> > #23 0xffffff000215b000 in ?? ()
> > #24 0x0000000000000001 in ?? ()
> > #25 0xffffffff8047953c in _mtx_lock_spin (m=0xffffffff80a26430, tid=136126, 
> >     opts=Variable "opts" is not available.
> > ) at cpufunc.h:343
> > Previous frame inner to this frame (corrupt stack?)
> > (kgdb) q
> > iapetus# exit
> > 
> >  kgdb still seems to be kind of confused tho, afaict runq is a variable
> > not a function...  Anyone can make head or tail of these crashes?
> 
> I would check your hardware for bad RAM, etc.

Well, I doubt it's that...  It works when running a UP kernel, and it works
on a 6.3beta2 i386 install on the same box with SMP.  Also, I haven't
seen any crashes on that box other than with this amd64 kqemu on the
SMP kernel (it also survived building a world and kernel with -j4).
In fact, I haven't received a single report of kqemu/amd64/SMP working
for anyone.  (Do you want to try? :)  I _suspect_ that either kqemu/amd64
does things differently than on i386, or differences between the
i386 and amd64 kernels trigger the problem.

 Fwiw, I have a report of kqemu/amd64 crashing a Linux SMP host too,
though there only with a Windows guest; Linux guests (which I was testing)
seem to work there.

 Oh and I left memtest86 running on that box overnight and it found nothing...
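
 For anyone who wants to compare the trap records from the dumps quoted
above side by side, here is a rough sketch that pulls the "name = value"
fields out of each "Fatal trap" block.  It is only an illustration: the
function name parse_trap_reports and the dict layout are made up for this
example, and it assumes the console text looks like the excerpts in this
mail (optionally with "> " quote markers in front).

```python
import re

# "Fatal trap 12: page fault while in kernel mode"
TRAP_RE = re.compile(r"Fatal trap (\d+):\s*(.*)")
# "fault virtual address<TAB>= 0x246" -> ("fault virtual address", "0x246")
FIELD_RE = re.compile(r"([A-Za-z][A-Za-z ]*?)\s*=\s*(.*)")
# Leading mail quote markers such as "> > "
QUOTE_RE = re.compile(r"^(?:\s*>)*\s*")


def parse_trap_reports(text):
    """Return one dict per 'Fatal trap' block found in the console text.

    Hypothetical helper for this example only.  A block starts at a
    'Fatal trap N:' line and ends at its 'trap number' line.
    """
    reports = []
    current = None
    for line in text.splitlines():
        line = QUOTE_RE.sub("", line).rstrip()
        m = TRAP_RE.match(line)
        if m:
            current = {"trap": int(m.group(1)), "description": m.group(2).strip()}
            reports.append(current)
            continue
        if current is None:
            continue
        f = FIELD_RE.match(line)
        if f:
            key, value = f.group(1), f.group(2).strip()
            current[key] = value
            if key == "trap number":
                current = None  # last field of a block
    return reports


if __name__ == "__main__":
    sample = (
        "> > Fatal trap 12: page fault while in kernel mode\n"
        "> > cpuid = 1; apic id = 01\n"
        "> > fault virtual address\t= 0x246\n"
        "> > instruction pointer\t= 0x8:0x246\n"
        "> > trap number\t\t= 12\n"
    )
    for report in parse_trap_reports(sample):
        print(report["trap"], report["fault virtual address"],
              report["instruction pointer"])
```

 Continuation lines that start with "=" (the second half of the "code
segment" entry) are skipped on purpose, since they have no field name.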

 Thanx,
	Juergen

PS: some doc about kqemu:
	http://fabrice.bellard.free.fr/qemu/kqemu-tech.html


