From: John Baldwin
To: freebsd-hackers@freebsd.org
Cc: freebsd-emulation@freebsd.org, Juergen Lock
Date: Tue, 27 Nov 2007 08:24:55 -0500
Subject: Re: double panic, and whats apic_cmd? (kqemu crash...)
Message-Id: <200711270824.55839.jhb@freebsd.org>
In-Reply-To: <20071118224345.GA81339@saturn.kn-bremen.de>

On Sunday 18 November 2007 05:43:45 pm Juergen Lock wrote:
> On Sun, Nov 18, 2007 at 03:05:33AM +0100, Juergen Lock wrote:
> > Ok I finally have an amd64 smp box here that i can play with, and tried
> > to reproduce http://www.freebsd.org/cgi/query-pr.cgi?pr=113430 - and I got
> > the following crash:
> >[...]
>
> Ok, the crashes seem to be pretty random, I got a few more:
> (btw I disabled -DSMP in the kqemu build since it doesn't seem to help,
> and it doesn't seem to be used anywhere else.  Also I forgot to say
> I also have KDB_TRACE and KDB_UNATTENDED in the kernel config.  Oh and
> I had a few hangs too, and never could get into ddb in those cases...)
>
> [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd".
>
> Unread portion of the kernel message buffer:
> kernel trap 12 with interrupts disabled
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 01
> fault virtual address = 0x246
> fault code = supervisor read instruction, page not present
> instruction pointer = 0x8:0x246
> stack pointer = 0x10:0xffffffff9fae4b50
> frame pointer = 0x10:0xffffffff9fae4b80
> code segment = base 0x0, limit 0xfffff, type 0x1b
>     = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = resume, IOPL = 0
> current process = 11 (idle: cpu1)
> trap number = 12
> <0>
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 01
> fault virtual address = 0xc011dbfb
> fault code = supervisor read instruction, page not present
> instruction pointer = 0x8:0xc011dbfb
> stack pointer = 0x10:0xffffffff9fae47d0
> frame pointer = 0x10:0x801de4000
> code segment = base 0x0, limit 0xfffff, type 0x1b
>     = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = trace trap, interrupt enabled, nested task, IOPL = 3
> current process = 11 (idle: cpu1)
> trap number = 12
> panic: page fault
> cpuid = 1
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> panic() at panic+0x17a
> trap_fatal() at trap_fatal+0x29f
> trap_pfault() at trap_pfault+0x294
> trap() at trap+0x2ea
> sendsig() at sendsig+0x2aa
> sched_choose() at sched_choose+0x8c
> choosethread() at choosethread+0x2b
> sched_switch() at sched_switch+0x184
> mi_switch() at mi_switch+0x189
> ast() at ast+0x1e8
> doreti_ast() at doreti_ast+0x1f
> Uptime: 37m8s
> Physical memory: 986 MB
> Dumping 152 MB: 137 121 105 89 73 57 41 25 9
>
> #0 doadump () at pcpu.h:194
> 194 __asm __volatile("movq %%gs:0,%0" : "=r" (td));
> (kgdb) bt
> #0 doadump () at pcpu.h:194
> #1 0xffffffff80484b18 in boot (howto=260) at ../../../kern/kern_shutdown.c:409
> #2 0xffffffff80484f77 in panic (fmt=Variable "fmt" is not available.
> ) at ../../../kern/kern_shutdown.c:563
> #3 0xffffffff8070de6f in trap_fatal (frame=0xc, eva=Variable "eva" is not available.
> )
>     at ../../../amd64/amd64/trap.c:697
> #4 0xffffffff8070e254 in trap_pfault (frame=0xffffffff9fae4720, usermode=0)
>     at ../../../amd64/amd64/trap.c:614
> #5 0xffffffff8070ec0a in trap (frame=0xffffffff9fae4720)
>     at ../../../amd64/amd64/trap.c:383
> #6 0xffffffff806fcd4a in sendsig (catcher=0x405460, ksi=Variable "ksi" is not available.
> )
>     at ../../../amd64/amd64/machdep.c:326
> #7 0xffffffff804a16ec in sched_choose () at ../../../kern/sched_4bsd.c:1256
> #8 0xffffffff804a174b in choosethread () at kern_switch.c:137
> #9 0xffffffff804a2984 in sched_switch (td=0xffffff000209b680,
>     newtd=0xffffff00021a18c0, flags=13) at ../../../kern/sched_4bsd.c:907
> #10 0xffffffff8048cc99 in mi_switch (flags=2, newtd=0x0)
>     at ../../../kern/kern_synch.c:442
> #11 0xffffffff804b7068 in ast (framep=0xffffffff9fae4c70)
>     at ../../../kern/subr_trap.c:239
> #12 0xffffffff806f4999 in doreti_ast () at ../../../amd64/amd64/exception.S:468
> #13 0x0000000811d87d74 in ?? ()
> #14 0x0000000000000005 in ?? ()
> #15 0x00000000000010e0 in ?? ()
> ---Type <return> to continue, or q <return> to quit---
> #16 0x0000000811d87d8c in ?? ()
> #17 0x0000000801de4000 in ?? ()
> #18 0x0000000741e00000 in ?? ()
> #19 0x000000000215dd30 in ?? ()
> #20 0x0000000000d49160 in ?? ()
> #21 0x00000000c016fdf0 in ?? ()
> #22 0x0000000000000000 in ?? ()
> #23 0x0000000801de84d0 in ?? ()
> #24 0xffffffffbfffffff in ?? ()
> #25 0x0000000000063fff in ?? ()
> #26 0x0000000801de4000 in ?? ()
> #27 0x0000000000063fff in ?? ()
> #28 0x0000000000000016 in ?? ()
> #29 0x0000000000000000 in ?? ()
> #30 0x0000000000000000 in ?? ()
> #31 0x0000000000000000 in ?? ()
> #32 0x000000000215dd0c in ?? ()
> #33 0x000000000000002b in ?? ()
> #34 0x0000000000000286 in ?? ()
> #35 0x00007fffffffb608 in ?? ()
> #36 0x0000000000000023 in ?? ()
> #37 0x0000000000000000 in ?? ()
> #38 0x0000000000000000 in ?? ()
> ---Type <return> to continue, or q <return> to quit---
> #39 0x0000000000c9f000 in ?? ()
> #40 0x00000000fffffffd in ?? ()
> #41 0xffffff0001080460 in ?? ()
> #42 0xffffff000209b680 in ?? ()
> #43 0x0000000000000001 in ?? ()
> #44 0xffffffff9fae4bb0 in ?? ()
> #45 0xffffffff9fae4b68 in ?? ()
> #46 0xffffff00010819c0 in ?? ()
> #47 0xffffffff804a2984 in sched_switch (td=0xd49160, newtd=0x63fff,
>     flags=409599) at ../../../kern/sched_4bsd.c:907
> Previous frame inner to this frame (corrupt stack?)
> (kgdb) q
> iapetus# exit
>
> and
>
> [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd".
>
> Unread portion of the kernel message buffer:
> kernel trap 12 with interrupts disabled
>
>
> Fatal trap 0: while in kernel mode
> cpuid = 0; apic id = 00
> instruction pointer = 0x4300:0xffffffff9fae41c0
> stack pointer = 0x10:0xffffffff9fae4190
> frame pointer = 0x10:0x5
> code segment = base 0x0, limit 0x0, type 0x0
>     = DPL 0, pres 0, long 0, def32 0, gran 0
> processor eflags = resume, IOPL = 0
> current process = 904 (qemu-system-x86_64)
> trap number = kernel trap 12 with interrupts disabled
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address = 0x46
> fault code = supervisor read data, page not present
> instruction pointer = 0x8:0xffffffff804aff9d
> stack pointer = 0x10:0xffffffff9fae3d20
> frame pointer = 0x10:0xffffffff9fae3e80
> code segment = base 0x0, limit 0xfffff, type 0x1b
>     = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = resume, IOPL = 0
> current process = 904 (qemu-system-x86_64)
> trap number = 12
> panic: page fault
> cpuid = 0
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> panic() at panic+0x17a
> trap_fatal() at trap_fatal+0x29f
> trap() at trap+0x242
> calltrap() at calltrap+0x8
> --- trap 0xc, rip = 0xffffffff804aff9d, rsp = 0xffffffff9fae3d20, rbp = 0xffffffff9fae3e80 ---
> kvprintf() at kvprintf+0x11ed
> printf() at printf+0xa4
> uart_z8530_class() at 0x3386
> swapb.6687() at swapb.6687+0x13f
> Uptime: 19m14s
> Physical memory: 986 MB
> Dumping 113 MB: (CTRL-C to abort) 98 82 66 (CTRL-C to abort) 50 34 18 2
>
> #0 doadump () at pcpu.h:194
> 194 __asm __volatile("movq %%gs:0,%0" : "=r" (td));
> (kgdb) bt
> #0 doadump () at pcpu.h:194
> #1 0xffffffff80484b18 in boot (howto=260) at ../../../kern/kern_shutdown.c:409
> #2 0xffffffff80484f77 in panic (fmt=Variable "fmt" is not available.
> ) at ../../../kern/kern_shutdown.c:563
> #3 0xffffffff8070de6f in trap_fatal (frame=0xc, eva=Variable "eva" is not available.
> )
>     at ../../../amd64/amd64/trap.c:697
> #4 0xffffffff8070eb62 in trap (frame=0xffffffff9fae3c70)
>     at ../../../amd64/amd64/trap.c:248
> #5 0xffffffff806f3e0e in calltrap () at ../../../amd64/amd64/exception.S:169
> #6 0xffffffff804aff9d in kvprintf (fmt=0xffffffff807febff "\n",
>     func=0xffffffff804b07d0 , arg=0xffffffff9fae3e90, radix=10,
>     ap=0xffffffff9fae3ec0) at ../../../kern/subr_prf.c:819
> #7 0xffffffff804b0284 in printf (fmt=Variable "fmt" is not available.
> ) at ../../../kern/subr_prf.c:314
> #8 0x0000000000003386 in ?? ()
> #9 0xffffffff9fae4090 in ?? ()
> #10 0xffffffff806f4667 in Xtimerint () at apic_vector.S:103
> Previous frame identical to this frame (corrupt stack?)
> (kgdb) q
> iapetus# exit
>
> Script done on Sun Nov 18 19:11:41 2007
>
> and:
>
> [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd".
>
> Unread portion of the kernel message buffer:
> kernel trap 12 with interrupts disabled
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address = 0xd
> fault code = supervisor read data, page not present
> instruction pointer = 0x8:0xffffffff8073d743
> stack pointer = 0x10:0xffffffff9fae4610
> frame pointer = 0x10:0x0
> code segment = base 0x0, limit 0xfffff, type 0x1b
>     = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = resume, IOPL = 0
> current process = 948 (qemu-system-x86_64)
> trap number = 12
> panic: page fault
> cpuid = 0
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> panic() at panic+0x17a
> trap_fatal() at trap_fatal+0x29f
> dmapbase() at 0xffffff0001080460
> dmapbase() at 0xffffff00010819c0
> Uptime: 23m57s
> Physical memory: 986 MB
> Dumping 152 MB: 137 121 105 89 73 57 41 25 9
>
> #0 doadump () at pcpu.h:194
> 194 __asm __volatile("movq %%gs:0,%0" : "=r" (td));
> (kgdb) bt
> #0 doadump () at pcpu.h:194
> #1 0xffffffff80484b18 in boot (howto=260) at ../../../kern/kern_shutdown.c:409
> #2 0xffffffff80484f77 in panic (fmt=Variable "fmt" is not available.
> ) at ../../../kern/kern_shutdown.c:563
> #3 0xffffffff8070de6f in trap_fatal (frame=0xc, eva=Variable "eva" is not available.
> )
>     at ../../../amd64/amd64/trap.c:697
> #4 0xffffff0001080460 in ?? ()
> #5 0xffffffff80a4d8a0 in lapics ()
> #6 0xffffff00010819c0 in ?? ()
> #7 0x0000000000000000 in ?? ()
> #8 0xffffff0001055600 in ?? ()
> #9 0xffffffff9fae44e0 in ?? ()
> #10 0xffffffff8044ffed in hardclock_cpu (usermode=Variable "usermode" is not available.
> )
>     at ../../../kern/kern_clock.c:224
> #11 0xffffff00010819c0 in ?? ()
> #12 0x0000000000000000 in ?? ()
> #13 0xffffff000215b000 in ?? ()
> #14 0xffffffff9fae4610 in ?? ()
> #15 0xffffff000215b000 in ?? ()
> #16 0x0000000000000000 in ?? ()
> #17 0xffffffff80a26430 in main_console ()
> #18 0x00000000000213bf in ?? ()
> #19 0xffffff00010819c0 in ?? ()
> #20 0x0000000000000000 in ?? ()
> ---Type <return> to continue, or q <return> to quit---
> #21 0x0000000000000000 in ?? ()
> #22 0xffffffff80a2fd78 in runq ()
> #23 0xffffff000215b000 in ?? ()
> #24 0x0000000000000001 in ?? ()
> #25 0xffffffff8047953c in _mtx_lock_spin (m=0xffffffff80a26430, tid=136126,
>     opts=Variable "opts" is not available.
> ) at cpufunc.h:343
> Previous frame inner to this frame (corrupt stack?)
> (kgdb) q
> iapetus# exit
>
> kgdb still seems to be kind of confused tho, afaict runq is a variable
> not a function...  Anyone can make head or tail of these crashes?

I would check your hardware for bad RAM, etc.

-- 
John Baldwin