From: Michael toth
Date: Mon, 28 Jul 2008 13:16:39 -0400
To: Kostik Belousov
Cc: Kris Kennaway, Michael Toth, freebsd-stable@freebsd.org
Subject: Re: 7.0 Crashing
Message-ID: <488DFEF7.5000802@queldor.net>
In-Reply-To: <20080728101840.GK97161@deviant.kiev.zoral.com.ua>

Kostik Belousov wrote:
> On Sun, Jul 27, 2008 at 06:54:59PM -0400, Michael Toth wrote:
>
>>
>> Kostik Belousov wrote:
>>
>>> On Sun, Jul 27, 2008 at 04:25:15PM -0400, Michael toth wrote:
>>>
>>>
>>>> Fatal trap 12: page fault while in kernel mode
>>>> cpuid = 4; apic id = 04
>>>> fault virtual address = 0x188
>>>> fault code = supervisor read, page not present
>>>> instruction pointer = 0x20:0xc0775284
>>>> stack pointer = 0x28:0xe7d6bad0
>>>> frame pointer = 0x28:0xe7d6bae8
>>>> code segment = base 0x0, limit 0xfffff, type 0x1b
>>>> = DPL 0, pres 1, def32 1, gran 1
>>>> processor eflags = interrupt enabled, resume, IOPL = 0
>>>> current process = 4838 (egrep)
>>>> trap number = 12
>>>> panic: page fault
>>>> cpuid = 4
>>>> Uptime: 1h2m48s
>>>> Physical memory: 2035 MB
>>>> Dumping 87 MB: 72 56 40 24 8
>>>>
>>>> Reading symbols from /boot/kernel/acpi.ko...Reading symbols from
>>>> /boot/kernel/acpi.ko.symbols...done.
>>>> done.
>>>> Loaded symbols for /boot/kernel/acpi.ko
>>>> #0 doadump () at pcpu.h:195
>>>> 195 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
>>>> (kgdb) backtrace
>>>> #0 doadump () at pcpu.h:195
>>>> #1 0xc0782597 in boot (howto=260) at
>>>> /usr/src/sys/kern/kern_shutdown.c:418
>>>> #2 0xc0782859 in panic (fmt=Variable "fmt" is not available.
>>>> ) at /usr/src/sys/kern/kern_shutdown.c:572
>>>> #3 0xc0a8b39c in trap_fatal (frame=0xe7d6ba90, eva=392) at
>>>> /usr/src/sys/i386/i386/trap.c:899
>>>> #4 0xc0a8b620 in trap_pfault (frame=0xe7d6ba90, usermode=0, eva=392) at
>>>> /usr/src/sys/i386/i386/trap.c:812
>>>> #5 0xc0a8bfcc in trap (frame=0xe7d6ba90) at
>>>> /usr/src/sys/i386/i386/trap.c:490
>>>> #6 0xc0a71bdb in calltrap () at /usr/src/sys/i386/i386/exception.s:139
>>>> #7 0xc0775284 in _mtx_lock_sleep (m=0xc600d174, tid=3318745216, opts=0,
>>>> file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:339
>>>> #8 0xc09a93d7 in vm_fault (map=0xc56b5570, vaddr=671809536,
>>>> fault_type=2 '\002', fault_flags=8) at /usr/src/sys/vm/vm_fault.c:293
>>>> #9 0xc0a8b50b in trap_pfault (frame=0xe7d6bd38, usermode=1,
>>>> eva=671813488) at /usr/src/sys/i386/i386/trap.c:789
>>>> #10 0xc0a8be57 in trap (frame=0xe7d6bd38) at
>>>> /usr/src/sys/i386/i386/trap.c:357
>>>> #11 0xc0a71bdb in calltrap () at /usr/src/sys/i386/i386/exception.s:139
>>>> #12 0x2806e607 in ?? ()
>>>> Previous frame inner to this frame (corrupt stack?)
>>>> (kgdb) up
>>>> #1 0xc0782597 in boot (howto=260) at
>>>> /usr/src/sys/kern/kern_shutdown.c:418
>>>> 418 doadump();
>>>> (kgdb) up
>>>> #2 0xc0782859 in panic (fmt=Variable "fmt" is not available.
>>>> ) at /usr/src/sys/kern/kern_shutdown.c:572
>>>> 572 boot(bootopt);
>>>> (kgdb) up
>>>> #3 0xc0a8b39c in trap_fatal (frame=0xe7d6ba90, eva=392) at
>>>> /usr/src/sys/i386/i386/trap.c:899
>>>> 899 panic("%s", trap_msg[type]);
>>>> (kgdb) up
>>>> #4 0xc0a8b620 in trap_pfault (frame=0xe7d6ba90, usermode=0, eva=392) at
>>>> /usr/src/sys/i386/i386/trap.c:812
>>>> 812 trap_fatal(frame, eva);
>>>> (kgdb) up
>>>> #5 0xc0a8bfcc in trap (frame=0xe7d6ba90) at
>>>> /usr/src/sys/i386/i386/trap.c:490
>>>> 490 (void) trap_pfault(frame, FALSE, eva);
>>>> (kgdb) up
>>>> #6 0xc0a71bdb in calltrap () at /usr/src/sys/i386/i386/exception.s:139
>>>> 139 call trap
>>>> Current language: auto; currently asm
>>>> (kgdb) up
>>>> #7 0xc0775284 in _mtx_lock_sleep (m=0xc600d174, tid=3318745216, opts=0,
>>>> file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:339
>>>> 339 owner = (struct thread *)(v &
>>>> ~MTX_FLAGMASK);
>>>> Current language: auto; currently c
>>>> (kgdb) up
>>>> #8 0xc09a93d7 in vm_fault (map=0xc56b5570, vaddr=671809536,
>>>> fault_type=2 '\002', fault_flags=8) at /usr/src/sys/vm/vm_fault.c:293
>>>> 293 VM_OBJECT_LOCK(fs.first_object);
>>>> (kgdb) p fs
>>>> $1 = {m = 0x0, object = 0x12, pindex = 13878757899709627520, first_m =
>>>> 0xc5f0a8b8, first_object = 0xc600d174, first_pindex = 0, map =
>>>> 0xc56b5570, entry = 0xc59fc7f8, lookup_still_valid = 2, vp = 0xc55c5220}
>>>> (kgdb) p fs.first_object
>>>> $2 = 0xc600d174
>>>> (kgdb)
>>>>
>>>>
>>> Please, show the output of
>>> p/x *(fs.first_object)
>>>
>>> BTW, you have said that you got a lot of the panics. Are backtraces
>>> the same for all of them ?
>>>
>>>
>> Here is the p/x *(fs.first_object) ..
>> and it appears that vmcore.6 is different (vmcore.6 is new from a few
>> hours ago)
>>
>> So does this point to a hardware issue?
>>
>> Thanks
>>
>> # kgdb kernel.debug /var/crash/vmcore.5
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB. Type "show warranty" for details.
>> This GDB was configured as "i386-marcel-freebsd"...
>>
>> Unread portion of the kernel message buffer:
>>
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 4; apic id = 04
>> fault virtual address = 0x188
>> fault code = supervisor read, page not present
>> instruction pointer = 0x20:0xc0775284
>> stack pointer = 0x28:0xe7d6bad0
>> frame pointer = 0x28:0xe7d6bae8
>> code segment = base 0x0, limit 0xfffff, type 0x1b
>> = DPL 0, pres 1, def32 1, gran 1
>> processor eflags = interrupt enabled, resume, IOPL = 0
>> current process = 4838 (egrep)
>> trap number = 12
>> panic: page fault
>> cpuid = 4
>> Uptime: 1h2m48s
>> Physical memory: 2035 MB
>> Dumping 87 MB: 72 56 40 24 8
>>
>> Reading symbols from /boot/kernel/acpi.ko...Reading symbols from
>> /boot/kernel/acpi.ko.symbols...done.
>> done.
>> Loaded symbols for /boot/kernel/acpi.ko
>> #0 doadump () at pcpu.h:195
>> 195 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
>> (kgdb) up
>> #1 0xc0782597 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
>> 418 doadump();
>> (kgdb) up
>> #2 0xc0782859 in panic (fmt=Variable "fmt" is not available.
>> ) at /usr/src/sys/kern/kern_shutdown.c:572
>> 572 boot(bootopt);
>> (kgdb) up
>> #3 0xc0a8b39c in trap_fatal (frame=0xe7d6ba90, eva=392) at
>> /usr/src/sys/i386/i386/trap.c:899
>> 899 panic("%s", trap_msg[type]);
>> (kgdb) up
>> #4 0xc0a8b620 in trap_pfault (frame=0xe7d6ba90, usermode=0, eva=392) at
>> /usr/src/sys/i386/i386/trap.c:812
>> 812 trap_fatal(frame, eva);
>> (kgdb) up
>> #5 0xc0a8bfcc in trap (frame=0xe7d6ba90) at
>> /usr/src/sys/i386/i386/trap.c:490
>> 490 (void) trap_pfault(frame, FALSE, eva);
>> (kgdb) up
>> #6 0xc0a71bdb in calltrap () at /usr/src/sys/i386/i386/exception.s:139
>> 139 call trap
>> Current language: auto; currently asm
>> (kgdb) up
>> #7 0xc0775284 in _mtx_lock_sleep (m=0xc600d174, tid=3318745216, opts=0,
>> file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:339
>> 339 owner = (struct thread *)(v &
>> ~MTX_FLAGMASK);
>> Current language: auto; currently c
>> (kgdb) up
>> #8 0xc09a93d7 in vm_fault (map=0xc56b5570, vaddr=671809536,
>> fault_type=2 '\002', fault_flags=8) at /usr/src/sys/vm/vm_fault.c:293
>> 293 VM_OBJECT_LOCK(fs.first_object);
>> (kgdb) p/x *(fs.first_object)
>> $1 = {mtx = {lock_object = {lo_name = 0x0, lo_type = 0x0, lo_flags =
>> 0x0, lo_witness_data = {lod_list = {stqe_next = 0x0}, lod_witness =
>> 0x0}}, mtx_lock = 0x0, mtx_recurse = 0x0}, object_list = {tqe_next = 0x0,
>> tqe_prev = 0xc5f3a300}, shadow_head = {lh_first = 0x0}, shadow_list
>> = {le_next = 0xc5f3a2e8, le_prev = 0xc55c39d0}, memq = {tqh_first = 0x0,
>> tqh_last = 0xc600d1a0}, root = 0x0, size = 0x1, generation = 0x1,
>> ref_count = 0x1, shadow_count = 0x0, type = 0x0, flags = 0x2000,
>> pg_color = 0x0, paging_in_progress = 0x0, resident_page_count = 0x0,
>> backing_object = 0xc55c39b0, backing_object_offset = 0xf000,
>> pager_object_list = {
>> tqe_next = 0x0, tqe_prev = 0x0}, cache = 0x0, handle = 0x0, un_pager
>> = {vnp = {vnp_size = 0x0}, devp = {devp_pglist = {tqh_first = 0x0,
>> tqh_last = 0x0}}, swp = {swp_bcount = 0x0}}}
>> (kgdb)
>>
>>
>>
>>
>> # kgdb kernel.debug /var/crash/vmcore.6
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB. Type "show warranty" for details.
>> This GDB was configured as "i386-marcel-freebsd"...
>>
>> Unread portion of the kernel message buffer:
>> TPTE at 0xbfefeffc IS ZERO @ VA bfbff000
>> panic: bad pte
>> cpuid = 2
>> Uptime: 4h12m47s
>> Physical memory: 2035 MB
>> Dumping 121 MB: 106 90 74 58 42 26 10
>>
>> Reading symbols from /boot/kernel/acpi.ko...Reading symbols from
>> /boot/kernel/acpi.ko.symbols...done.
>> done.
>> Loaded symbols for /boot/kernel/acpi.ko
>> #0 doadump () at pcpu.h:195
>> 195 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
>> (kgdb) bt
>> #0 doadump () at pcpu.h:195
>> #1 0xc0782597 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
>> #2 0xc0782859 in panic (fmt=Variable "fmt" is not available.
>> ) at /usr/src/sys/kern/kern_shutdown.c:572
>> #3 0xc0a86f26 in pmap_remove_pages (pmap=0xc5527c54) at
>> /usr/src/sys/i386/i386/pmap.c:3093
>> #4 0xc09b294c in vmspace_exit (td=0xc5fb9aa0) at
>> /usr/src/sys/vm/vm_map.c:404
>> #5 0xc075e780 in exit1 (td=0xc5fb9aa0, rv=0) at
>> /usr/src/sys/kern/kern_exit.c:294
>> #6 0xc075fadd in sys_exit (td=Could not find the frame base for "sys_exit".
>> ) at /usr/src/sys/kern/kern_exit.c:98
>> #7 0xc0a8b975 in syscall (frame=0xe7e6cd38) at
>> /usr/src/sys/i386/i386/trap.c:1035
>> #8 0xc0a71c40 in Xint0x80_syscall () at
>> /usr/src/sys/i386/i386/exception.s:196
>> #9 0x00000033 in ?? ()
>> Previous frame inner to this frame (corrupt stack?)
>> (kgdb)
>>
>
> Both panics you shown are caused by zeroing kernel memory in what seems to
> be random locations.
> This might be caused by the bug, but I would suggest
> first making the thorough test of the hardware.
>

I had someone run a Dell Diags CD on the machine and it passed all tests.
Before that it core'd again; here is the backtrace from that one.

Is there any other way (maybe in FreeBSD) to test the hardware better?
And/or should I submit a bug report for this?

Thanks for all the help.

# kgdb kernel.debug /var/crash/vmcore.7
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...

Unread portion of the kernel message buffer:
TPTE at 0xbfca027c IS ZERO @ VA 2809f000
panic: bad pte
cpuid = 0
Uptime: 17h17m50s
Physical memory: 2035 MB
Dumping 222 MB: 207 191 175 159 143 127 111 95 79 63 47 31 15

Reading symbols from /boot/kernel/acpi.ko...Reading symbols from /boot/kernel/acpi.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/acpi.ko
#0 doadump () at pcpu.h:195
195 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) backtrace
#0 doadump () at pcpu.h:195
#1 0xc0782597 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2 0xc0782859 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:572
#3 0xc0a86f26 in pmap_remove_pages (pmap=0xc5bf142c) at /usr/src/sys/i386/i386/pmap.c:3093
#4 0xc09b294c in vmspace_exit (td=0xc5f2a440) at /usr/src/sys/vm/vm_map.c:404
#5 0xc075e780 in exit1 (td=0xc5f2a440, rv=0) at /usr/src/sys/kern/kern_exit.c:294
#6 0xc075fadd in sys_exit (td=Could not find the frame base for "sys_exit".
) at /usr/src/sys/kern/kern_exit.c:98
#7 0xc0a8b975 in syscall (frame=0xe7e9bd38) at /usr/src/sys/i386/i386/trap.c:1035
#8 0xc0a71c40 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:196
#9 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) q

--
-- [ Queldor ]
(Warning: This message may cause you to understand something)
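
P.S. A note on reading the vmcore.5 numbers, in case it helps anyone following the thread. The sketch below is only an illustration in Python, not anything taken from the kernel source: it converts the decimal eva=392 that kgdb printed into hex to show it is the same as the reported fault virtual address 0x188, and it mimics the masking on kern_mutex.c:339 to show that a zeroed mtx_lock (as in the p/x output) yields a NULL owner pointer. The MTX_FLAGMASK value and the helper name owner_from_lock are my assumptions; the real mask is defined in sys/mutex.h. If the owner really is NULL, a fault at 0x188 would be consistent with reading a field at that offset from a NULL struct thread, but I have not verified the offset.

```python
# Values copied from the vmcore.5 output in this thread.
EVA_DECIMAL = 392        # eva= argument shown in frames #3 and #4
FAULT_VA = 0x188         # "fault virtual address" from the trap message
MTX_LOCK = 0x0           # mtx_lock from p/x *(fs.first_object)

# Assumption: the low three bits of mtx_lock carry flags.
# Check sys/mutex.h on your system for the real definition.
MTX_FLAGMASK = 0x7


def owner_from_lock(v: int) -> int:
    """Hypothetical helper mirroring: owner = (struct thread *)(v & ~MTX_FLAGMASK)."""
    return v & ~MTX_FLAGMASK


# kgdb prints eva in decimal; the trap message prints it in hex.
print(hex(EVA_DECIMAL))
print(EVA_DECIMAL == FAULT_VA)

# A zeroed lock word masks down to a NULL owner pointer.
print(hex(owner_from_lock(MTX_LOCK)))
```

So the fault address and the zeroed mutex tell the same story as Kostik's reading: the object's memory was zeroed before vm_fault tried to lock it.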