Date:      Sun, 1 Oct 2000 03:07:16 -0400
From:      Andrew J Caines <A.J.Caines@altavista.net>
To:        stable@FreeBSD.ORG
Subject:   Re: More panics (different hardware)
Message-ID:  <20001001030716.A384@hal9000.bsdonline.org>
In-Reply-To: <92759.970208180@winston.osd.bsdi.com>; from jkh@winston.osd.bsdi.com on Thu, Sep 28, 2000 at 11:16:20PM -0700
References:  <A.J.Caines@altavista.net> <92759.970208180@winston.osd.bsdi.com>

Jordan and list,

> If you could get a kernel crash dump, especially with a kernel with
> debugging symbols, that would help enormously!  Thanks.

For better or worse, my box just obliged with a crash only 3h41m28s after
booting my "DEBUG" kernel.

I have found at least one interesting factor in the crashes. Searching my
logs for timestamps associated with the crashes, I see...

hal9000:/root# awk '/The FreeBSD Project/{print $1" "$2"\t"$3}' /var/log/messages{.1,.0,}
Sep 9   23:34:24
Sep 10  16:33:17
Sep 10  16:51:09
Sep 11  02:47:41
Sep 20  20:12:51
Sep 20  20:17:06
Sep 21  02:02:07
Sep 22  02:02:16
Sep 22  19:51:15
Sep 23  02:11:15
Sep 24  02:11:53
Sep 24  02:19:24
Sep 25  02:10:45
Sep 26  02:10:56
Sep 26  18:52:44
Sep 27  02:11:00
Sep 27  23:45:23
Sep 28  02:10:37
Sep 29  02:10:33
Sep 30  02:10:32
Sep 30  22:26:08
Oct 1   02:10:50

You'll notice the remarkable number of crashes at or around 02:10. The
only thing which runs regularly around then is "periodic daily", which
starts at 01:59. I was sitting here while the disks rumbled away and after
a while the system dived.
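To put a number on that clustering, here is a minimal sketch that tallies the timestamps above by hour of day. The data is copied verbatim from the awk output; the function name `boots_per_hour` is just an illustrative choice. Note that each line marks a boot (the kernel banner in the log), so it only approximates the time of the preceding crash, and not every reboot need have been a crash.

```python
from collections import Counter

# Boot timestamps copied from the awk output above (month, day, HH:MM:SS).
BOOTS = """\
Sep 9   23:34:24
Sep 10  16:33:17
Sep 10  16:51:09
Sep 11  02:47:41
Sep 20  20:12:51
Sep 20  20:17:06
Sep 21  02:02:07
Sep 22  02:02:16
Sep 22  19:51:15
Sep 23  02:11:15
Sep 24  02:11:53
Sep 24  02:19:24
Sep 25  02:10:45
Sep 26  02:10:56
Sep 26  18:52:44
Sep 27  02:11:00
Sep 27  23:45:23
Sep 28  02:10:37
Sep 29  02:10:33
Sep 30  02:10:32
Sep 30  22:26:08
Oct 1   02:10:50
"""


def boots_per_hour(text):
    """Count boot records per hour of day (the HH part of HH:MM:SS)."""
    hours = Counter()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        hours[fields[2].split(":")[0]] += 1
    return hours


if __name__ == "__main__":
    counts = boots_per_hour(BOOTS)
    # One row per hour that appears in the log, with a crude bar chart.
    for hour, n in sorted(counts.items()):
        print(f"{hour}:00  {n:2d}  {'#' * n}")
```

The 02:00 hour accounts for 13 of the 22 boots listed; no other hour has more than two.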

While I would usually think this is a hardware issue - heating from the
overactive disks upsetting the memory or whatever - this system builds
world at least weekly and has never crashed during that time. The build
uses all three disks and, of course, hits them pretty hard. Sometimes I
build a few ports at the same time and there has never been a complaint.


Here's what I got from the core.

Script started on Sun Oct  1 02:16:48 2000
hal9000:/root# cd /usr/obj/home/src/sys/DEBUG
hal9000:DEBUG# gdb -k kernel.debug /var/crash/vmcore.0
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...
IdlePTD 3149824
initial pcb at 28b860
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address	= 0x6c
fault code		= supervisor read, page not present
instruction pointer	= 0x8:0xc0175772
stack pointer	        = 0x10:0xc7676db4
frame pointer	        = 0x10:0xc7676dd4
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, def32 1, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 973 (tee)
interrupt mask		= none
trap number		= 12
panic: page fault

syncing disks... 182 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 
giving up on 5 buffers
Uptime: 3h41m28s

dumping to dev #ad/0x20001, offset 327680
dump ata0: resetting devices .. done
96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 
---
#0  boot (howto=256) at /home/src/sys/kern/kern_shutdown.c:302
302			dumppcb.pcb_cr3 = rcr3();
(kgdb) symbol-file kernel.debug
Load new symbol table from "kernel.debug"? (y or n) y

Reading symbols from kernel.debug...done.
(kgdb) exec-file /var/crash/kernel.0
(kgdb) core-file /var/crash/vmcore.0
IdlePTD 3149824
initial pcb at 28b860
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address	= 0x6c
fault code		= supervisor read, page not present
instruction pointer	= 0x8:0xc0175772
stack pointer	        = 0x10:0xc7676db4
frame pointer	        = 0x10:0xc7676dd4
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, def32 1, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 973 (tee)
interrupt mask		= none
trap number		= 12
panic: page fault

syncing disks... 182 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 
giving up on 5 buffers
Uptime: 3h41m28s

dumping to dev #ad/0x20001, offset 327680
dump ata0: resetting devices .. done
96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 
---
#0  boot (howto=256) at /home/src/sys/kern/kern_shutdown.c:302
302			dumppcb.pcb_cr3 = rcr3();
(kgdb) bt
#0  boot (howto=256) at /home/src/sys/kern/kern_shutdown.c:302
#1  0xc01419ec in poweroff_wait (junk=0xc02410cf, howto=-949627008)
    at /home/src/sys/kern/kern_shutdown.c:552
#2  0xc020aef2 in trap_fatal (frame=0xc7676d74, eva=108)
    at /home/src/sys/i386/i386/trap.c:951
#3  0xc020abb9 in trap_pfault (frame=0xc7676d74, usermode=0, eva=108)
    at /home/src/sys/i386/i386/trap.c:844
#4  0xc020a78f in trap (frame={tf_fs = -949288944, tf_es = -949551088, 
      tf_ds = -949551088, tf_edi = -949522828, tf_esi = -949522944, 
      tf_ebp = -949522988, tf_isp = -949523040, tf_ebx = -950285472, 
      tf_edx = 0, tf_ecx = 27, tf_eax = -949523008, tf_trapno = 12, 
      tf_err = 0, tf_eip = -1072212110, tf_cs = 8, tf_eflags = 66199, 
      tf_esp = -949523008, tf_ss = 0}) at /home/src/sys/i386/i386/trap.c:443
#5  0xc0175772 in fdesc_setattr (ap=0xc7676e00) at vnode_if.h:305
#6  0xc0173d08 in vn_open (ndp=0xc7676ed0, fmode=1026, cmode=416)
    at vnode_if.h:305
#7  0xc016fc1b in open (p=0xc765d780, uap=0xc7676f80)
    at /home/src/sys/kern/vfs_syscalls.c:989
#8  0xc020b10e in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, 
      tf_edi = 2, tf_esi = 0, tf_ebp = -1077936848, tf_isp = -949522476, 
      tf_ebx = -1077936760, tf_edx = 521, tf_ecx = 4, tf_eax = 5, 
      tf_trapno = 12, tf_err = 2, tf_eip = 671963196, tf_cs = 31, 
      tf_eflags = 598, tf_esp = -1077936908, tf_ss = 47})
---Type <return> to continue, or q <return> to quit---
    at /home/src/sys/i386/i386/trap.c:1150
#9  0xc01ffe25 in Xint0x80_syscall ()
#10 0x804860f in ?? ()
(kgdb) quit
hal9000:DEBUG# exit

Script done on Sun Oct  1 02:19:15 2000
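For anyone cross-checking the trap frame in frame #4 against the panic message: gdb prints the saved registers as signed decimals, while the panic text uses unsigned hex. A small sketch of the conversion, using only values that appear in the transcript above (the helper name `to_u32_hex` is made up for illustration):

```python
def to_u32_hex(value):
    """Render a signed 32-bit integer as unsigned hex, as the panic message does."""
    return f"0x{value & 0xFFFFFFFF:08x}"


# Values copied from frame #4 of the backtrace above.
TRAP_FRAME = {"tf_eip": -1072212110, "tf_ebp": -949522988}
# Faulting address as passed to trap_pfault()/trap_fatal().
EVA = 108

if __name__ == "__main__":
    for name, value in TRAP_FRAME.items():
        print(name, to_u32_hex(value))
    print("eva", hex(EVA))
```

tf_eip comes out as 0xc0175772 (the reported instruction pointer, and the address of frame #5), tf_ebp as 0xc7676dd4 (the reported frame pointer), and eva as 0x6c (the fault virtual address). A fault address that small is often what a member access through a null structure pointer looks like, though the dump alone does not prove that.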


Any insights, suggestions or requests for more info are welcome.


-Andrew-
-- 
 _______________________________________________________________________
| -Andrew J. Caines-   Unix Systems Engineer   A.J.Caines@altavista.net |

