Date: Wed, 9 Oct 1996 13:03:19 +0800 (WST)
From: Peter Wemm <peter@haywire.dialix.com>
To: FreeBSD-gnats-submit@freebsd.org
Subject: kern/1744: run queue or proc list smashed 4 times in 2 days
Message-ID: <199610090503.NAA02004@newton.dialix.com.au>
Resent-Message-ID: <199610090510.WAA06332@freefall.freebsd.org>
>Number: 1744
>Category: kern
>Synopsis: run queue or proc list smashed 4 times in 2 days
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: freebsd-bugs
>State: open
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Tue Oct 8 22:10:01 PDT 1996
>Last-Modified:
>Originator: Peter Wemm
>Organization:
What, here? :-)
>Release: FreeBSD 2.2-961004-SNAP i386
>Environment:
Vanilla i486 box, 16M, 2 IDE drives and one slow SCSI drive on an AHA1542CF.
FreeBSD newton.dialix.com.au 2.2-961004-SNAP FreeBSD 2.2-961004-SNAP #30: Tue Oct 8 06:34:52 WST 1996 peter@newton.dialix.com.au:/home2/src/sys/compile/NEWTON i386
>Description:
Normally, this is a quiet machine, but it's taken a nose-dive in stability
in the last two days.
It's been faulting like this:
WARNING: / was not properly dismounted.
Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x4
fault code = supervisor write, page not present
instruction pointer = 0x8:0xf01aa108
stack pointer = 0x10:0xefbffe0c
frame pointer = 0x10:0xefbffe30
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 0
current process = Idle
interrupt mask = net tty bio
panic: page fault
Syncing disks...
Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x10
fault code = supervisor read, page not present
instruction pointer = 0x8:0xf012925a
stack pointer = 0x10:0xefbffc88
frame pointer = 0x10:0xefbffc98
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 0
current process = Idle
interrupt mask = net tty bio
panic: page fault
dumping to dev 20001, offset 32768
dump 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
In this particular case, it died in cpu_switch about line 364:
/* XX update whichqs? */
btrl %ebx,%edi /* clear q full status */
leal _qs(,%ebx,8),%eax /* select q */
movl %eax,%esi
movl P_FORW(%eax),%ecx /* unlink from front of process q */
movl P_FORW(%ecx),%edx
movl %edx,P_FORW(%eax)
movl P_BACK(%ecx),%eax
movl %eax,P_BACK(%edx)
^^^^^^^^^^^^^^^^^^^^^^^^^
cmpl P_FORW(%ecx),%esi /* q empty */
je 3f
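
One reading that fits the first trap (fault virtual address 0x4, supervisor write) is that %edx held a NULL forward pointer when the marked store ran. The C sketch below mirrors the unlink sequence under that reading. It is an illustration only, not kernel code: `struct qlink`, `q_init`, `q_insert_tail` and `q_remove_head` are hypothetical names, and it assumes P_FORW and P_BACK are the first two pointer fields of struct proc (offsets 0 and 4 with 4-byte pointers on i386).

```c
#include <stddef.h>

/* Hypothetical stand-in for the two queue links at the start of struct proc. */
struct qlink {
    struct qlink *forw;   /* assumed to correspond to P_FORW (offset 0) */
    struct qlink *back;   /* assumed to correspond to P_BACK (offset 4 on i386) */
};

enum q_status { Q_OK = 0, Q_EMPTY = -1, Q_CORRUPT = -2 };

/* An empty circular queue points at its own head in both directions. */
static void q_init(struct qlink *head)
{
    head->forw = head;
    head->back = head;
}

/* Append p at the tail of the circular queue. */
static void q_insert_tail(struct qlink *head, struct qlink *p)
{
    p->forw = head;
    p->back = head->back;
    head->back->forw = p;
    head->back = p;
}

/*
 * Unlink the first entry, following the same steps as the assembly,
 * but checking the links first. The assembly has no such checks, so a
 * NULL forward pointer there becomes a store to address 0 + P_BACK.
 */
static enum q_status q_remove_head(struct qlink *head, struct qlink **out)
{
    struct qlink *p = head->forw;       /* movl P_FORW(%eax),%ecx */
    if (p == NULL)
        return Q_CORRUPT;
    if (p == head)
        return Q_EMPTY;

    struct qlink *next = p->forw;       /* movl P_FORW(%ecx),%edx */
    if (next == NULL)
        return Q_CORRUPT;               /* unchecked, this is the faulting store */

    head->forw = next;                  /* movl %edx,P_FORW(%eax) */
    next->back = p->back;               /* movl P_BACK(%ecx),%eax
                                           movl %eax,P_BACK(%edx) */
    *out = p;
    return Q_OK;
}
```

With these assumptions, a smashed entry whose forward link is NULL would make the unchecked store land at `offsetof(struct qlink, back)`, which is 4 with 4-byte pointers (8 on a typical 64-bit host). This does not say what cleared the link in the first place.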
The backtrace looks like this:
[.. rest of trap processing ..]
#13 0xf01a2ce1 in calltrap ()
#14 0xf010e6bd in tsleep ()
#15 0xf0120327 in sbwait ()
#16 0xf011f0e3 in soreceive ()
#17 0xf0121b90 in recvit ()
#18 0xf0121dff in recvfrom ()
#19 0xf01ab0d3 in syscall ()
#20 0xf01a2d35 in Xsyscall ()
The process that was running was one of these two:
UID   PID PPID CPU PRI NI VSZ RSS WCHAN  STAT TT    TIME COMMAND
  1   176    4   0   2  0 208   0 sbwait SWs  ?? 0:00.00 (rwhod)
  0 27386    4   1   2  0 148   0 sbwait Ss   ?? 0:00.00 (comsat)
This particular kernel is not running any modified code.
The other three dumps were quite similar, but I didn't have the disk space
at the time to save them for analysis.
>How-To-Repeat:
I don't think this box is doing anything unusual, apart from cvsup which
makes it sweat a fair bit. (a 6.5MB process on a 16M machine that's doing
other things is hard work :-)
>Fix:
>Audit-Trail:
>Unformatted:
