Date:      Thu, 25 Sep 1997 22:36:41 +0200 (CEST)
From:      Tor Egge <Tor.Egge@idi.ntnu.no>
To:        FreeBSD-gnats-submit@FreeBSD.ORG
Cc:        Tor.Egge@idi.ntnu.no
Subject:   kern/4630: buffer_map might become corrupted
Message-ID:  <199709252036.WAA19179@skarven.itea.ntnu.no>
Resent-Message-ID: <199709252040.NAA11685@hub.freebsd.org>


>Number:         4630
>Category:       kern
>Synopsis:       buffer_map might become corrupted
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Sep 25 13:40:01 PDT 1997
>Last-Modified:
>Originator:     Tor Egge
>Organization:
Norwegian University of Science and Technology, Trondheim, Norway
>Release:        FreeBSD 3.0-CURRENT i386
>Environment:

FreeBSD skarven.itea.ntnu.no 3.0-CURRENT FreeBSD 3.0-CURRENT #6: Thu Aug 21 21:05:04 CEST 1997     root@skarven.itea.ntnu.no:/usr/src/sys/compile/SKARVEN_SMP  i386

(Using sources from Jun 21 1997)

>Description:


According to the memory dump, buffer_map maintains the kva range 
0xe7021000 to 0xe805d000 in this machine. 

In buffer_map, this bogus sequence of vm_map_entries is present:

addr       prev       next       start      end      inh/eflags offset object
	
0xe314ff80 0xe30eb000 0xe3150a80 0xe7288000 0xe728c000 1 0x10 0x267000 0x0
0xe3150a80 0xe313fac0 0xe3158ec0 0x7ebea000 0x7ecb0000 0 0x0 0x90000 0xe377a180
0xe3158ec0 0xe314f500 0xe3152a40 0x7eb5a000 0x7ebe2000 0 0x0 0x00000 0xe377a180
0xe3152a40 0xe3158ec0 0xe314d940 0xe72a6000 0xe72ce000 1 0x10 0x285000 0x0

mappool is 0xe3150a80, indicating that supposedly free entries are
actually in use.

Traversing buffer_map via the prev pointers reaches the map entry
at 0xe3152a40.

In the vm_map of the currently running process, this related sequence of
vm_map entries is present:

addr       prev       next       start      end      inh/eflags offset object

0xe31082c0 0xe3cff580 0xe314f500 0x7e8ae000 0x7ea04000 0 0x0    0x0  0xe388f580
0xe314f500 0xe31082c0 0xe313fac0 0x7ea04000 0x7eb5a000 0 0x0    0x0  0xe4001780
0xe313fac0 0xe314f500 0xe30ffa80 0x7eb5a000 0x7ecb0000 0 0x0    0x0  0xe377a180
0xe30ffa80 0xe313fac0 0xe4718d00 0x7ecb0000 0x7ee06000 0 0x0    0x0  0xe41c3700

The running process had 376 files of identical size (1396827 bytes each)
mapped into memory. 

The preceding system calls were (reconstructed from the program source and
the map entries):

	msync(0x7EA84000, 32768, MS_ASYNC);
	madvise(0x7EA84000, 32768, MADV_WILLNEED);
	madvise(0x7EA8C000, 32768, MADV_WILLNEED);

	msync(0x7EBDA000, 32768, MS_ASYNC);
	madvise(0x7EBDA000, 32768, MADV_DONTNEED);
	madvise(0x7EBE2000, 32768, MADV_WILLNEED);

	msync(0x7ED30000, 32768, MS_ASYNC);    <--- Current system call

Correlating this with the corruption in buffer_map, it looks like
something went very wrong during the last madvise call.

If a context switch occurred in vm_map_entry_create between

		entry = mappool;

and 
		if (entry) {
			mappool = entry->next;

and mappool was 0xe3150a80 (e.g. during vm_map_madvise calling
vm_map_clip_start) and a different process allocated an entry that was
put on the buffer_map in the meantime, then mappool would be set to
point to a used entry when the process continued. The returned entry
would also point to a used entry. The call to vm_map_clip_end would
grab another entry from the buffer_map (instead of the real free pool).
Then the call to vm_map_simplify_entry would put the entries back
to the free pool (headed by mappool), causing the situation we see
at memory dump time.

----
Fatal trap 12: page fault while in kernel mode
cpunumber = 1
fault virtual address	= 0xdfdfaf88
fault code		= supervisor read, page not present
instruction pointer	= 0x8:0xe01c4259
stack pointer	        = 0x10:0xe94b8a64
frame pointer	        = 0x10:0xe94b8a74
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, def32 1, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 13976 (domainhelp)
interrupt mask		= 
trap number		= 12
panic: page fault
 cpu#1
boot() called on cpu#1
oops, I'm on cpu#1, I need to be on cpu#0!
[...]
oops, I'm on cpu#1, I need to be on cpu#0!
timeout waiting for cpu #0!

syncing disks... 

Fatal trap 12: page fault while in kernel mode
cpunumber = 1
fault virtual address	= 0xdfdfaf88
fault code		= supervisor read, page not present
instruction pointer	= 0x8:0xe01c4259
stack pointer	        = 0x10:0xe94b87a8
frame pointer	        = 0x10:0xe94b87b8
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, def32 1, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 13976 (domainhelp)
interrupt mask		= 
trap number		= 12
panic: page fault
 cpu#1
boot() called on cpu#1
oops, I'm on cpu#1, I need to be on cpu#0!
[...]
oops, I'm on cpu#1, I need to be on cpu#0!
timeout waiting for cpu #0!

dumping to dev 401, offset 365144
[...]
----
Current directory is /var/crash/
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.16 (i386-unknown-freebsd), 
Copyright 1996 Free Software Foundation, Inc...
IdlePTD 22b000
current pcb at 20e21c
panic: page fault
#0  boot (howto=260) at ../../kern/kern_shutdown.c:266
(kgdb) where
#0  boot (howto=260) at ../../kern/kern_shutdown.c:266
#1  0xe0117676 in panic (fmt=0xe01c78cf "page fault")
    at ../../kern/kern_shutdown.c:393
#2  0xe01c8641 in trap_fatal (frame=0xe94b876c) at ../../i386/i386/trap.c:777
#3  0xe01c8074 in trap_pfault (frame=0xe94b876c, usermode=0)
    at ../../i386/i386/trap.c:684
#4  0xe01c7c1b in trap (frame={tf_es = 16, tf_ds = -1073151984, 
      tf_edi = 2126389248, tf_esi = 0, tf_ebp = -380926024, 
      tf_isp = -380926060, tf_ebx = -527800788, tf_edx = -419670684, 
      tf_ecx = 3, tf_eax = 2076552, tf_trapno = 12, tf_err = 0, 
      tf_eip = -535018919, tf_cs = 8, tf_eflags = 66054, tf_esp = -532979104, 
      tf_ss = 512}) at ../../i386/i386/trap.c:319
#5  0xe01c4259 in pmap_qenter (va=2126389248, m=0xe6fc5564, count=2)
    at ../../i386/i386/pmap.c:658
#6  0xe0131c58 in allocbuf (bp=0xe6fc54cc, size=8192)
    at ../../kern/vfs_bio.c:1696
#7  0xe013168a in getblk (vp=0xe3071d80, blkno=1638592, size=8192, slpflag=0, 
    slptimeo=0) at ../../kern/vfs_bio.c:1420
#8  0xe012fe91 in bread (vp=0xe3071d80, blkno=1638592, size=8192, cred=0x0, 
    bpp=0xe94b88b4) at ../../kern/vfs_bio.c:275
#9  0xe01a101c in ffs_update (ap=0xe94b88d8) at ../../ufs/ffs/ffs_inode.c:131
#10 0xe01a4b50 in ffs_fsync (ap=0xe94b8928) at vnode_if.h:1127
#11 0xe01a3473 in ffs_sync (mp=0xe3074600, waitfor=2, cred=0xe25a1500, 
    p=0xe021c670) at vnode_if.h:479
#12 0xe013769f in sync (p=0xe021c670, uap=0x0, retval=0x0)
    at ../../kern/vfs_syscalls.c:480
#13 0xe0117251 in boot (howto=256) at ../../kern/kern_shutdown.c:203
#14 0xe0117676 in panic (fmt=0xe01c78cf "page fault")
    at ../../kern/kern_shutdown.c:393
#15 0xe01c8641 in trap_fatal (frame=0xe94b8a28) at ../../i386/i386/trap.c:777
#16 0xe01c8074 in trap_pfault (frame=0xe94b8a28, usermode=0)
    at ../../i386/i386/trap.c:684
#17 0xe01c7c1b in trap (frame={tf_es = 905969680, tf_ds = 16, 
      tf_edi = 2126389248, tf_esi = 0, tf_ebp = -380925324, 
      tf_isp = -380925360, tf_ebx = -528177268, tf_edx = -419686744, 
      tf_ecx = 3, tf_eax = 2076552, tf_trapno = 12, tf_err = 0, 
      tf_eip = -535018919, tf_cs = 8, tf_eflags = 66054, tf_esp = -527678016, 
      tf_ss = 4096}) at ../../i386/i386/trap.c:319
#18 0xe01c4259 in pmap_qenter (va=2126389248, m=0xe6fc16a8, count=2)
    at ../../i386/i386/pmap.c:658
#19 0xe0131c58 in allocbuf (bp=0xe6fc1610, size=8192)
    at ../../kern/vfs_bio.c:1696
#20 0xe013168a in getblk (vp=0xe37d5c00, blkno=-12, size=8192, slpflag=0, 
    slptimeo=0) at ../../kern/vfs_bio.c:1420
#21 0xe012fe91 in bread (vp=0xe37d5c00, blkno=-12, size=8192, cred=0x0, 
    bpp=0xe94b8bac) at ../../kern/vfs_bio.c:275
#22 0xe01a0bed in ffs_balloc (ip=0xe31a2d00, lbn=66, size=8192, 
    cred=0xe3475c80, bpp=0xe94b8c9c, flags=1) at ../../ufs/ffs/ffs_balloc.c:200
#23 0xe01a41b1 in ffs_write (ap=0xe94b8cc8)
    at ../../ufs/ufs/ufs_readwrite.c:259
#24 0xe01b9d40 in vnode_pager_leaf_putpages (object=0xe41c3700, m=0xe94b8dc0, 
    count=3, sync=0, rtvals=0xe94b8d88) at vnode_if.h:307
#25 0xe01b9afa in vnode_pager_putpages (object=0xe41c3700, m=0xe94b8dc0, 
    count=3, sync=0, rtvals=0xe94b8d88) at ../../vm/vnode_pager.c:878
#26 0xe01b809e in vm_pager_put_pages (object=0xe41c3700, m=0xe94b8dc0, 
    count=3, sync=0, rtvals=0xe94b8d88) at ../../vm/vm_pager.c:199
#27 0xe01b6f72 in vm_pageout_flush (mc=0xe94b8dc0, count=3, sync=0)
    at ../../vm/vm_pageout.c:345
#28 0xe01b43fd in vm_object_page_clean (object=0xe41c3700, start=128, end=136, 
    syncio=0, lockflag=1) at ../../vm/vm_object.c:588
#29 0xe01b1cfb in vm_map_clean (map=0xe3d1a000, start=2127757312, 
    end=2127790080, syncio=0, invalidate=0) at ../../vm/vm_map.c:1801
#30 0xe01b31ab in msync (p=0xe3566800, uap=0xe94b8f94, retval=0xe94b8f84)
    at ../../vm/vm_mmap.c:379
#31 0xe01c892f in syscall (frame={tf_es = 39, tf_ds = 39, tf_edi = 524288, 
      tf_esi = 32768, tf_ebp = -541076456, tf_isp = -380923932, 
      tf_ebx = 573504, tf_edx = 32768, tf_ecx = 839771, tf_eax = 65, 
      tf_trapno = 7, tf_err = 7, tf_eip = 13893, tf_cs = 31, tf_eflags = 518, 
      tf_esp = -541076484, tf_ss = 39}) at ../../i386/i386/trap.c:930
#32 0x3645 in ?? ()
#33 0x20c0 in ?? ()
#34 0x2863 in ?? ()
#35 0x3038 in ?? ()
#36 0x1095 in ?? ()
(kgdb) up 19
#19 0xe0131c58 in allocbuf (bp=0xe6fc1610, size=8192)
    at ../../kern/vfs_bio.c:1696
(kgdb) print/x *bp
$1 = {b_hash = {le_next = 0xe7018488, le_prev = 0xe021241c}, b_vnbufs = {
    le_next = 0x0, le_prev = 0xe37d5c30}, b_freelist = {tqe_next = 0xe6fc54cc, 
    tqe_prev = 0xe0212998}, b_act = {tqe_next = 0xe6ffa5dc, 
    tqe_prev = 0xe259ff74}, b_proc = 0x0, b_flags = 0x20000030, 
  b_qindex = 0x0, b_usecount = 0x4, b_error = 0x0, b_bufsize = 0x0, 
  b_bcount = 0x0, b_resid = 0x0, b_dev = 0xffffffff, b_un = {
    b_addr = 0x7ebe2000}, b_kvabase = 0x7ebe2000, b_kvasize = 0x2000, 
  b_lblkno = 0xfffffff4, b_blkno = 0xfffffff4, b_iodone = 0x0, 
  b_iodone_chain = 0x0, b_vp = 0xe37d5c00, b_dirtyoff = 0x0, b_dirtyend = 0x0, 
  b_rcred = 0x0, b_wcred = 0x0, b_validoff = 0x0, b_validend = 0x0, 
  b_pblkno = 0x6373d1, b_saveaddr = 0x0, b_savekva = 0x0, b_driver1 = 0x0, 
  b_driver2 = 0x0, b_spc = 0x0, b_cluster = {cluster_head = {
      tqh_first = 0xe6fc5ff8, tqh_last = 0xe700da14}, cluster_entry = {
      tqe_next = 0xe6fc5ff8, tqe_prev = 0xe700da14}}, b_pages = {0xe084a78c, 
    0xe08c45c0, 0x0 <repeats 14 times>}, b_npages = 0x2}
(kgdb) 
----


>How-To-Repeat:

	Have a kernel with bugs (e.g. calling MALLOC with M_WAITOK during
	an interrupt).
	
>Fix:

The unexpected context switch is probably due to tsleep being called
during an interrupt.

A machine-dependent `is_in_interrupt()' macro can be implemented
as
	#define is_in_interrupt() (intr_nesting_level >= 1)

on a uniprocessor machine. Then tsleep can contain

	if (is_in_interrupt()) 
		panic("tsleep called in interrupt");

For a multiprocessor machine, a per-cpu version of intr_nesting_level might
be needed in order to implement is_in_interrupt().
>Audit-Trail:
>Unformatted:


