Date:      Mon, 24 Sep 2001 12:14:04 -0700 (PDT)
From:      Matt Dillon <dillon@earth.backplane.com>
To:        hackers@freebsd.org
Cc:        Alfred Perlstein <bright@mu.org>, Bruce Evans <bde@zeta.org.au>, Poul-Henning Kamp <phk@critter.freebsd.dk>, Julian Elischer <julian@elischer.org>
Subject:   VM Corruption - stumped, anyone have any ideas?
Message-ID:  <200109241914.f8OJE4l95477@earth.backplane.com>

    A number of people have been seeing these on STABLE:

	panic: vm_page_remove(): page not found in hash

    They appear to be reproducible after a random period of time on
    certain machines.  I tracked the problem down to corruption in 
    the vm_page_array but I cannot figure out what the cause of the
    corruption is.

    I would appreciate it if people could look at the following structural
    and hex dump of a corrupted vm_page_t.  Does anyone recognize the subsystem
    the data is coming from?

    This is stumping me, and it appears to be rather serious.  I can't 
    reproduce it myself.  The only other hint I have is from Mike Tancsa's
    messing around... when he bumps up the number of Apache children forked
    the problem appears to be easier to trigger.  This also occurs on
    some Yahoo boxes.  I don't think it's bad memory.

						-Matt

$8 = 58630
(kgdb) print vm_page_buckets[$8]
$9 = (struct vm_page *) 0xc08428cc
(kgdb) print *vm_page_buckets[$8]
$10 = {pageq = {tqe_next = 0xd715000, tqe_prev = 0x1}, hnext = 0xc0e26a34, 
  listq = {tqe_next = 0xc0e26a3c, tqe_prev = 0xb00000}, object = 0x10015, 
  pindex = 0, phys_addr = 255, md = {pv_list_count = -1066105816, pv_list = {
      tqh_first = 0xc09616b0, tqh_last = 0x0}}, queue = 0, flags = 0,
  pc = 5820, wire_count = 49302, hold_count = 9152, act_count = 151 '\227',
  busy = 214 '\xd6', valid = 1 '\001', dirty = 0 '\000'}
 
    tqe_prev is garbage.  phys_addr is garbage.  It's almost all garbage.  
    The question is: how did it become garbage?  The vm_page_t is a valid
    page in the preallocated vm_page_array[].  The VM system is physically
    incapable of corrupting a vm_page_t this badly.

(kgdb) print vm_page_array_size
$16 = 130743
(kgdb) print m
$17 = 0xc0842acc
(kgdb) print m - vm_page_array
$18 = 55069
(kgdb) print &vm_page_array[55069]
$19 = (struct vm_page *) 0xc0842acc
(kgdb)
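
    The kgdb session above checks that m is an element of vm_page_array
    by subtracting the array base and then re-indexing.  The same test can
    be sketched in C.  This is only an illustration, not kernel code: the
    helper name array_index_of() is hypothetical, and the element size is
    passed in as a parameter rather than taken from the kernel headers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 and stores the element index in *index
 * if addr is the start address of an element in an array of nelems
 * elements, each elem_size bytes, beginning at base.  Returns 0 if addr
 * lies outside the array or does not fall on an element boundary.
 */
static int
array_index_of(uintptr_t base, size_t nelems, size_t elem_size,
    uintptr_t addr, size_t *index)
{
	uintptr_t off;
	size_t i;

	if (elem_size == 0 || addr < base)
		return (0);

	off = addr - base;
	if (off % elem_size != 0)
		return (0);		/* points into the middle of an element */

	i = off / elem_size;
	if (i >= nelems)
		return (0);		/* past the end of the array */

	*index = i;
	return (1);
}
```

    A pointer that passes this test is at least a plausible array element;
    one that fails it is either outside the array or not on an element
    boundary.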

0xc08428cc:     0x0d715000      0x00000001      0xc0e26a34      0xc0e26a3c
0xc08428dc:     0x00b00000      0x00010015      0x00000000      0x000000ff
0xc08428ec:     0xc0748428      0xc09616b0      0x00000000      0x00000000
0xc08428fc:     0xc09616bc      0xd69723c0      0x00000001      0x0d716000
0xc084290c:     0x00000000      0x00000000      0xc0842910      0x00800022
0xc084291c:     0x00000016      0x00050000      0x000000ff      0xc0909564
0xc084292c:     0xc0921aec      0x00000000      0x00000000      0xd696aaf8
0xc084293c:     0xd696aae0      0x00000000      0x0d717000      0x00000000
0xc084294c:     0x00000000      0xc084294c      0x00800022      0x00000017
0xc084295c:     0x00050000      0x000000ff      0xc08d3e64      0xc0b6fde4
0xc084296c:     0x00000000      0xc0998430      0xd706ea98      0x00000000
0xc084297c:     0x00000010      0x0d718000      0x00000000      0x00000000
0xc084298c:     0xc0842988      0x00c00019      0x00000018      0x00050000
0xc084299c:     0x00000000      0xc09d9f5c      0xc0691764      0x00000000
0xc08429ac:     0xc083244c      0xc0848fa0      0xc02a6740      0x000001db

						-Matt

