Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 24 Oct 2005 23:20:06 -0700
From:      "Frank Mayhar" <frank@exit.com>
To:        "FreeBSD gnats submit" <FreeBSD-gnats-submit@FreeBSD.org>
Subject:   kern/87967: Page fault panic on NFS client.
Message-ID:  <1130221206.0@realtime.exit.com>
Resent-Message-ID: <200510250630.j9P6UKsF035520@freefall.freebsd.org>

next in thread | raw e-mail | index | archive | help

>Number:         87967
>Category:       kern
>Synopsis:       Page fault panic on NFS client.
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Oct 25 06:30:20 GMT 2005
>Closed-Date:
>Last-Modified:
>Originator:     Frank Mayhar
>Release:        FreeBSD 6.0-RC1 i386
>Organization:
Exit Consulting 
>Environment:


System: FreeBSD 6.0-RC1 #1: Sun Oct 23 09:38:36 PDT 2005
    frank@jill.exit.com:/usr/obj/usr/src/sys/REALTIME

Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 6.0-RC1 #1: Sun Oct 23 09:38:36 PDT 2005
    frank@jill.exit.com:/usr/obj/usr/src/sys/REALTIME
ACPI APIC Table: <PTLTD          APIC  >
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) MP 1900+ (1600.07-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x662  Stepping = 2
  Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
  AMD Features=0xc0480800<SYSCALL,MP,MMX+,3DNow+,3DNow>
real memory  = 2146959360 (2047 MB)
avail memory = 2091614208 (1994 MB)
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  1
 cpu1 (AP): APIC ID:  0
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 <Version 1.1> irqs 0-23 on motherboard
bktr_mem: memory holder loaded
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <PTLTD   RSDT> on motherboard
acpi0: Power Button (fixed)
acpi0: Sleep Button (fixed)
pci_link0: <ACPI PCI Link LNKA> irq 10 on acpi0
pci_link1: <ACPI PCI Link LNKB> irq 5 on acpi0
pci_link2: <ACPI PCI Link LNKC> irq 11 on acpi0
pci_link3: <ACPI PCI Link LNKD> irq 10 on acpi0
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x8008-0x800b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff,0x8000-0x807f,0x8080-0x80ff iomem 0xd8000-0xdbfff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
nvidia0: <GeForce 6800 GT> mem 0xd2000000-0xd2ffffff,0xe0000000-0xefffffff,0xd1000000-0xd1ffffff irq 18 at device 5.0 on pci1
nvidia0: [GIANT-LOCKED]
isab0: <PCI-ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <AMD 768 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xf000-0xf00f at device 7.1 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
pci0: <bridge> at device 7.3 (no driver attached)
ahc0: <Adaptec 29160 Ultra160 SCSI adapter> port 0x1000-0x10ff mem 0xd0060000-0xd0060fff irq 20 at device 8.0 on pci0
ahc0: [GIANT-LOCKED]
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
em0: <Intel(R) PRO/1000 Network Connection, Version - 2.1.7> port 0x1400-0x143f mem 0xd0040000-0xd005ffff,0xd0000000-0xd003ffff irq 21 at device 9.0 on pci0
em0: Ethernet address: 00:04:23:a6:1e:38
em0:  Speed:N/A  Duplex:N/A
pcib2: <ACPI PCI-PCI bridge> at device 16.0 on pci0
pci2: <ACPI PCI bus> on pcib2
ohci0: <OHCI (generic) USB controller> mem 0xd3004000-0xd3004fff irq 19 at device 0.0 on pci2
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: <OHCI (generic) USB controller> on ohci0
usb0: USB revision 1.0
uhub0: AMD OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 4 ports with 4 removable, self powered
pcm0: <Creative Audigy (EMU10K2)> port 0x2080-0x209f irq 17 at device 5.0 on pci2
pcm0: <SigmaTel STAC9721/23 AC97 Codec>
fwohci0: <1394 Open Host Controller Interface> mem 0xd3007000-0xd30077ff,0xd3000000-0xd3003fff irq 18 at device 5.2 on pci2
fwohci0: OHCI version 1.10 (ROM=0)
fwohci0: No. of Isochronous channels is 4.
fwohci0: EUI64 00:02:3c:00:50:00:60:21
fwohci0: Phy 1394a available S400, 2 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0: <IEEE1394(FireWire) bus> on fwohci0
fwohci0: Initiate bus reset
fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
firewire0: bus manager 0 (me)
bktr0: <BrookTree 878> mem 0xf0000000-0xf0000fff irq 18 at device 6.0 on pci2
bktr0: [GIANT-LOCKED]
bktr0: Hauppauge Model 37281 B223
bktr0: Detected a MSP3430G-A1 at 0x80
bktr0: Detected a DPL3418A-A2 at 0x84
bktr0: Hauppauge WinCast/TV, Philips FR1236 NTSC FM tuner, msp3400c stereo, dpl3518a dolby, remote control.
pci2: <multimedia> at device 6.1 (no driver attached)
ohci1: <NEC uPD 9210 USB controller> mem 0xd3005000-0xd3005fff irq 19 at device 7.0 on pci2
ohci1: [GIANT-LOCKED]
usb1: OHCI version 1.0
usb1: <NEC uPD 9210 USB controller> on ohci1
usb1: USB revision 1.0
uhub1: NEC OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 3 ports with 3 removable, self powered
ohci2: <NEC uPD 9210 USB controller> mem 0xd3006000-0xd3006fff irq 16 at device 7.1 on pci2
ohci2: [GIANT-LOCKED]
usb2: OHCI version 1.0
usb2: <NEC uPD 9210 USB controller> on ohci2
usb2: USB revision 1.0
uhub2: NEC OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
ehci0: <NEC uPD 720100 USB 2.0 controller> mem 0xd3007800-0xd30078ff irq 17 at device 7.2 on pci2
ehci0: [GIANT-LOCKED]
usb3: EHCI version 1.0
usb3: companion controllers, 3 ports each: usb1 usb2
usb3: <NEC uPD 720100 USB 2.0 controller> on ehci0
usb3: USB revision 2.0
uhub3: NEC EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub3: 5 ports with 5 removable, self powered
xl0: <3Com 3c905C-TX Fast Etherlink XL> port 0x2000-0x207f mem 0xd3007c00-0xd3007c7f irq 19 at device 8.0 on pci2
miibus0: <MII bus> on xl0
ukphy0: <Generic IEEE 802.3u media interface> on miibus0
ukphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl0: Ethernet address: 00:e0:81:20:cc:de
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model Generic PS/2 mouse, device ID 0
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A, console
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
orm0: <ISA Option ROMs> at iomem 0xc0000-0xce7ff,0xdc000-0xdffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x100>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
uhub2: device problem (SET_ADDR_FAILED), disabling port 2
Timecounters tick every 1.000 msec
Waiting 15 seconds for SCSI devices to settle
uhub2: device problem (SET_ADDR_FAILED), disabling port 2
uhub2: port error, restarting port 2
uhub2: device problem (SET_ADDR_FAILED), disabling port 2
uhub2: port error, restarting port 2
uhub2: device problem (SET_ADDR_FAILED), disabling port 2
uhub2: port error, restarting port 2
da0 at ahc0 bus 0 target 0 lun 0
da0: <FUJITSU MAF3364L SUN36G 1213> Fixed Direct Access SCSI-2 device 
da0: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 34732MB (71132959 512 byte sectors: 64H 32S/T 34732C)
da2 at ahc0 bus 0 target 2 lun 0
da2: <HP 36.4GB A 80-F313 > Fixed Direct Access SCSI-2 device 
da2: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da2: 34732MB (71132960 512 byte sectors: 64H 32S/T 34732C)
da8 at ahc0 bus 0 target 8 lun 0
da8: <QUANTUM ATLAS10K2-TY367J DA40> Fixed Direct Access SCSI-3 device 
da8: 160.000MB/s transfers (80.000MHz, offset 127, 16bit), Tagged Queueing Enabled
da8: 34732MB (71132959 512 byte sectors: 64H 32S/T 34732C)
da9 at ahc0 bus 0 target 9 lun 0
da9: <IBM IC35L036UWD210-0 S5BS> Fixed Direct Access SCSI-3 device 
da9: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da9: 35003MB (71687340 512 byte sectors: 64H 32S/T 35003C)
SMP: AP CPU #1 Launched!



>Description:


This has happened twice in 24 hours, both times under load.  Both times it was building OpenOffice in an NFS-mounted /usr/ports.  In case it matters, it's a soft mount from another 6.0 box over an em(4) interface with an MTU of 9000.

Both times it was a panic from vnlru while trying to flush a vnode and both times it was a null-pointer dereference in nfs_putpages() at nfs_bio.c:301.  In both cases vp->v_data was null.  The vnode itself looks fine to my eyes, although there may well be FreeBSD-specific subtleties that I'm missing.

Below is the traceback and a dump of the vnode in question.  I'm keeping the cores around, so if there's more information needed, I'll be happy to provide it.

(kgdb) bt
#0  doadump () at pcpu.h:165
#1  0xc05397b7 in boot (howto=0x104) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xc0539b0d in panic (fmt=0xc06bf3da "%s") at /usr/src/sys/kern/kern_shutdown.c:555
#3  0xc06a0c2a in trap_fatal (frame=0xe7e1c958, eva=0x4) at /usr/src/sys/i386/i386/trap.c:831
#4  0xc06a0937 in trap_pfault (frame=0xe7e1c958, usermode=0x0, eva=0x4) at /usr/src/sys/i386/i386/trap.c:742
#5  0xc06a054d in trap (frame=
      {tf_fs = 0x8, tf_es = 0x28, tf_ds = 0x28, tf_edi = 0x1000, tf_esi = 0x1, tf_ebp = 0xe7e1c9fc, tf_isp = 0xe7e1c984, tf_ebx = 0x0, tf_edx = 0x0, tf_ecx = 0x1, tf_eax = 0x1000, tf_trapno = 0xc, tf_err = 0x0, tf_eip = 0xc05fad40, tf_cs = 0x20, tf_eflags = 0x10246, tf_esp = 0xe7e1cab0, tf_ss = 0xc43f8d20}) at /usr/src/sys/i386/i386/trap.c:432
#6  0xc068de8a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc05fad40 in nfs_putpages (ap=0xe7e1ca20) at /usr/src/sys/nfsclient/nfs_bio.c:301
#8  0xc06aacc9 in VOP_PUTPAGES_APV (vop=0x1000, a=0x0) at vnode_if.c:2164
#9  0xc0667fa4 in vnode_pager_putpages (object=0xc806139c, m=0x1000, count=0x1000, sync=0x5, rtvals=0x1000)
    at vnode_if.h:1119
#10 0xc0662faf in vm_pageout_flush (mc=0xe7e1cab0, count=0x1, flags=0x5) at vm_pager.h:147
#11 0xc065f0d4 in vm_object_page_collect_flush (object=0xc806139c, p=0xc2b921b0, curgeneration=0x1, pagerflags=0x5)
    at /usr/src/sys/vm/vm_object.c:950
#12 0xc065eb44 in vm_object_page_clean (object=0xc806139c, start=0x0, end=Unhandled dwarf expression opcode 0x93
) at /usr/src/sys/vm/vm_object.c:753
#13 0xc065e7e0 in vm_object_terminate (object=0xc806139c) at /usr/src/sys/vm/vm_object.c:608
#14 0xc06662ca in vnode_destroy_vobject (vp=0xc403baa0) at /usr/src/sys/vm/vnode_pager.c:166
#15 0xc05fdd69 in nfs_reclaim (ap=0x1000) at /usr/src/sys/nfsclient/nfs_node.c:247
#16 0xc06aa80c in VOP_RECLAIM_APV (vop=0x1000, a=0x0) at vnode_if.c:1589
#17 0xc059236d in vgonel (vp=0xc403baa0) at vnode_if.h:818
#18 0xc058ed4d in vlrureclaim (mp=0xc3d10000) at /usr/src/sys/kern/vfs_subr.c:611
#19 0xc058f1cf in vnlru_proc () at /usr/src/sys/kern/vfs_subr.c:724
#20 0xc052354d in fork_exit (callout=0xc058efbc <vnlru_proc>, arg=0x0, frame=0xe7e1cd38)
    at /usr/src/sys/kern/kern_fork.c:789
#21 0xc068deec in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:208
(kgdb)  print /x *vp
$1 = {
  v_type = 0x1, 
  v_tag = 0xc06deecb, 
  v_op = 0xc0715da0, 
  v_data = 0x0, 
  v_mount = 0xc3d10000, 
  v_nmntvnodes = {
    tqe_next = 0x0, 
    tqe_prev = 0xc7b959a4
  }, 
  v_un = {
    vu_mount = 0x0, 
    vu_socket = 0x0, 
    vu_cdev = 0x0, 
    vu_fifoinfo = 0x0
  }, 
  v_hashlist = {
    le_next = 0x0, 
    le_prev = 0xc3763808
  }, 
  v_hash = 0x11463ccc, 
  v_cache_src = {
    lh_first = 0x0
  }, 
  v_cache_dst = {
    tqh_first = 0xc7b8cd48, 
    tqh_last = 0xc7b8cd58
  }, 
  v_dd = 0x0, 
  v_cstart = 0x0, 
  v_lasta = 0x0, 
  v_lastw = 0x0, 
  v_clen = 0x0, 
  v_lock = {
    lk_interlock = 0xc072c824, 
    lk_flags = 0x40080, 
    lk_sharecount = 0x0, 
    lk_waitcount = 0x0, 
    lk_exclusivecount = 0x1, 
    lk_prio = 0x50, 
    lk_wmesg = 0xc06deecb, 
    lk_timo = 0x33, 
    lk_lockholder = 0xc35add80, 
    lk_newlock = 0x0
  }, 
  v_interlock = {
    mtx_object = {
      lo_class = 0xc0707704, 
      lo_name = 0xc06d8e89, 
      lo_type = 0xc06d8e89, 
      lo_flags = 0x30000, 
      lo_list = {
        tqe_next = 0x0, 
        tqe_prev = 0x0
      }, 
      lo_witness = 0x0
    }, 
    mtx_lock = 0x4, 
    mtx_recurse = 0x0
  }, 
  v_vnlock = 0xc403baf8, 
  v_holdcnt = 0x2, 
  v_usecount = 0x0, 
  v_iflag = 0x480, 
  v_vflag = 0x0, 
  v_writecount = 0x0, 
  v_freelist = {
    tqe_next = 0x0, 
    tqe_prev = 0x0
  }, 
  v_bufobj = {
    bo_mtx = 0xc403bb1c, 
    bo_clean = {
      bv_hd = {
        tqh_first = 0x0, 
        tqh_last = 0xc403bb64
      }, 
      bv_root = 0x0, 
      bv_cnt = 0x0
    }, 
    bo_dirty = {
      bv_hd = {
        tqh_first = 0x0, 
        tqh_last = 0xc403bb74
      }, 
      bv_root = 0x0, 
      bv_cnt = 0x0
    }, 
    bo_numoutput = 0x0, 
    bo_flag = 0x0, 
    bo_ops = 0xc071608c, 
    bo_bsize = 0x2000, 
    bo_object = 0xc806139c, 
    bo_synclist = {
      le_next = 0x0, 
      le_prev = 0xc3e64b98
    }, 
    bo_private = 0xc403baa0, 
    __bo_vnode = 0xc403baa0
  }, 
  v_pollinfo = 0x0, 
  v_label = 0x0
}



>How-To-Repeat:





>Fix:


None, alas.


>Release-Note:
>Audit-Trail:
>Unformatted:



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1130221206.0>