Date:      Tue, 24 Feb 2009 14:09:09 +0000
From:      "Edward Fisk" <7ogcg7g02@sneakemail.com>
To:        FreeBSD-gnats-submit@FreeBSD.org
Subject:   kern/132068: page fault when using ZFS over NFS on 7.1-RELEASE/amd64
Message-ID:  <31611-61615@sneakemail.com>
Resent-Message-ID: <200902241440.n1OEe35P056169@freefall.freebsd.org>

>Number:         132068
>Category:       kern
>Synopsis:       page fault when using ZFS over NFS on 7.1-RELEASE/amd64
>Confidential:   no
>Severity:       critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Feb 24 14:40:03 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator:     Edward Fisk
>Release:        FreeBSD 7.1-RELEASE amd64
>Organization:
>Environment:
System: FreeBSD testbox.XXX 7.1-RELEASE FreeBSD 7.1-RELEASE #0: Thu Jan 1 08:58:24 UTC 2009 root@driscoll.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64

>Description:

FreeBSD 7.1-RELEASE/amd64, GENERIC kernel, 16GB RAM, 2x dual-core AMD 2.8GHz CPUs

Standard install on UFS volume, but with one 500GB ZFS volume, exported
via NFS to various clients (FreeBSD, Debian Linux, Mac OS X).

No ZFS snapshots in use.

After anywhere from a few hours to a few days of use, the system will
either hang or panic.  Running iozone on the clients against the
NFS-mounted volume does not appear to make the failure more or less likely.

If you require any more information or testing, please let me know.  --Ed

# dmesg
CPU: Dual-Core AMD Opteron(tm) Processor 2220 (2800.12-MHz K8-class CPU)
usable memory = 17166274560 (16371 MB)

# zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	tank        ONLINE       0     0     0
	  aacd0s1f  ONLINE       0     0     0

errors: No known data errors

# cat /etc/sysctl.conf
kern.maxvnodes=400000

# cat /boot/loader.conf
vm.kmem_size_max="1024M"
vm.kmem_size="1024M"
vfs.zfs.arc_max="100M"

# kgdb kernel.debug /var/crash/vmcore.0
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address	= 0x1c4
fault code		= supervisor read data, page not present
instruction pointer	= 0x8:0xffffffff80651673
stack pointer	        = 0x10:0xffffffffdd9d6600
frame pointer	        = 0x10:0xffffff00341efa38
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 754 (nfsd)
trap number		= 12
panic: page fault
cpuid = 1
Uptime: 19d19h23m22s
Physical memory: 16371 MB
Dumping 1425 MB: 1410 1394 1378 1362 1346 1330 1314 1298 1282 1266 1250 1234 1218 1202 1186 1170 1154 1138 1122 1106 1090 1074 1058 1042 1026 1010 994 978 962 946 930 914 898 882 866 850 834 818 802 786 770 754 738 722 706 690 674 658 642 626 610 594 578 562 546 530 514 498 482 466 450 434 418 402 386 370 354 338 322 306 290 274 258 242 226 210 194 178 162 146 130 114 98 82 66 50 34 18 2

Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /boot/kernel/zfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/opensolaris.ko
Reading symbols from /boot/kernel/pf.ko...Reading symbols from /boot/kernel/pf.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/pf.ko
#0  doadump () at pcpu.h:195
195		__asm __volatile("movq %%gs:0,%0" : "=r" (td));
(kgdb) bt
#0  doadump () at pcpu.h:195
#1  0x0000000000000004 in ?? ()
#2  0xffffffff804b4ce9 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#3  0xffffffff804b50f2 in panic (fmt=0x104 <Address 0x104 out of bounds>) at /usr/src/sys/kern/kern_shutdown.c:574
#4  0xffffffff8078a173 in trap_fatal (frame=0xffffff0003793370, eva=Variable "eva" is not available.
) at /usr/src/sys/amd64/amd64/trap.c:764
#5  0xffffffff8078a545 in trap_pfault (frame=0xffffffffdd9d6550, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:680
#6  0xffffffff8078ae88 in trap (frame=0xffffffffdd9d6550) at /usr/src/sys/amd64/amd64/trap.c:449
#7  0xffffffff8077067e in calltrap () at /usr/src/sys/amd64/amd64/exception.S:209
#8  0xffffffff80651673 in nfsrv_readdirplus (nfsd=0xffffff0041546700, slp=0xffffff0079a7ad00, td=0xffffff0003793370, mrq=0xffffffffdd9d6b00) at /usr/src/sys/nfsserver/nfs_serv.c:3645
#9  0xffffffff8065e6dd in nfssvc (td=Variable "td" is not available.
) at /usr/src/sys/nfsserver/nfs_syscalls.c:456
#10 0xffffffff8078a7c7 in syscall (frame=0xffffffffdd9d6c80) at /usr/src/sys/amd64/amd64/trap.c:907
#11 0xffffffff8077088b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:330
#12 0x0000000800687d5c in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) list *0xffffffff80651673
0xffffffff80651673 is in nfsrv_readdirplus (/usr/src/sys/nfsserver/nfs_serv.c:3645).
3640				 */
3641				if (VFS_VGET(vp->v_mount, dp->d_fileno, LK_EXCLUSIVE,
3642				    &nvp))
3643					goto invalid;
3644				bzero((caddr_t)nfhp, NFSX_V3FH);
3645				nfhp->fh_fsid =
3646					nvp->v_mount->mnt_stat.f_fsid;
3647				/*
3648				 * XXXRW: Assert the mountpoints are the same so that
3649				 * we know that acquiring Giant based on the
(kgdb) frame 8
#8  0xffffffff80651673 in nfsrv_readdirplus (nfsd=0xffffff0041546700, slp=0xffffff0079a7ad00, td=0xffffff0003793370, mrq=0xffffffffdd9d6b00) at /usr/src/sys/nfsserver/nfs_serv.c:3645
3645				nfhp->fh_fsid =
(kgdb) p *vp
$1 = {v_type = VDIR, v_tag = 0xffffffffdd978d4d "zfs", v_op = 0xffffffffdd97bda0, v_data = 0xffffff00240d4870, v_mount = 0xffffff0003d4f6f0, v_nmntvnodes = {tqe_next = 0xffffff004cb92bd0,
    tqe_prev = 0xffffff0003d9a610}, v_un = {vu_mount = 0x0, vu_socket = 0x0, vu_cdev = 0x0, vu_fifoinfo = 0x0, vu_yield = 0}, v_hashlist = {le_next = 0x0, le_prev = 0x0}, v_hash = 0,
  v_cache_src = {lh_first = 0xffffff003b891148}, v_cache_dst = {tqh_first = 0x0, tqh_last = 0xffffff0051425a38}, v_dd = 0x0, v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0, v_lock = {
    lk_object = {lo_name = 0xffffffffdd978d4d "zfs", lo_type = 0xffffffffdd978d4d "zfs", lo_flags = 70844416, lo_witness_data = {lod_list = {stqe_next = 0x0}, lod_witness = 0x0}},
    lk_interlock = 0xffffffff80ab32b0, lk_flags = 64, lk_sharecount = 0, lk_waitcount = 0, lk_exclusivecount = 0, lk_prio = 80, lk_timo = 51, lk_lockholder = 0xffffffffffffffff,
    lk_newlock = 0x0}, v_interlock = {lock_object = {lo_name = 0xffffffff8085132a "vnode interlock", lo_type = 0xffffffff8085132a "vnode interlock", lo_flags = 16973824,
      lo_witness_data = {lod_list = {stqe_next = 0x0}, lod_witness = 0x0}}, mtx_lock = 4, mtx_recurse = 0}, v_vnlock = 0xffffff0051425a70, v_holdcnt = 2, v_usecount = 1, v_iflag = 0,
  v_vflag = 0, v_writecount = 0, v_freelist = {tqe_next = 0xffffff0045f415e8, tqe_prev = 0xffffff0045f12f08}, v_bufobj = {bo_mtx = 0xffffff0051425ac0, bo_clean = {bv_hd = {
        tqh_first = 0x0, tqh_last = 0xffffff0051425b30}, bv_root = 0x0, bv_cnt = 0}, bo_dirty = {bv_hd = {tqh_first = 0x0, tqh_last = 0xffffff0051425b50}, bv_root = 0x0, bv_cnt = 0},
    bo_numoutput = 0, bo_flag = 0, bo_ops = 0xffffffff80a336a0, bo_bsize = 131072, bo_object = 0xffffff00511165b0, bo_synclist = {le_next = 0x0, le_prev = 0x0},
    bo_private = 0xffffff00514259d8, __bo_vnode = 0xffffff00514259d8}, v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0}
(kgdb) p *dp
$3 = {d_fileno = 69115, d_reclen = 52, d_type = 8 '\b', d_namlen = 43 '+',
  d_name = "1202493068.863_1.mailserver1.XXXXXXXXXXX:2,\000\250 \002\0008\000\b,1234513680.8945_1.mailserver1.XXXXXXXXXXX:2,\000:2,\301<\001\0008\000\b.1223469304.21605_1.mailserver1.XXXXXXXXXXX:2,S\0002\034\005\001\0008\000\b,1199341929.2724_2.mailserver1.XXXXXX"...}

>How-To-Repeat:

>Fix:

>Release-Note:
>Audit-Trail:
>Unformatted:


