From owner-freebsd-bugs@FreeBSD.ORG Thu Oct 2 21:15:42 2008
Message-ID: <48E535D8.4030101@freebsd.org>
Date: Thu, 02 Oct 2008 22:58:00 +0200
From: Volker Werth
To: Weldon Godfrey
Cc: freebsd-bugs@FreeBSD.org
References: <200810012106.m91L6jq2007417@freefall.freebsd.org>
Subject: Re: kern/125149: [zfs][nfs] changing into .zfs dir from nfs client causes endless panic loop

On 10/02/08 21:05, Weldon Godfrey wrote:
> Yes, I can replicate statting .zfs dir from NFS client causes FreeBSD to
> panic and reboot, this time from CentOS 5.0 box.
...
>
>
> Replicate:
>
> [root@asmtp2 ~]# df
> Filesystem           1K-blocks       Used  Available Use% Mounted on
> /dev/mapper/VolGroup00-LogVol00
>                       60817412    2814548   54863692   5% /
> /dev/sda1               101086      28729      67138  30% /boot
> tmpfs                  2008628          0    2008628   0% /dev/shm
> 192.168.2.22:/vol/enamail
>                     1286702144 1032758816  253943328  81%
> /var/spool/mail
> 192.168.2.21:/vol/exports/gaggle
>                      400959408  144327584  256631824  36%
> /var/spool/mail/archive/gaggle
> 192.168.2.36:/export/store1-1
>                     1413955712    4619136 1409336576   1%
> /var/spool/mail/store1-1
> [root@asmtp2 ~]#
> [root@asmtp2 ~]#
> [root@asmtp2 ~]# cd /var/spool/mail/store1-1
> [root@asmtp2 store1-1]# ls
> 1 2 3 4 5 6 7 8 9 crap
> [root@asmtp2 store1-1]# cd .zfs
> [root@asmtp2 .zfs]# ls
> (FreeBSD ZFS server panics here)
>
> Weldon
>
> Backtrace:
>
> store1# kgdb /usr/obj/usr/src/sys/GENERIC/kernel.debug vmcore.27
> [GDB will not be able to debug user-mode threads:
> /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you
> are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for
> details.
> This GDB was configured as "amd64-marcel-freebsd".
>
> Unread portion of the kernel message buffer:
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 5; apic id = 05
> fault virtual address   = 0x108
> fault code              = supervisor write data, page not present
> instruction pointer     = 0x8:0xffffffff804f06fa
> stack pointer           = 0x10:0xffffffffdf761590
> frame pointer           = 0x10:0x4
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 807 (nfsd)
> trap number             = 12
> panic: page fault
> cpuid = 5
> Uptime: 1m19s
> Physical memory: 16367 MB
> Dumping 891 MB: 876 860 844 828 812 796 780 764 748 732 716 700 684 668
> 652 636 620 604 588 572 556 540 524 508 492 476 460 444 428 412 396 380
> 364 348 332 316 300 284 268 252 236 220 204 188 172 156 140 124 108 92
> 76 60 44 28 12
>
> #0 doadump () at pcpu.h:194
> 194 pcpu.h: No such file or directory.
> in pcpu.h
> (kgdb) vt
> Undefined command: "vt". Try "help".
> (kgdb) bt
> #0 doadump () at pcpu.h:194
> #1 0x0000000000000004 in ?? ()
> #2 0xffffffff80477699 in boot (howto=260) at
> /usr/src/sys/kern/kern_shutdown.c:409
> #3 0xffffffff80477a9d in panic (fmt=0x104
bounds>) at /usr/src/sys/kern/kern_shutdown.c:563
> #4 0xffffffff8072ed24 in trap_fatal (frame=0xffffff00059a0340,
> eva=18446742974291977320)
> at /usr/src/sys/amd64/amd64/trap.c:724
> #5 0xffffffff8072f0f5 in trap_pfault (frame=0xffffffffdf7614e0,
> usermode=0) at /usr/src/sys/amd64/amd64/trap.c:641
> #6 0xffffffff8072fa38 in trap (frame=0xffffffffdf7614e0) at
> /usr/src/sys/amd64/amd64/trap.c:410
> #7 0xffffffff807156ae in calltrap () at
> /usr/src/sys/amd64/amd64/exception.S:169
> #8 0xffffffff804f06fa in vput (vp=0x0) at atomic.h:142
> #9 0xffffffff8060670d in nfsrv_readdirplus (nfsd=0xffffff000584f100,
> slp=0xffffff0005725900,
> td=0xffffff00059a0340, mrq=0xffffffffdf761af0) at
> /usr/src/sys/nfsserver/nfs_serv.c:3613
> #10 0xffffffff80615a5d in nfssvc (td=Variable "td" is not available.
> ) at /usr/src/sys/nfsserver/nfs_syscalls.c:461
> #11 0xffffffff8072f377 in syscall (frame=0xffffffffdf761c70) at
> /usr/src/sys/amd64/amd64/trap.c:852
> #12 0xffffffff807158bb in Xfast_syscall () at
> /usr/src/sys/amd64/amd64/exception.S:290
> #13 0x000000080068746c in ?? ()
> Previous frame inner to this frame (corrupt stack?)
>
>

Weldon,

can you please try the following from kgdb and send the output:

(kgdb) frame 9
(kgdb) list
(kgdb) p *vp
(kgdb) p *dp
(kgdb) frame 8
(kgdb) list

Please keep the core dump, as we might need to check some variable
values later.

I think the problem is the NULL pointer passed to vput() (vp=0x0 in
frame 8). A maintainer needs to check how nvp can end up NULL (judging
from my fresh codebase, and assuming it is not too different from
yours).

Thanks

Volker
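
To illustrate the reasoning about the trap message: with vp=0x0 in frame 8, a fault address of 0x108 is what one would expect if vput() wrote to a field sitting 0x108 bytes into the structure. The C sketch below only demonstrates that arithmetic and a defensive NULL check. struct fake_vnode, lock_word, touched_address() and release_if_set() are made-up names for this example, not the real struct vnode layout or the nfs_serv.c code, and the 0x108 offset is taken from the panic output rather than from the kernel headers.

```c
#include <assert.h>   /* assert, for callers that want to check results */
#include <stddef.h>   /* offsetof, NULL */
#include <stdint.h>   /* uintptr_t */

/*
 * Hypothetical stand-in for a kernel object. This is NOT the real
 * struct vnode; the padding only places a field at offset 0x108 so
 * that the arithmetic matches the fault address in the panic message.
 */
struct fake_vnode {
	char pad[0x108];	/* made-up filler */
	int  lock_word;		/* made-up field that a release routine writes */
};

/*
 * Address that a write to vp->lock_word would touch, computed without
 * dereferencing vp. Assumes a null pointer converts to integer 0,
 * which holds on common platforms.
 */
static uintptr_t
touched_address(const struct fake_vnode *vp)
{
	return (uintptr_t)vp + offsetof(struct fake_vnode, lock_word);
}

/*
 * Guarded release: returns 0 and does nothing when vp is NULL,
 * otherwise clears lock_word and returns 1.
 */
static int
release_if_set(struct fake_vnode *vp)
{
	if (vp == NULL)
		return 0;
	vp->lock_word = 0;
	return 1;
}
```

Under these assumptions, touched_address(NULL) evaluates to 0x108, and release_if_set(NULL) returns 0 without touching memory. Whether a NULL check is the right fix in nfsrv_readdirplus, or whether nvp should never be NULL at that point, is for the maintainer to decide.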