From owner-freebsd-fs@FreeBSD.ORG Tue Nov 27 20:32:30 2012
Date: Tue, 27 Nov 2012 13:32:25 -0700
In-Reply-To: <50B50B04.8020109@FreeBSD.org>
References: <50B50B04.8020109@FreeBSD.org>
Subject: Re: 
ZFS: Panic when attempting to delete certain data
From: Josh Beard
To: Andriy Gapon
Cc: freebsd-fs@freebsd.org

On Tue, Nov 27, 2012 at 11:48 AM, Andriy Gapon wrote:

> on 27/11/2012 20:25 Josh Beard said the following:
> > Hello,
> >
> > I have a system that I can consistently reproduce a panic on when trying to
> > delete certain data. The data is data that was rsynced from another system
> > - nothing terribly unique. This has been ongoing for several months,
> > starting with 9.0-RELEASE and now running 9.1-RC3.
> >
> > I can't find anything in common with the files that I can trigger the
> > panics with. One is a simple gzipped archive, while some are plain text.
> > Strangely, I can only reproduce it with data that was rsynced from that
> > particular system (which is a Mac).
>
> Josh,
>
> I am collecting these cases, thank you for another one.
> I had an interesting investigation of
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/173747
> Unfortunately, for some reason the whole conversation stayed private.
> I see that you also opened a PR earlier:
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/170238
>
> Could you please provide the following info?
> From kgdb:
> - list in frame 7 (zfs_freebsd_remove), so that I can see the code line
> - local variables from frame 7 (info local)
>

Andriy,

Thanks for your quick response. I've never used kgdb, so forgive my
ignorance here. Is this what you're looking for? If not, could you
elaborate on those?

#7  0xffffffff80ebd45a in zfs_freebsd_remove (ap=Variable "ap" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1855
1855            dmu_tx_hold_sa(tx, xzp->z_sa_hdl, B_FALSE);
(kgdb) list zfs_freebsd_remove
5796            struct vop_remove_args /* {
5797                    struct vnode *a_dvp;
5798                    struct vnode *a_vp;
5799                    struct componentname *a_cnp;
5800            } */ *ap;
5801    {
5802
5803            ASSERT(ap->a_cnp->cn_flags & SAVENAME);
5804
5805            return (zfs_remove(ap->a_dvp, ap->a_cnp->cn_nameptr,
(kgdb) info frame 7
Stack frame at 0xffffff8466a6a920:
 rip = 0xffffffff80ebd45a in zfs_freebsd_remove
    (/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1855);
    saved rip 0xffffffff8081cf13
 called by frame at 0xffffff8466a6a940, caller of frame at 0xffffff8466a6a7a0
 source language c.
 Arglist at 0xffffff8466a6a910, args: ap=Variable "ap" is not available.

> Also, for one of the files that trigger the problem:
> - ls -i to obtain its inode number
> - zdb -ddddd
>

# ls -i kyofilter\ v2.2.pax.gz
(This is a symlink. The file that it's linked to does *not* panic the
system if I try to delete it.)
247126 kyofilter v2.2.pax.gz

# zdb -ddddd store/tdxs1 247126
Dataset store/tdxs1 [ZPL], ID 109, cr_txg 35014, 1.33T, 1106389 objects, rootbp DVA[0]=<0:80001a2400:400> DVA[1]=<0:30800610000:400> [L0 DMU objset] fletcher4 lzjb LE contiguous unique double size=800L/200P birth=1166838L/1166838P fill=1106389 cksum=19391f0f67:78eb24a9cca:1439005549d01:275015332d1bdf

    Object  lvl   iblk   dblk  dsize  lsize   %full  type
    247126    1    16K    512      0    512    0.00  ZFS plain file
                                        201   bonus  System attributes
        dnode flags: USERUSED_ACCOUNTED
        dnode maxblkid: 0
        path    /tech/2012-09-14-01-00/Drivers/Kyocera/.old/C2126.old/Kyocera OS X 10.5+ Web build 2011.01.27.mpkg/Contents/Packages/Kyocera OS X subinstaller.mpkg/Contents/Packages/kyofilter v2.2.pkg/Contents/Resources/kyofilter v2.2.pax.gz
        uid     1001
        gid     80
        atime   Tue Nov 27 13:27:57 2012
        mtime   Tue Jul 12 14:17:16 2011
        ctime   Fri Sep 14 01:05:23 2012
        crtime  Fri Sep 14 01:04:11 2012
        gen     81338
        mode    120755
        size    17
        parent  247122
        links   1
        pflags  40800000104
        xattr   155
Indirect blocks:

Thank you.

>
> > I seriously doubt it's hardware at this point, as virtually every piece of
> > hardware in that system has been replaced (including motherboard and
> > drives). That said, the zpools were rebuilt from scratch when the drives
> > were replaced and the issue persists.
> >
> > I can't seem to trigger it with other actions, such as chmod, chown, or
> > even mv. Simply attempting to unlink the files seems to do it.
> >
> > # uname -a (I can reproduce on a GENERIC kernel, too).
> > FreeBSD bksys1 9.1-RC3 FreeBSD 9.1-RC3 #0 r242591: Sun Nov 4 19:17:25 MST
> > 2012 root@bksys1:/usr/obj/usr/src/sys/BKSYS191 amd64
> >
> > zpool version is 28; zfs version is 5.
> >
> > /boot/loader.conf doesn't have anything related in it, and an empty one
> > produces the same results.
> >
> > zpool scrubs are done weekly and have returned no errors (most recent was 3
> > days ago).
> >
> > Any insight is very appreciated!
> >
> > Josh
> >
> >
> > The message:
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 3; apic id = 05
> > fault virtual address   = 0x160
> > fault code              = supervisor read data, page not present
> > instruction pointer     = 0x20:0xffffffff80ebd45a
> > stack pointer           = 0x28:0xffffff8466534850
> > frame pointer           = 0x28:0xffffff8466534910
> > code segment            = base 0x0, limit 0xfffff, type 0x1b
> >                         = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags        = interrupt enabled, resume, IOPL = 0
> > current process         = 3245 (rm)
> > trap number             = 12
> > panic: page fault
> > cpuid = 3
> > KDB: stack backtrace:
> > #0 0xffffffff80585c28 at kdb_backtrace+0x68
> > #1 0xffffffff805502cb at panic+0x21b
> > #2 0xffffffff807a9fad at trap_fatal+0x39d
> > #3 0xffffffff807aa0f0 at trap_pfault+0x120
> > #4 0xffffffff807aa7e9 at trap+0x3d9
> > #5 0xffffffff80794f4f at calltrap+0x8
> > #6 0xffffffff8081cf13 at VOP_REMOVE_APV+0x53
> > #7 0xffffffff805ed355 at kern_unlinkat+0x265
> > #8 0xffffffff805ed419 at kern_unlink+0x19
> > #9 0xffffffff805ed431 at sys_unlink+0x11
> > #10 0xffffffff807a95bd at amd64_syscall+0x2fd
> > #11 0xffffffff80795237 at Xfast_syscall+0xf7
> > Uptime: 14m42s
> > Dumping 2432 out of 16361
> > MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
> >
> > Reading symbols from /boot/kernel/coretemp.ko...Reading symbols from
> > /boot/kernel/coretemp.ko.symbols...done.
> > done.
> > Loaded symbols for /boot/kernel/coretemp.ko
> > Reading symbols from /boot/kernel/zfs.ko...Reading symbols from
> > /boot/kernel/zfs.ko.symbols...done.
> > done.
> > Loaded symbols for /boot/kernel/zfs.ko
> > Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from
> > /boot/kernel/opensolaris.ko.symbols...done.
> > done.
> > Loaded symbols for /boot/kernel/opensolaris.ko
> > Reading symbols from /boot/kernel/if_lagg.ko...Reading symbols from
> > /boot/kernel/if_lagg.ko.symbols...done.
> > done.
> > Loaded symbols for /boot/kernel/if_lagg.ko
> > Reading symbols from /boot/kernel/ng_ubt.ko...Reading symbols from
> > /boot/kernel/ng_ubt.ko.symbols...done.
> > done.
> > Loaded symbols for /boot/kernel/ng_ubt.ko
> > Reading symbols from /boot/kernel/ng_hci.ko...Reading symbols from
> > /boot/kernel/ng_hci.ko.symbols...done.
> > done.
> > Loaded symbols for /boot/kernel/ng_hci.ko
> > Reading symbols from /boot/kernel/ng_bluetooth.ko...Reading symbols from
> > /boot/kernel/ng_bluetooth.ko.symbols...done.
> > done.
> > Loaded symbols for /boot/kernel/ng_bluetooth.ko
> > Reading symbols from /boot/kernel/netgraph.ko...Reading symbols from
> > /boot/kernel/netgraph.ko.symbols...done.
> > done.
> > Loaded symbols for /boot/kernel/netgraph.ko
> > Reading symbols from /boot/kernel/ng_l2cap.ko...Reading symbols from
> > /boot/kernel/ng_l2cap.ko.symbols...done.
> > done.
> > Loaded symbols for /boot/kernel/ng_l2cap.ko
> > Reading symbols from /boot/kernel/ng_btsocket.ko...Reading symbols from
> > /boot/kernel/ng_btsocket.ko.symbols...done.
> > done.
> > Loaded symbols for /boot/kernel/ng_btsocket.ko
> > Reading symbols from /boot/kernel/ng_socket.ko...Reading symbols from
> > /boot/kernel/ng_socket.ko.symbols...done.
> > done.
> > Loaded symbols for /boot/kernel/ng_socket.ko
> > Reading symbols from /boot/kernel/blank_saver.ko...Reading symbols from
> > /boot/kernel/blank_saver.ko.symbols...done.
> > done.
> > Loaded symbols for /boot/kernel/blank_saver.ko
> > #0 doadump (textdump=Variable "textdump" is not available.
> > ) at pcpu.h:224
> > 224 pcpu.h: No such file or directory.
> > in pcpu.h
> > (kgdb) #0 doadump (textdump=Variable "textdump" is not available.
> > ) at pcpu.h:224
> > #1 0xffffffff8054ff87 in kern_reboot (howto=260)
> >     at /usr/src/sys/kern/kern_shutdown.c:448
> > #2 0xffffffff8055030f in panic (fmt=Variable "fmt" is not available.
> > )
> >     at /usr/src/sys/kern/kern_shutdown.c:636
> > #3 0xffffffff807a9fad in trap_fatal (frame=0xffffff84665347a0, eva=352)
> >     at /usr/src/sys/amd64/amd64/trap.c:857
> > #4 0xffffffff807aa0f0 in trap_pfault (frame=0xffffff84665347a0, usermode=0)
> >     at /usr/src/sys/amd64/amd64/trap.c:714
> > #5 0xffffffff807aa7e9 in trap (frame=0xffffff84665347a0)
> >     at /usr/src/sys/amd64/amd64/trap.c:456
> > #6 0xffffffff80794f4f in calltrap ()
> >     at /usr/src/sys/amd64/amd64/exception.S:228
> > #7 0xffffffff80ebd45a in zfs_freebsd_remove (ap=Variable "ap" is not
> > available.
> > )
> >     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1855
> > #8 0xffffffff8081cf13 in VOP_REMOVE_APV (vop=Variable "vop" is not
> > available.
> > ) at vnode_if.c:1333
> > #9 0xffffffff805ed355 in kern_unlinkat (td=0xfffffe000c4b1000, fd=-100,
> >     path=0x7fffffffdb2e ,
> >     pathseg=UIO_USERSPACE, oldinum=0) at vnode_if.h:575
> > #10 0xffffffff805ed419 in kern_unlink (td=Variable "td" is not available.
> > )
> >     at /usr/src/sys/kern/vfs_syscalls.c:1897
> > #11 0xffffffff805ed431 in sys_unlink (td=Variable "td" is not available.
> > )
> >     at /usr/src/sys/kern/vfs_syscalls.c:1867
> > #12 0xffffffff807a95bd in amd64_syscall (td=0xfffffe000c4b1000, traced=0)
> >     at subr_syscall.c:135
> > #13 0xffffffff80795237 in Xfast_syscall ()
> >     at /usr/src/sys/amd64/amd64/exception.S:387
> > #14 0x00000008009100bc in ?? ()
> > Previous frame inner to this frame (corrupt stack?)
> > (kgdb)
> [snip]
>
> --
> Andriy Gapon
>
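
As a quick sanity check on two numbers in the output above, the following
Python sketch decodes the zdb "mode" field and compares the fault address
with the eva value from the trap_fatal frame. The variable names are
illustrative only, and the reading of the small fault address in the
comments is an assumption drawn from the numbers, not something confirmed
in this thread.

```python
import stat

# Values copied from the zdb and panic output in this message.
mode = 0o120755      # "mode 120755"; zdb prints the mode in octal
fault_addr = 0x160   # "fault virtual address = 0x160"
eva = 352            # "eva=352" in the trap_fatal frame (decimal)

# The file-type bits say whether object 247126 is a symlink,
# which is what the "ls -i" note above reports.
is_symlink = stat.S_ISLNK(mode)

# The low bits are the ordinary permission bits.
perms = stat.S_IMODE(mode)

# The two fault addresses should be the same number in different bases.
same_address = (fault_addr == eva)

# Assumption: an address this close to zero is what a member access
# through a NULL struct pointer would typically produce. This is only
# a plausible reading of the panic, not a diagnosis.
print(is_symlink, oct(perms), same_address)
```

The type bits agree with the note that the file is a symlink, and a size
of 17 would then be the length of the link target rather than of any
archive data.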