Date: Tue, 27 Nov 2012 20:48:36 +0200
From: Andriy Gapon <avg@FreeBSD.org>
To: Josh Beard <josh@signalboxes.net>
Cc: freebsd-fs@FreeBSD.org
Subject: Re: ZFS: Panic when attempting to delete certain data
Message-ID: <50B50B04.8020109@FreeBSD.org>
In-Reply-To: <CAHDrHStcfSJ-9ueSV%2BFujEsmAK3zMX2CAGVD6Xz_2gJAThu5Kg@mail.gmail.com>
References: <CAHDrHStcfSJ-9ueSV%2BFujEsmAK3zMX2CAGVD6Xz_2gJAThu5Kg@mail.gmail.com>

on 27/11/2012 20:25 Josh Beard said the following:
> Hello,
>
> I have a system that I can consistently reproduce a panic on when trying to
> delete certain data. The data is data that was rsynced from another system
> - nothing terribly unique. This has been ongoing from several months,
> starting with 9.0-RELEASE and now running 9.1-RC3.
>
> I can't find anything in common with the files that I can trigger the
> panics with. One is a simple gzipped archive where some are plain text.
> Strangely, I can only reproduce it with data that was rsynced from that
> particular system (which is a Mac).

Josh,

I am collecting these cases, thank you for another one.
I had an interesting investigation of
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/173747
Unfortunately, for some reason the whole conversation stayed private.

I see that you also opened a PR earlier:
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/170238

Could you please provide the following info?

From kgdb:
- list in frame 7 (zfs_freebsd_remove), so that I can see the code line
- local variables from frame 7 (info local)

Also, for one of the files that trigger the problem:
- ls -i to obtain its inode number
- zdb -ddddd <dataset name> <inode number>

Thank you.

> I seriously doubt it's hardware at this point, as virtually every piece of
> hardware in that system has been replaced (including motherboard and
> drives). That said, the zpools were rebuilt from scratch when the drives
> were replaced and the issue persists.
>
> I can't seem to trigger it with other actions, such as chmod, chown, or
> even mv. Simply attempting to unlink the files seems to do it.
>
> # uname -a (I can reproduce on a GENERIC kernel, too).
> FreeBSD bksys1 9.1-RC3 FreeBSD 9.1-RC3 #0 r242591: Sun Nov 4 19:17:25 MST
> 2012 root@bksys1:/usr/obj/usr/src/sys/BKSYS191 amd64
>
> zpool version is 28; zfs version is 5.
>
> /boot/loader.conf doesn't have anything related in it, and an empty one
> produces the same results.
>
> zpool scrubs are done weekly and have returned no errors (most recent was 3
> days ago).
>
> Any insight is very appreciated!
>
> Josh
>
>
> The message:
> Fatal trap 12: page fault while in kernel mode
> cpuid = 3; apic id = 05
> fault virtual address = 0x160
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff80ebd45a
> stack pointer = 0x28:0xffffff8466534850
> frame pointer = 0x28:0xffffff8466534910
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 3245 (rm)
> trap number = 12
> panic: page fault
> cpuid = 3
> KDB: stack backtrace:
> #0 0xffffffff80585c28 at kdb_backtrace+0x68
> #1 0xffffffff805502cb at panic+0x21b
> #2 0xffffffff807a9fad at trap_fatal+0x39d
> #3 0xffffffff807aa0f0 at trap_pfault+0x120
> #4 0xffffffff807aa7e9 at trap+0x3d9
> #5 0xffffffff80794f4f at calltrap+0x8
> #6 0xffffffff8081cf13 at VOP_REMOVE_APV+0x53
> #7 0xffffffff805ed355 at kern_unlinkat+0x265
> #8 0xffffffff805ed419 at kern_unlink+0x19
> #9 0xffffffff805ed431 at sys_unlink+0x11
> #10 0xffffffff807a95bd at amd64_syscall+0x2fd
> #11 0xffffffff80795237 at Xfast_syscall+0xf7
> Uptime: 14m42s
> Dumping 2432 out of 16361
> MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>
> Reading symbols from /boot/kernel/coretemp.ko...Reading symbols from
> /boot/kernel/coretemp.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/coretemp.ko
> Reading symbols from /boot/kernel/zfs.ko...Reading symbols from
> /boot/kernel/zfs.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/zfs.ko
> Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from
> /boot/kernel/opensolaris.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/opensolaris.ko
> Reading symbols from /boot/kernel/if_lagg.ko...Reading symbols from
> /boot/kernel/if_lagg.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/if_lagg.ko
> Reading symbols from /boot/kernel/ng_ubt.ko...Reading symbols from
> /boot/kernel/ng_ubt.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/ng_ubt.ko
> Reading symbols from /boot/kernel/ng_hci.ko...Reading symbols from
> /boot/kernel/ng_hci.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/ng_hci.ko
> Reading symbols from /boot/kernel/ng_bluetooth.ko...Reading symbols from
> /boot/kernel/ng_bluetooth.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/ng_bluetooth.ko
> Reading symbols from /boot/kernel/netgraph.ko...Reading symbols from
> /boot/kernel/netgraph.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/netgraph.ko
> Reading symbols from /boot/kernel/ng_l2cap.ko...Reading symbols from
> /boot/kernel/ng_l2cap.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/ng_l2cap.ko
> Reading symbols from /boot/kernel/ng_btsocket.ko...Reading symbols from
> /boot/kernel/ng_btsocket.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/ng_btsocket.ko
> Reading symbols from /boot/kernel/ng_socket.ko...Reading symbols from
> /boot/kernel/ng_socket.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/ng_socket.ko
> Reading symbols from /boot/kernel/blank_saver.ko...Reading symbols from
> /boot/kernel/blank_saver.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/blank_saver.ko
> #0 doadump (textdump=Variable "textdump" is not available.
> ) at pcpu.h:224
> 224 pcpu.h: No such file or directory.
> in pcpu.h
> (kgdb) #0 doadump (textdump=Variable "textdump" is not available.
> ) at pcpu.h:224
> #1 0xffffffff8054ff87 in kern_reboot (howto=260)
> at /usr/src/sys/kern/kern_shutdown.c:448
> #2 0xffffffff8055030f in panic (fmt=Variable "fmt" is not available.
> )
> at /usr/src/sys/kern/kern_shutdown.c:636
> #3 0xffffffff807a9fad in trap_fatal (frame=0xffffff84665347a0, eva=352)
> at /usr/src/sys/amd64/amd64/trap.c:857
> #4 0xffffffff807aa0f0 in trap_pfault (frame=0xffffff84665347a0, usermode=0)
> at /usr/src/sys/amd64/amd64/trap.c:714
> #5 0xffffffff807aa7e9 in trap (frame=0xffffff84665347a0)
> at /usr/src/sys/amd64/amd64/trap.c:456
> #6 0xffffffff80794f4f in calltrap ()
> at /usr/src/sys/amd64/amd64/exception.S:228
> #7 0xffffffff80ebd45a in zfs_freebsd_remove (ap=Variable "ap" is not
> available.
> )
> at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1855
> #8 0xffffffff8081cf13 in VOP_REMOVE_APV (vop=Variable "vop" is not
> available.
> ) at vnode_if.c:1333
> #9 0xffffffff805ed355 in kern_unlinkat (td=0xfffffe000c4b1000, fd=-100,
> path=0x7fffffffdb2e <Address 0x7fffffffdb2e out of bounds>,
> pathseg=UIO_USERSPACE, oldinum=0) at vnode_if.h:575
> #10 0xffffffff805ed419 in kern_unlink (td=Variable "td" is not available.
> )
> at /usr/src/sys/kern/vfs_syscalls.c:1897
> #11 0xffffffff805ed431 in sys_unlink (td=Variable "td" is not available.
> )
> at /usr/src/sys/kern/vfs_syscalls.c:1867
> #12 0xffffffff807a95bd in amd64_syscall (td=0xfffffe000c4b1000, traced=0)
> at subr_syscall.c:135
> #13 0xffffffff80795237 in Xfast_syscall ()
> at /usr/src/sys/amd64/amd64/exception.S:387
> #14 0x00000008009100bc in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb)

[snip]

--
Andriy Gapon
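
The per-file information requested in the reply (ls -i, then zdb -ddddd with the dataset name and inode number) could be gathered roughly as follows. This is only a sketch under stated assumptions: the helper name zdb_cmd_for, the dataset name tank/backup and the file path are hypothetical placeholders that do not come from the thread. The helper prints the zdb command rather than running it, because zdb needs the real dataset name and usually root privileges.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the zdb invocation
# for one file that triggers the panic.
#   $1 = dataset name (placeholder, e.g. tank/backup)
#   $2 = path to the file
zdb_cmd_for() {
    dataset=$1
    file=$2
    # ls -i prints "<inode number> <name>"; keep only the number.
    inode=$(ls -i "$file" | awk '{print $1}')
    # Print the command so it can be reviewed before being run as root.
    echo "zdb -ddddd $dataset $inode"
}

# Example with placeholder names:
#   zdb_cmd_for tank/backup /tank/backup/some-file.gz
#
# The kgdb part of the request is interactive; at the (kgdb) prompt the
# standard gdb commands would be:
#   frame 7
#   list
#   info locals
```

Printing the command instead of executing it keeps the helper safe to run as an ordinary user, and the output can be pasted into a root shell once the dataset name has been checked.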