Date: Wed, 22 Aug 2012 16:11:32 +0300
From: Andriy Gapon
To: Roger Hammerstein
Cc: freebsd-fs@FreeBSD.org
Subject: Re: panic while zfs scrubbing
Message-ID: <5034DA84.8050507@FreeBSD.org>

on 22/08/2012 00:09 Roger Hammerstein said the following:
>
>
> I have a zpool where scrub seems to cause panics.
>
> I do not have zfs in rc.conf, but import manually
> on boot.
>
> I start a scrub on a zpool, and some time through will get a panic
> and reboot.
> After panic and reboot, re-importing the pool and allowing
> the scrub to restart on its own will cause another panic.
> So I import and immediately stop the scrub for now.
>
> ls -la *.{9,8,10}
> -rw------- 1 root wheel 150744 Aug 21 16:46 core.txt.10
> -rw------- 1 root wheel 147280 Aug 21 11:04 core.txt.8
> -rw------- 1 root wheel 148572 Aug 21 14:53 core.txt.9
> -rw------- 1 root wheel 457 Aug 21 16:45 info.10
> -rw------- 1 root wheel 456 Aug 21 11:04 info.8
> -rw------- 1 root wheel 458 Aug 21 14:52 info.9
> -rw------- 1 root wheel 643919872 Aug 21 16:46 vmcore.10
> -rw------- 1 root wheel 767168512 Aug 21 11:04 vmcore.8
> -rw------- 1 root wheel 1097850880 Aug 21 14:53 vmcore.9
>
>
> 9.1-BETA1 FreeBSD 9.1-BETA1 #34: Thu Jul 12 05:57:44 EDT 2012
> amd64
> 4GB of ram, 4gb of swap.
>
>
> panic: integer divide fault
>
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
>
>
> Fatal trap 18: integer divide fault while in kernel mode
> cpuid = 5; apic id = 05
> instruction pointer = 0x20:0xffffffff81674a14
> stack pointer = 0x28:0xffffff810c3d4520
> frame pointer = 0x28:0xffffff810c3d4540
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 9480 (txg_thread_enter)
> trap number = 18
> panic: integer divide fault
> cpuid = 5
>
> KDB: stack backtrace:
> #0 0xffffffff80920346 at kdb_backtrace+0x66
> #1 0xffffffff808ea35e at panic+0x1ce
> #2 0xffffffff80bd7a30 at trap_fatal+0x290
> #3 0xffffffff80bd80c5 at trap+0x105
> #4 0xffffffff80bc295f at calltrap+0x8
> #5 0xffffffff816818cf at vdev_mirror_io_start+0x2bf
> #6 0xffffffff81699542 at zio_vdev_io_start+0x232
> #7 0xffffffff81698fe3 at zio_execute+0xc3
> #8 0xffffffff8165ea1c at dsl_scan_scrub_cb+0x3ec
> #9 0xffffffff8165fe14 at dsl_scan_visitbp+0x534
> #10 0xffffffff8165fd99 at dsl_scan_visitbp+0x4b9
> #11 0xffffffff81660c84 at dsl_scan_visitdnode+0x84
> #12 0xffffffff81660070 at dsl_scan_visitbp+0x790
> #13 0xffffffff8165fd99 at dsl_scan_visitbp+0x4b9
> #14 0xffffffff8165fd99 at dsl_scan_visitbp+0x4b9
> #15 0xffffffff8165fd99 at dsl_scan_visitbp+0x4b9
> #16 0xffffffff8165fd99 at dsl_scan_visitbp+0x4b9
> #17 0xffffffff8165fd99 at dsl_scan_visitbp+0x4b9
> Uptime: 1h51m55s
> Dumping 614 out of 3818 MB:..3%..11%..21%..32%..42%..53%..63%..71%..81%..92%
>
>
> Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /boot/kernel/zfs.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/zfs.ko
> Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/opensolaris.ko
> #0 doadump (textdump=Variable "textdump" is not available.
> ) at pcpu.h:224
> 224 pcpu.h: No such file or directory.
>     in pcpu.h
> (kgdb) #0 doadump (textdump=Variable "textdump" is not available.
> ) at pcpu.h:224
> #1 0xffffffff808e9e41 in kern_reboot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:448
> #2 0xffffffff808ea337 in panic (fmt=0x1
)
>     at /usr/src/sys/kern/kern_shutdown.c:636
> #3 0xffffffff80bd7a30 in trap_fatal (frame=0x12, eva=Variable "eva" is not available.
> )
>     at /usr/src/sys/amd64/amd64/trap.c:857
> #4 0xffffffff80bd80c5 in trap (frame=0xffffff810c3d4470)
>     at /usr/src/sys/amd64/amd64/trap.c:599
> #5 0xffffffff80bc295f in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:228
> #6 0xffffffff81674a14 in spa_get_random (range=0)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c:1165

I am not sure what triggers this problem, but it looks like a zio is issued for
a block pointer with no valid DVA. That is either the result of a logic bug in
the ZFS code or of severe on-disk corruption.

> #7 0xffffffff816818cf in vdev_mirror_io_start (zio=0xfffffe0037e5e000)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:89

Could you please print *zio and *zio->io_bp in this frame?
It might also be a good idea to report this issue to zfs-discuss@opensolaris.org.

> #8 0xffffffff81699542 in zio_vdev_io_start (zio=0xfffffe0037e5e000)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:2305
> #9 0xffffffff81698fe3 in zio_execute (zio=0xfffffe0037e5e000)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1196
> #10 0xffffffff8165ea1c in dsl_scan_scrub_cb (dp=0xffffff810c3d4538,
>     bp=0xffffff8003c53480, zb=0xffffff810c3d4970)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:1737

Please print *bp and *scn in this frame too.

> #11 0xffffffff8165fe14 in dsl_scan_visitbp (bp=0xffffff8003c53480,
>     zb=0xffffff810c3d4970, dnp=0xffffff8003642200, pbuf=Variable "pbuf" is not available.
> )
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:858
> #12 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xffffff8003642240,
>     zb=0xffffff810c3d4a00, dnp=0xffffff8003642200, pbuf=Variable "pbuf" is not available.
> )
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
> #13 0xffffffff81660c84 in dsl_scan_visitdnode (scn=0xfffffe001523dc00,
>     ds=0xfffffe0037abf400, ostype=DMU_OST_ZFS, dnp=0xffffff8003642200,
>     buf=0xfffffe00befda9c0, object=291417, tx=0xfffffe00151fc400)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:770
> #14 0xffffffff81660070 in dsl_scan_visitbp (bp=0xffffff800359b900,
>     zb=0xffffff810c3d4cb0, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
> )
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:718
> #15 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xffffff80033e5380,
>     zb=0xffffff810c3d4e10, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
> )
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
> #16 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xffffff80033df000,
>     zb=0xffffff810c3d4f70, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
> )
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
> #17 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xffffff80033db000,
>     zb=0xffffff810c3d50d0, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
> )
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
> #18 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xffffff8003451000,
>     zb=0xffffff810c3d5230, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
> )
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
> #19 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xffffff80033d7000,
>     zb=0xffffff810c3d5390, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
> )
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
> #20 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xfffffe0008076040,
>     zb=0xffffff810c3d5420, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
> )
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
> #21 0xffffffff81660c84 in dsl_scan_visitdnode (scn=0xfffffe001523dc00,
>     ds=0xfffffe0037abf400, ostype=DMU_OST_ZFS, dnp=0xfffffe0008076000,
>     buf=0xfffffe00375996e8, object=0, tx=0xfffffe00151fc400)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:770
> #22 0xffffffff8165ff9a in dsl_scan_visitbp (bp=0xfffffe003729e280,
>     zb=0xffffff810c3d55f0, dnp=0x0, pbuf=Variable "pbuf" is not available.
> )
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:736
> #23 0xffffffff816600d7 in dsl_scan_visit_rootbp (scn=Variable "scn" is not available.
> )
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:872
> #24 0xffffffff81660172 in dsl_scan_visitds (scn=0xfffffe001523dc00, dsobj=21,
>     tx=0xfffffe00151fc400)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:1099
> #25 0xffffffff81660695 in dsl_scan_sync (dp=0xfffffe0037335000,
>     tx=0xfffffe00151fc400)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:1355
> #26 0xffffffff81667e30 in spa_sync (spa=0xfffffe0008161000, txg=97010)
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:5711
> #27 0xffffffff81678749 in txg_sync_thread (arg=Variable "arg" is not available.
> )
>     at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c:423
> #28 0xffffffff808bb4cf in fork_exit (
>     callout=0xffffffff81678610 , arg=0xfffffe0037335000,
>     frame=0xffffff810c3d5c40) at /usr/src/sys/kern/kern_fork.c:992
> #29 0xffffffff80bc2e8e in fork_trampoline ()
>     at /usr/src/sys/amd64/amd64/exception.S:602

[snip]

-- 
Andriy Gapon
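
A minimal sketch of the failure mode discussed in the reply. Frame #6 shows spa_get_random() called with range=0, and reducing a value modulo zero raises an integer divide fault on amd64. The C fragment below is a simplified model, assuming (as the reply suggests) that the range is derived from the number of valid DVAs in the block pointer. The type model_bp_t and the functions model_count_dvas, model_pick_child and model_self_check are hypothetical names for illustration and are not the ZFS source.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_DVAS_PER_BP 3	/* hypothetical constant for this model */

/* Hypothetical, simplified block pointer: one allocated size per DVA slot. */
typedef struct model_bp {
	uint64_t asize[MODEL_DVAS_PER_BP];
} model_bp_t;

/* Count the DVA slots in use (non-zero allocated size in this model). */
int
model_count_dvas(const model_bp_t *bp)
{
	int n = 0;

	for (int i = 0; i < MODEL_DVAS_PER_BP; i++) {
		if (bp->asize[i] != 0)
			n++;
	}
	return (n);
}

/*
 * Map a random value r to a child index in [0, range).
 * Returns -1 when range is 0 instead of evaluating r % 0,
 * which is the operation that would trap.
 */
int
model_pick_child(uint64_t r, uint64_t range)
{
	if (range == 0)
		return (-1);
	return ((int)(r % range));
}

/* Exercise both the normal case and the empty block pointer case. */
void
model_self_check(void)
{
	model_bp_t good = { { 512, 512, 0 } };
	model_bp_t empty = { { 0, 0, 0 } };

	assert(model_count_dvas(&good) == 2);
	assert(model_pick_child(7, (uint64_t)model_count_dvas(&good)) == 1);
	assert(model_count_dvas(&empty) == 0);
	assert(model_pick_child(7, (uint64_t)model_count_dvas(&empty)) == -1);
}
```

The guard only shows where the arithmetic goes wrong. Whether the empty block pointer comes from a code bug or from on-disk corruption still has to be determined from the *zio, *zio->io_bp, *bp and *scn output requested above.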