Date: Tue, 21 Aug 2012 17:09:28 -0400
From: Roger Hammerstein <cheeky.m@live.com>
To: <freebsd-fs@freebsd.org>
Subject: panic while zfs scrubbing
Message-ID: <BAY170-W8668C02B4DAF69B54EE657F9B80@phx.gbl>
I have a zpool where scrub seems to cause panics.
I do not have zfs in rc.conf, but import manually on boot.
I start a scrub on a zpool, and some time through will get a panic and reboot.
After panic and reboot, re-importing the pool and allowing the scrub to restart on its own will cause another panic.
So I import and immediately stop the scrub for now.

ls -la *.{9,8,10}
-rw-------  1 root  wheel      150744 Aug 21 16:46 core.txt.10
-rw-------  1 root  wheel      147280 Aug 21 11:04 core.txt.8
-rw-------  1 root  wheel      148572 Aug 21 14:53 core.txt.9
-rw-------  1 root  wheel         457 Aug 21 16:45 info.10
-rw-------  1 root  wheel         456 Aug 21 11:04 info.8
-rw-------  1 root  wheel         458 Aug 21 14:52 info.9
-rw-------  1 root  wheel   643919872 Aug 21 16:46 vmcore.10
-rw-------  1 root  wheel   767168512 Aug 21 11:04 vmcore.8
-rw-------  1 root  wheel  1097850880 Aug 21 14:53 vmcore.9

9.1-BETA1 FreeBSD 9.1-BETA1 #34: Thu Jul 12 05:57:44 EDT 2012 amd64
4GB of ram, 4gb of swap.

panic: integer divide fault

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:

Fatal trap 18: integer divide fault while in kernel mode
cpuid = 5; apic id = 05
instruction pointer     = 0x20:0xffffffff81674a14
stack pointer           = 0x28:0xffffff810c3d4520
frame pointer           = 0x28:0xffffff810c3d4540
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 9480 (txg_thread_enter)
trap number             = 18
panic: integer divide fault
cpuid = 5
KDB: stack backtrace:
#0 0xffffffff80920346 at kdb_backtrace+0x66
#1 0xffffffff808ea35e at panic+0x1ce
#2 0xffffffff80bd7a30 at trap_fatal+0x290
#3 0xffffffff80bd80c5 at trap+0x105
#4 0xffffffff80bc295f at calltrap+0x8
#5 0xffffffff816818cf at vdev_mirror_io_start+0x2bf
#6 0xffffffff81699542 at zio_vdev_io_start+0x232
#7 0xffffffff81698fe3 at zio_execute+0xc3
#8 0xffffffff8165ea1c at dsl_scan_scrub_cb+0x3ec
#9 0xffffffff8165fe14 at dsl_scan_visitbp+0x534
#10 0xffffffff8165fd99 at dsl_scan_visitbp+0x4b9
#11 0xffffffff81660c84 at dsl_scan_visitdnode+0x84
#12 0xffffffff81660070 at dsl_scan_visitbp+0x790
#13 0xffffffff8165fd99 at dsl_scan_visitbp+0x4b9
#14 0xffffffff8165fd99 at dsl_scan_visitbp+0x4b9
#15 0xffffffff8165fd99 at dsl_scan_visitbp+0x4b9
#16 0xffffffff8165fd99 at dsl_scan_visitbp+0x4b9
#17 0xffffffff8165fd99 at dsl_scan_visitbp+0x4b9
Uptime: 1h51m55s
Dumping 614 out of 3818 MB:..3%..11%..21%..32%..42%..53%..63%..71%..81%..92%

Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /boot/kernel/zfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/opensolaris.ko
#0  doadump (textdump=Variable "textdump" is not available.
) at pcpu.h:224
224     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) #0  doadump (textdump=Variable "textdump" is not available.
) at pcpu.h:224
#1  0xffffffff808e9e41 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:448
#2  0xffffffff808ea337 in panic (fmt=0x1 <Address 0x1 out of bounds>)
    at /usr/src/sys/kern/kern_shutdown.c:636
#3  0xffffffff80bd7a30 in trap_fatal (frame=0x12, eva=Variable "eva" is not available.
) at /usr/src/sys/amd64/amd64/trap.c:857
#4  0xffffffff80bd80c5 in trap (frame=0xffffff810c3d4470)
    at /usr/src/sys/amd64/amd64/trap.c:599
#5  0xffffffff80bc295f in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:228
#6  0xffffffff81674a14 in spa_get_random (range=0)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c:1165
#7  0xffffffff816818cf in vdev_mirror_io_start (zio=0xfffffe0037e5e000)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:89
#8  0xffffffff81699542 in zio_vdev_io_start (zio=0xfffffe0037e5e000)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:2305
#9  0xffffffff81698fe3 in zio_execute (zio=0xfffffe0037e5e000)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1196
#10 0xffffffff8165ea1c in dsl_scan_scrub_cb (dp=0xffffff810c3d4538,
    bp=0xffffff8003c53480, zb=0xffffff810c3d4970)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:1737
#11 0xffffffff8165fe14 in dsl_scan_visitbp (bp=0xffffff8003c53480,
    zb=0xffffff810c3d4970, dnp=0xffffff8003642200, pbuf=Variable "pbuf" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:858
#12 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xffffff8003642240,
    zb=0xffffff810c3d4a00, dnp=0xffffff8003642200, pbuf=Variable "pbuf" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
#13 0xffffffff81660c84 in dsl_scan_visitdnode (scn=0xfffffe001523dc00,
    ds=0xfffffe0037abf400, ostype=DMU_OST_ZFS, dnp=0xffffff8003642200,
    buf=0xfffffe00befda9c0, object=291417, tx=0xfffffe00151fc400)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:770
#14 0xffffffff81660070 in dsl_scan_visitbp (bp=0xffffff800359b900,
    zb=0xffffff810c3d4cb0, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:718
#15 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xffffff80033e5380,
    zb=0xffffff810c3d4e10, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
#16 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xffffff80033df000,
    zb=0xffffff810c3d4f70, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
#17 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xffffff80033db000,
    zb=0xffffff810c3d50d0, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
#18 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xffffff8003451000,
    zb=0xffffff810c3d5230, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
#19 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xffffff80033d7000,
    zb=0xffffff810c3d5390, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
#20 0xffffffff8165fd99 in dsl_scan_visitbp (bp=0xfffffe0008076040,
    zb=0xffffff810c3d5420, dnp=0xfffffe0008076000, pbuf=Variable "pbuf" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:684
#21 0xffffffff81660c84 in dsl_scan_visitdnode (scn=0xfffffe001523dc00,
    ds=0xfffffe0037abf400, ostype=DMU_OST_ZFS, dnp=0xfffffe0008076000,
    buf=0xfffffe00375996e8, object=0, tx=0xfffffe00151fc400)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:770
#22 0xffffffff8165ff9a in dsl_scan_visitbp (bp=0xfffffe003729e280,
    zb=0xffffff810c3d55f0, dnp=0x0, pbuf=Variable "pbuf" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:736
#23 0xffffffff816600d7 in dsl_scan_visit_rootbp (scn=Variable "scn" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:872
#24 0xffffffff81660172 in dsl_scan_visitds (scn=0xfffffe001523dc00, dsobj=21,
    tx=0xfffffe00151fc400)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:1099
#25 0xffffffff81660695 in dsl_scan_sync (dp=0xfffffe0037335000,
    tx=0xfffffe00151fc400)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c:1355
#26 0xffffffff81667e30 in spa_sync (spa=0xfffffe0008161000, txg=97010)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:5711
#27 0xffffffff81678749 in txg_sync_thread (arg=Variable "arg" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c:423
#28 0xffffffff808bb4cf in fork_exit (
    callout=0xffffffff81678610 <txg_sync_thread>, arg=0xfffffe0037335000,
    frame=0xffffff810c3d5c40) at /usr/src/sys/kern/kern_fork.c:992
#29 0xffffffff80bc2e8e in fork_trampoline ()
    at /usr/src/sys/amd64/amd64/exception.S:602
#30 0x0000000000000000 in ?? ()
#31 0x0000000000000000 in ?? ()
#32 0x0000000000000001 in ?? ()
#33 0x0000000000000000 in ?? ()
#34 0x0000000000000000 in ?? ()
#35 0x0000000000000000 in ?? ()
#36 0x0000000000000000 in ?? ()
#37 0x0000000000000000 in ?? ()
#38 0x0000000000000000 in ?? ()
#39 0x0000000000000000 in ?? ()
#40 0x0000000000000000 in ?? ()
#41 0x0000000000000000 in ?? ()
#42 0x0000000000000000 in ?? ()
#43 0x0000000000000000 in ?? ()
#44 0x0000000000000000 in ?? ()
#45 0x0000000000000000 in ?? ()
#46 0x0000000000000000 in ?? ()
#47 0x0000000000000000 in ?? ()
#48 0x0000000000000000 in ?? ()
#49 0x0000000000000000 in ?? ()
#50 0x0000000000000000 in ?? ()
#51 0x0000000000000000 in ?? ()
#52 0x0000000000000000 in ?? ()
#53 0x0000000000000000 in ?? ()
#54 0x0000000000000005 in ?? ()
#55 0xffffffff81242b00 in tdq_cpu ()
#56 0xfffffe0015e9d470 in ?? ()
#57 0x0000000000000000 in ?? ()
#58 0xffffff810c3d4580 in ?? ()
#59 0xffffff810c3d4528 in ?? ()
#60 0xfffffe00028848e0 in ?? ()
#61 0xffffffff80912fce in sched_switch (td=0xfffffe00370b1470,
    newtd=0xfffffe0037335000, flags=Variable "flags" is not available.
) at /usr/src/sys/kern/sched_ule.c:1921
Previous frame inner to this frame (corrupt stack?)
(kgdb)

  pool: zzzz
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub canceled on Tue Aug 21 16:53:03 2012
config:

        NAME        STATE     READ WRITE CKSUM
        zzzz        ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            ada3    ONLINE       0     0     0
            ada7    ONLINE       0     0     0
            ada6    ONLINE       0     0     0
            ada9    ONLINE       0     0     0
            ada4    ONLINE       0     0     0
            ada2    ONLINE       0     0     0
            ada5    ONLINE       0     0     0

errors: 4 data errors, use '-v' for a list

The data errors will go away if the scrub completes; it has shown that before.
And yes, here: 'zpool clear zzzz'

  pool: zzzz
 state: ONLINE
  scan: scrub canceled on Tue Aug 21 17:02:53 2012
config:

        NAME        STATE     READ WRITE CKSUM
        zzzz        ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            ada3    ONLINE       0     0     0
            ada7    ONLINE       0     0     0
            ada6    ONLINE       0     0     0
            ada9    ONLINE       0     0     0
            ada4    ONLINE       0     0     0
            ada2    ONLINE       0     0     0
            ada5    ONLINE       0     0     0

errors: No known data errors

The machine passes 'memtest' memory check of over 12 hours.
Bad disk ?
One of the disks has command errors, but no pending sectors to reallocate
in smartctl output, and there are no disk errors in /var/log/messages.

Two sata port multipliers.

pmp0 at siisch0 bus 0 scbus6 target 15 lun 0
pmp0: <Port Multiplier 37261095 1706> ATA-0 device
pmp0: 300.000MB/s transfers (SATA 2.x, NONE, PIO 8192bytes)
pmp0: 5 fan-out ports
pmp1 at siisch4 bus 0 scbus10 target 15 lun 0
pmp1: <Port Multiplier 37261095 1706> ATA-0 device
pmp1: 300.000MB/s transfers (SATA 2.x, NONE, PIO 8192bytes)
pmp1: 5 fan-out ports
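
For anyone reading the trace: frame #6 shows spa_get_random() entered with range=0, and the trap is an integer divide fault. That is at least consistent with a remainder being taken by zero, though the trace alone does not prove it. The sketch below only illustrates that failure mode and one way to guard against it. It is not the ZFS source; pick_below() and its parameters are hypothetical names invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (NOT the ZFS code): reduce a raw random value
 * into the half-open interval [0, range).
 *
 * On amd64 an integer remainder with a zero divisor raises a divide
 * error instead of returning a value, so the zero case is rejected
 * before the '%' is evaluated.
 */
static bool
pick_below(uint64_t raw, uint64_t range, uint64_t *out)
{
	if (range == 0)
		return (false);		/* raw % 0 would trap; leave *out untouched */

	*out = raw % range;
	return (true);
}
```

A caller would check the return value and treat a zero range as an error in its own input rather than letting the remainder execute.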