Date:      Tue, 12 Feb 2019 08:19:25 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 235683] ZFS kernel panic when access to data or scrub
Message-ID:  <bug-235683-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235683

            Bug ID: 235683
           Summary: ZFS kernel panic when access to data or scrub
           Product: Base System
           Version: 12.0-STABLE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: admin@5034.ru

Hi All,

I have a test machine:

# uname -ar
FreeBSD server.5034.ru 12.0-STABLE FreeBSD 12.0-STABLE r343904 SERVER  amd64

This server has a ZFS pool with an error:

# zpool status -v
  pool: zroot
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub canceled on Mon Feb 11 20:50:59 2019
config:

        NAME         STATE     READ WRITE CKSUM
        zroot        ONLINE       0     0     0
          gpt/disk0  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        zroot:<0x21008>


The kernel appears to panic when the OS tries to access the damaged data:

(pts/2)[root@server:/usr/obj/usr/src/amd64.amd64/sys/SERVER]# kgdb kernel /var/crash/vmcore.last
GNU gdb (GDB) 8.2.1 [GDB v8.2.1 for FreeBSD]
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd12.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from kernel...Reading symbols from
/usr/obj/usr/src/amd64.amd64/sys/SERVER/kernel.debug...done.
done.

Unread portion of the kernel message buffer:
panic: Solaris(panic): blkptr at 0xfffffe0059108980 DVA 0 has invalid OFFSET 72057594038013952
cpuid = 2
time = 1549919776
KDB: stack backtrace:
#0 0xffffffff80c531c7 at kdb_backtrace+0x67
#1 0xffffffff80c07143 at vpanic+0x1a3
#2 0xffffffff80c06f93 at panic+0x43
#3 0xffffffff826d418f at vcmn_err+0xcf
#4 0xffffffff827849ca at zfs_panic_recover+0x5a
#5 0xffffffff827c0223 at zfs_blkptr_verify+0x303
#6 0xffffffff827c030c at zio_read+0x2c
#7 0xffffffff8270cc54 at arc_read+0x704
#8 0xffffffff827195ed at dbuf_read+0x72d
#9 0xffffffff8271d16f at __dbuf_hold_impl+0x57f
#10 0xffffffff8271d37f at dbuf_hold+0x7f
#11 0xffffffff827240ec at dmu_buf_hold_noread_by_dnode+0x3c
#12 0xffffffff827242ac at dmu_buf_hold_by_dnode+0x1c
#13 0xffffffff827a69cd at zap_get_leaf_byblk+0x4d
#14 0xffffffff827a404f at fzap_lookup+0xcf
#15 0xffffffff827aac67 at zap_lookup_impl+0x117
#16 0xffffffff827aaad5 at zap_lookup_norm+0xa5
#17 0xffffffff827aaa21 at zap_lookup+0x11
Uptime: 2h5m13s
Dumping 1087 out of 8077 MB:..2%..11%..21%..31%..42%..51%..61%..71%..81%..92%

__curthread () at ./machine/pcpu.h:230
230             __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (OFFSETOF_CURTHREAD));
(kgdb) bt
#0  __curthread () at ./machine/pcpu.h:230
#1  doadump (textdump=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff80c06d2b in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:446
#3  0xffffffff80c071a3 in vpanic (fmt=<optimized out>, ap=0xfffffe0068148e10)
    at /usr/src/sys/kern/kern_shutdown.c:872
#4  0xffffffff80c06f93 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:799
#5  0xffffffff826d418f in vcmn_err (ce=<optimized out>,
    fmt=0xffffffff82878f37 "blkptr at %p DVA %u has invalid OFFSET %llu",
    adx=0xfffffe0068148fa0)
    at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:58
#6  0xffffffff827849ca in zfs_panic_recover (fmt=<unavailable>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c:1653
#7  0xffffffff827c0223 in zfs_blkptr_verify (spa=0xfffffe004fdba000,
    bp=0xfffffe0059108980)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:835
#8  0xffffffff827c030c in zio_read (pio=0xfffff8015aeda830,
    spa=0xfffffe004fdba000, bp=0xfffffe0059108980, data=0xfffff8020ac939c0,
    size=16384, done=0xffffffff8270ddd0 <arc_read_done>,
    private=0xfffff8013c686700, priority=ZIO_PRIORITY_SYNC_READ,
    flags=ZIO_FLAG_CANFAIL, zb=0xfffffe0068149120)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:879
#9  0xffffffff8270cc54 in arc_read (pio=0xfffff8015aeda830,
    spa=0xfffffe004fdba000, bp=0xfffffe0059108980, done=0xfffff8002f39fbc0,
    private=0xfffff8011bff2580, priority=ZIO_PRIORITY_SYNC_READ, zio_flags=128,
    arc_flags=0xfffffe0068149164, zb=0xfffffe0068149120)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:6080
#10 0xffffffff827195ed in dbuf_read_impl (db=<optimized out>,
    flags=<optimized out>, zio=<optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1310
#11 dbuf_read (db=0xfffff8002f5c0960, zio=0xfffff8015aeda830,
    flags=<optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1435
#12 0xffffffff8271d16f in dbuf_findbp (dn=<optimized out>,
    level=<optimized out>, blkid=10632, fail_sparse=<optimized out>,
    parentp=0xfffff801ad8ac038, bpp=<optimized out>, dh=0xfffff801ad8ac000)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:2506
#13 __dbuf_hold_impl (dh=0xfffff801ad8ac000)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:2931
#14 0xffffffff8271d37f in dbuf_hold_impl (dn=0xfffff8018ab6e000,
    level=0 '\000', blkid=10632, fail_sparse=0, fail_uncached=0, tag=0x0,
    dbp=0x0)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:3032
#15 dbuf_hold_level (dn=0xfffff8018ab6e000, level=0, blkid=10632, tag=0x0)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:3075
#16 dbuf_hold (dn=0xfffff8018ab6e000, blkid=10632, tag=0x0)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:3068
#17 0xffffffff827240ec in dmu_buf_hold_noread_by_dnode (dn=0xfffff8018ab6e000,
    offset=<optimized out>, tag=<unavailable>, dbp=0xfffffe00681492c8)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:160
#18 0xffffffff827242ac in dmu_buf_hold_by_dnode (dn=<unavailable>,
    offset=<unavailable>, tag=0x0, dbp=0xfffffe00681492c8, flags=1)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:208
#19 0xffffffff827a69cd in zap_get_leaf_byblk (zap=0xfffff80111de9200,
    blkid=10632, tx=0x0, lt=RW_READER, lp=0xfffffe0068149340)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c:498
#20 0xffffffff827a404f in zap_deref_leaf (zap=0xfffff80111de9200,
    h=<optimized out>, tx=<unavailable>, lt=RW_READER, lp=0xfffffe00681493e0)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c:580
#21 fzap_lookup (zn=0xfffff8011ce98200, integer_size=8, num_integers=1,
    buf=0xfffffe0068149490, realname=0x0, rn_len=0, ncp=0x0)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c:761
#22 0xffffffff827aac67 in zap_lookup_impl (zap=0xfffff80111de9200,
    name=<optimized out>, integer_size=8, num_integers=1,
    buf=0xfffffe0068149490, mt=<optimized out>, realname=0x0, rn_len=0,
    ncp=0x0)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:915
#23 0xffffffff827aaad5 in zap_lookup_norm (os=<optimized out>,
    zapobj=<optimized out>, name=<optimized out>, integer_size=8,
    num_integers=1, buf=0xfffffe0068149490, mt=(unknown: 0), realname=0x0,
    rn_len=0, ncp=0x0)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:954
#24 0xffffffff827aaa21 in zap_lookup (os=<unavailable>, zapobj=<unavailable>,
    name=<unavailable>, integer_size=<unavailable>,
    num_integers=<unavailable>, buf=<unavailable>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:898
#25 0xffffffff827d41f9 in zfs_match_find (zfsvfs=<optimized out>,
    dzp=<optimized out>, name=<optimized out>, mt=<optimized out>,
    zoid=<optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:81
#26 zfs_dirent_lookup (dzp=0xfffff8003812a110,
    name=0xfffffe0068149610 "0_NCiHyz6S.xml", zpp=0xfffffe00681494d8, flag=2)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:177
#27 0xffffffff827d42e7 in zfs_dirlook (dzp=0xfffff8003812a110,
    name=<unavailable>, zpp=0xfffffe0068149590)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:238
#28 0xffffffff827edacf in zfs_lookup (dvp=<optimized out>,
    nm=<optimized out>, vpp=<optimized out>, cnp=<optimized out>,
    nameiop=<optimized out>, cr=<optimized out>, td=<optimized out>,
    flags=<optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1658
#29 0xffffffff827ee1fe in zfs_freebsd_lookup (ap=0xfffffe0068149778)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4960
#30 0xffffffff81314ba8 in VOP_CACHEDLOOKUP_APV (vop=<optimized out>,
    a=0xfffffe0068149778) at vnode_if.c:195
#31 0xffffffff80cc0846 in VOP_CACHEDLOOKUP (dvp=<optimized out>,
    vpp=<optimized out>, cnp=<optimized out>) at ./vnode_if.h:80
#32 vfs_cache_lookup (ap=<optimized out>)
    at /usr/src/sys/kern/vfs_cache.c:2109
#33 0xffffffff81314a78 in VOP_LOOKUP_APV (vop=<optimized out>,
    a=0xfffffe0068149840) at vnode_if.c:127
#34 0xffffffff80cc9fa1 in VOP_LOOKUP (vpp=0xfffffe00681499d8,
    cnp=0xfffffe0068149a00, dvp=<optimized out>) at ./vnode_if.h:54
#35 lookup (ndp=0xfffffe0068149978)
    at /usr/src/sys/kern/vfs_lookup.c:879
#36 0xffffffff80cc948b in namei (ndp=0xfffffe0068149978)
    at /usr/src/sys/kern/vfs_lookup.c:444
#37 0xffffffff80ce06a6 in kern_accessat (td=<optimized out>, fd=-100,
    path=<optimized out>, pathseg=<optimized out>, flag=<optimized out>,
    amode=0)
    at /usr/src/sys/kern/vfs_syscalls.c:1986
#38 0xffffffff8118e592 in syscallenter (td=<optimized out>)
    at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:135
#39 amd64_syscall (td=0xfffff8011bff2580, traced=0)
    at /usr/src/sys/amd64/amd64/trap.c:1154
#40 <signal handler called>
#41 0x00000008003e703a in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffffffd628
(kgdb) frame 9
#9  0xffffffff8270cc54 in arc_read (pio=0xfffff8015aeda830,
    spa=0xfffffe004fdba000, bp=0xfffffe0059108980, done=0xfffff8002f39fbc0,
    private=0xfffff8011bff2580, priority=ZIO_PRIORITY_SYNC_READ, zio_flags=128,
    arc_flags=0xfffffe0068149164, zb=0xfffffe0068149120)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:6080
6080                    rzio = zio_read(pio, spa, bp, hdr->b_l1hdr.b_pabd, size,
(kgdb) frame 8
#8  0xffffffff827c030c in zio_read (pio=0xfffff8015aeda830,
    spa=0xfffffe004fdba000, bp=0xfffffe0059108980, data=0xfffff8020ac939c0,
    size=16384, done=0xffffffff8270ddd0 <arc_read_done>,
    private=0xfffff8013c686700, priority=ZIO_PRIORITY_SYNC_READ,
    flags=ZIO_FLAG_CANFAIL, zb=0xfffffe0068149120)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:879
879             zfs_blkptr_verify(spa, bp);
(kgdb) frame 7
#7  0xffffffff827c0223 in zfs_blkptr_verify (spa=0xfffffe004fdba000,
    bp=0xfffffe0059108980)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:835
835                             zfs_panic_recover("blkptr at %p DVA %u has invalid "
(kgdb) frame 6
#6  0xffffffff827849ca in zfs_panic_recover (fmt=<unavailable>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c:1653
1653            vcmn_err(zfs_recover ? CE_WARN : CE_PANIC, fmt, adx);
(kgdb) frame 5
#5  0xffffffff826d418f in vcmn_err (ce=<optimized out>,
    fmt=0xffffffff82878f37 "blkptr at %p DVA %u has invalid OFFSET %llu",
    adx=0xfffffe0068148fa0)
    at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:58
58                      panic("%s%s", prefix, buf);
(kgdb) frame 4
#4  0xffffffff80c06f93 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:799
799             vpanic(fmt, ap);
(kgdb)
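
For reference, the OFFSET value in the panic message can be taken apart with a few lines of Python. This is only a sketch of the arithmetic: the constant is copied from the panic message, the helper name split_high_bit is made up for illustration, and reading the result as "a small offset with one stray high bit" is an assumption that the dump itself does not confirm.

```python
# OFFSET value in bytes, copied from the panic message.
PANIC_OFFSET = 72057594038013952


def split_high_bit(value):
    """Return (index of the highest set bit, remainder below that bit)."""
    top = value.bit_length() - 1
    return top, value - (1 << top)


top_bit, remainder = split_high_bit(PANIC_OFFSET)

print(hex(PANIC_OFFSET))                    # 0x100000000015000
print(top_bit, remainder)                   # 56 86016
print(PANIC_OFFSET // 2**50, "PiB")         # 64 PiB
```

The value is 2^56 + 86016, i.e. a little over 64 PiB. That is far beyond anything a single-disk pool such as zroot could address, which would be consistent with zfs_blkptr_verify rejecting the block pointer.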


Also, I tried to start a scrub, but the server went into a reboot loop: write
dump, reboot, write dump, reboot, and so on. I had to stop the scrub manually.

-- 
You are receiving this mail because:
You are the assignee for the bug.


