From owner-freebsd-bugs@freebsd.org Mon Aug 26 20:29:46 2019
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 240134] [ZFS] Kernel panic while importing zpool (blkptr at has invalid COMPRESS 127)
Date: Mon, 26 Aug 2019 20:29:45 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240134

            Bug ID: 240134
           Summary: [ZFS] Kernel panic while importing zpool (blkptr at
                    has invalid COMPRESS 127)
           Product: Base System
           Version: 12.0-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Keywords: panic
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: demik+freebsd@lostwave.net

Hello,

One of my systems is stuck in a reboot loop: it kernel panics every time while importing the zpool (root-on-ZFS).
Root pool is a ZFS mirror.

This happened a few days (hours?) after upgrading the root pool from FreeBSD 11 to 12. Not sure if it's related or not.

The issue is reproducible on other systems (ZFS mirror). Tried a set of x86_64 and powerpc64 systems: same issue everywhere.

Here is the kernel panic:

ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
Solaris: WARNING: blkptr at 0xfffffe001be3a800 has invalid COMPRESS 127
Solaris: WARNING: blkptr at 0xfffffe001be3a800 has invalid ETYPE 255

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x88
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff828f01b5
stack pointer           = 0x28:0xfffffe00005f5710
frame pointer           = 0x28:0xfffffe00005f5750
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 828 (zpool)
trap number             = 12
panic: page fault
cpuid = 0
time = 1566854520
KDB: stack backtrace:
#0 0xffffffff80be78d7 at kdb_backtrace+0x67
#1 0xffffffff80b9b4b3 at vpanic+0x1a3
#2 0xffffffff80b9b303 at panic+0x43
#3 0xffffffff81074bff at trap_fatal+0x35f
#4 0xffffffff81074c59 at trap_pfault+0x49
#5 0xffffffff8107427e at trap+0x29e
#6 0xffffffff8104f625 at calltrap+0x8
#7 0xffffffff8290267a at zio_checksum_verify+0x6a
#8 0xffffffff828fe2ec at zio_execute+0xbc
#9 0xffffffff82901d2c at zio_vdev_io_start+0x15c
#10 0xffffffff828fe2ec at zio_execute+0xbc
#11 0xffffffff828fdbfb at zio_nowait+0xcb
#12 0xffffffff82849c89 at arc_read+0x759
#13 0xffffffff8287353d at traverse_prefetch_metadata+0xbd
#14 0xffffffff828729ee at traverse_visitbp+0x3be
#15 0xffffffff82873623 at traverse_dnode+0xd3
#16 0xffffffff82872fa8 at traverse_visitbp+0x978
#17 0xffffffff82872a51 at traverse_visitbp+0x421
Uptime: 2m42s
(da1:umass-sim0:0:0:0): Synchronize cache failed
Dumping 161 out of 2009
MB:..10%..20%..30%..40%..50%..60%..70%..80%..90%..100%
Dump complete

The server was stable before this. I checked the following:

- none of the usual zpool rescue import options works (-F, -X, etc…)
- memory testing: no errors
- checked both drives for bad sectors: nothing
- tried importing on ZoL v0.7.12: PANIC(), the backtrace is somewhat different

After dd'ing a few TBs, the issue is easily reproduced inside a virtual machine. Both drives seem to have the exact same corruption, so it's not a drive issue (different vendors, one enterprise drive).

Looks like we have two issues here:

- the first is whatever caused the corruption; I'm trying to reproduce it (probably non-ECC memory, though)
- the second is the kernel panic while importing the pool (this bug report)

Did more testing using zdb, within my limited knowledge. The issue is reproducible with zdb:

zdb -AAA -e -ddd zroot/usr/local

Assertion failed: (!BP_IS_EMBEDDED(bp) || BPE_GET_ETYPE(bp) ==
BP_EMBEDDED_TYPE_DATA), file
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c, line 5724.
Assertion failed: ((hdr)->b_lsize << 9) > 0 (0x0 > 0x0), file
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c, line 3340.
Assertion failed: ((hdr)->b_lsize << 9) != 0 (0x0 != 0x0), file
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c, line 2447.
Assertion failed: (bytes > 0), file
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c, line 5032.
Assertion failed: ((hdr)->b_lsize << 9) != 0 (0x0 != 0x0), file
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c, line 2447.
Assertion failed: ((hdr)->b_lsize << 9) != 0 (0x0 != 0x0), file
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c, line 2447.
WARNING: blkptr at 0x80b124840 has invalid COMPRESS 127
WARNING: blkptr at 0x80b124840 has invalid ETYPE 255
Assertion failed: (!BP_IS_EMBEDDED(bp)), file
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c, line 1321.
Assertion failed: (zio->io_error != 0), file
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c, line 660.
Assertion failed: (zio->io_vd != NULL), file
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c, line 3619.

Objects 51200 to 51231 on dataset zroot/usr/local are crashing zdb. Everything else is fine.

Bonus question: is there a way to nuke this dataset to recover recent files?

Core dumps are available if needed. I'm willing to test a few patches since I've reproduced this in a lab.

Thanks for your help.

-- 
You are receiving this mail because:
You are the assignee for the bug.