From owner-freebsd-bugs@freebsd.org  Thu Nov 16 06:04:25 2017
Return-Path: <owner-freebsd-bugs@freebsd.org>
Delivered-To: freebsd-bugs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id EA6ABDBE7A8
 for <freebsd-bugs@mailman.ysv.freebsd.org>;
 Thu, 16 Nov 2017 06:04:25 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org
 [IPv6:2001:1900:2254:206a::16:76])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id D95CE6E20B
 for <freebsd-bugs@FreeBSD.org>; Thu, 16 Nov 2017 06:04:25 +0000 (UTC)
 (envelope-from bugzilla-noreply@freebsd.org)
Received: from bugs.freebsd.org ([127.0.1.118])
 by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id vAG64PZS010628
 for <freebsd-bugs@FreeBSD.org>; Thu, 16 Nov 2017 06:04:25 GMT
 (envelope-from bugzilla-noreply@freebsd.org)
From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 223699] ZFS drive loss during write operation causes kernel panic
Date: Thu, 16 Nov 2017 06:04:25 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 11.1-RELEASE
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: abrahamd@cat.pdx.edu
X-Bugzilla-Status: New
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: freebsd-bugs@FreeBSD.org
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: bug_id short_desc product version rep_platform
 op_sys bug_status bug_severity priority component assigned_to reporter
Message-ID: <bug-223699-8@https.bugs.freebsd.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=223699

            Bug ID: 223699
           Summary: ZFS drive loss during write operation causes kernel
                    panic
           Product: Base System
           Version: 11.1-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: abrahamd@cat.pdx.edu

Environment:
OS: FreeBSD 11.1-RELEASE-p4
Board: Supermicro X10DRH-i
Manufacturer: Silicon Mechanics
RAID Controller: LSI 9341-8i HBA

Description:
The server has an attached storage zpool consisting of 8 disks. (It is
separate from the root pool, which is on a different controller.) The storage
pool is configured as RAIDZ2. zpool status reports the pool as healthy before
the crash. During routine ZFS setup testing (pulling a disk to verify pool
integrity) while a write to the storage pool is in progress, the kernel
panics. The panic is consistently repeatable.

How-to-repeat:
Boot the server with the storage zpool attached. Begin a write operation to
the storage zpool (we use 'yes > file'). Pull a disk from the storage zpool to
simulate drive loss. A kernel panic follows. (We repeated this multiple times
in succession while diagnosing the issue.)

Trace follows:
flows01# kgdb kernel.debug /var/crash/vmcore.0
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
mfi0: I/O error, cmd=0xfffffe000148d760, status=0xc, scsi_status=0
mfi0: sense error 0, sense_key 0, asc 0, ascq 0
mfisyspd0: hard error cmd=write 927680-927765


Fatal trap 12: page fault while in kernel mode
cpuid = 11; apic id = 0b
fault virtual address   = 0x8
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff809b9f74
stack pointer           = 0x28:0xfffffe0f84318930
frame pointer           = 0x28:0xfffffe0f84318970
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (irq264: mfi0)
trap number             = 12
panic: page fault
cpuid = 11
KDB: stack backtrace:
#0 0xffffffff80aadac7 at kdb_backtrace+0x67
#1 0xffffffff80a6bba6 at vpanic+0x186
#2 0xffffffff80a6ba13 at panic+0x43
#3 0xffffffff80edf832 at trap_fatal+0x322
#4 0xffffffff80edf889 at trap_pfault+0x49
#5 0xffffffff80edf0c6 at trap+0x286
#6 0xffffffff80ec36d1 at calltrap+0x8
#7 0xffffffff80620f2c at mfi_tbolt_complete_cmd+0x13c
#8 0xffffffff80620d94 at mfi_intr_tbolt+0x54
#9 0xffffffff80a321ec at intr_event_execute_handlers+0xec
#10 0xffffffff80a324d6 at ithread_loop+0xd6
#11 0xffffffff80a2f845 at fork_exit+0x85
#12 0xffffffff80ec3c0e at fork_trampoline+0xe
Uptime: 1m1s
Dumping 2498 out of 65230 MB:mfi0: cmd_tbolt 0xfffff8000fa0f880 has invalid
sync_cmd_idx=128 - skipping
..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

Reading symbols from /boot/kernel/zfs.ko...Reading symbols from
/usr/lib/debug//boot/kernel/zfs.ko.debug...done.
done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from
/usr/lib/debug//boot/kernel/opensolaris.ko.debug...done.
done.
Loaded symbols for /boot/kernel/opensolaris.ko
Reading symbols from /boot/kernel/ums.ko...Reading symbols from
/usr/lib/debug//boot/kernel/ums.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ums.ko
#0  0xffffffff80a6b98a in doadump (textdump=<value optimized out>) at
/usr/src/sys/kern/kern_shutdown.c:311
311             dumping--;
(kgdb) list *0xffffffff809b9f74
0xffffffff809b9f74 is in g_disk_done (/usr/src/sys/geom/geom_disk.c:252).
247             default:
248                     break;
249             }
250             bp2->bio_inbed++;
251             if (bp2->bio_children == bp2->bio_inbed) {
252                     mtx_unlock(&sc->done_mtx);
253                     bp2->bio_resid = bp2->bio_bcount - bp2->bio_completed;
254                     g_io_deliver(bp2, bp2->bio_error);
255             } else
256                     mtx_unlock(&sc->done_mtx);
Current language:  auto; currently minimal
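
Note on the fault address: a fault virtual address of 0x8 on a supervisor
read is what one would typically expect when code reads a structure member
through a NULL pointer, because the address touched is the pointer value plus
the member's offset. The sketch below only illustrates that arithmetic. The
type struct example_obj and both helper functions are hypothetical; they are
not taken from geom_disk.c or the mfi driver, and the trace above does not by
itself establish which pointer was NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for illustration only; NOT a real kernel structure. */
struct example_obj {
	uint64_t first;		/* offset 0 */
	uint64_t second;	/* offset 8 */
};

/* Compile-time check of the layout assumption used in this sketch. */
static_assert(offsetof(struct example_obj, second) == 8,
    "second is expected at offset 8");

/*
 * Address the CPU would touch when reading a member located `offset`
 * bytes into a structure whose base pointer has the value `base`.
 */
static uintptr_t
member_address(uintptr_t base, size_t offset)
{
	return (base + offset);
}

/* Address touched when reading `second` through a NULL base pointer. */
static uintptr_t
null_deref_address(void)
{
	return (member_address((uintptr_t)0,
	    offsetof(struct example_obj, second)));
}
```

Under these assumptions, null_deref_address() evaluates to 0x8, matching the
reported fault address. Inspecting the pointers in the g_disk_done frame from
kgdb would be needed to tell which one was actually NULL.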
