From owner-freebsd-current@freebsd.org  Wed Mar 21 09:29:01 2018
Return-Path: <owner-freebsd-current@freebsd.org>
Delivered-To: freebsd-current@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 50C77F57685
 for <freebsd-current@mailman.ysv.freebsd.org>;
 Wed, 21 Mar 2018 09:29:01 +0000 (UTC)
 (envelope-from fbsd-lists@dudes.ch)
Received: from mail.dudes.ch (mail.dudes.ch [193.73.211.25])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "*.dudes.ch", Issuer "StartCom Class 3 OV Server CA" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id D52B58578D
 for <freebsd-current@freebsd.org>; Wed, 21 Mar 2018 09:29:00 +0000 (UTC)
 (envelope-from fbsd-lists@dudes.ch)
Received: from mwoffice.virtualtec.office (pippin.virtualtec.ch
 [93.189.66.120]) (authenticated bits=0)
 by mail.dudes.ch (8.15.2/8.15.2) with ESMTPSA id w2LASjvY047816
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO)
 for <freebsd-current@freebsd.org>; Wed, 21 Mar 2018 11:28:45 +0100 (CET)
 (envelope-from fbsd-lists@dudes.ch)
X-Authentication-Warning: mail.dudes.ch: Host pippin.virtualtec.ch
 [93.189.66.120] claimed to be mwoffice.virtualtec.office
Date: Wed, 21 Mar 2018 10:28:48 +0100
From: Markus Wild <fbsd-lists@dudes.ch>
To: freebsd-current@freebsd.org
Subject: Re: ZFS i/o error in recent 12.0
Message-ID: <20180321102848.20a9f48a@mwoffice.virtualtec.office>
In-Reply-To: <f0331ee0-b013-d517-3714-a60ab357b913@gibfest.dk>
References: <201803192300.w2JN04fx007127@kx.openedu.org>
 <20180320085028.0b15ff40@mwoffice.virtualtec.office>
 <f0331ee0-b013-d517-3714-a60ab357b913@gibfest.dk>
X-Mailer: Claws Mail 3.16.0 (GTK+ 2.24.31; amd64-portbld-freebsd11.1)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.78 on 193.73.211.25
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.25
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
 <freebsd-current.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-current>, 
 <mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current/>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
 <mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 21 Mar 2018 09:29:01 -0000

Hello Thomas,

> > I had faced the exact same issue on a HP Microserver G8 with 8TB disks and a 16TB zpool on FreeBSD 11 about a year
> > ago.  
> I will ask you the same question as I asked the OP:
> 
> Has this pool had new vdevs added to it since the server was installed?

No. This is a microserver with only 4 trays (not even hot-plug). It was originally set up using the FreeBSD
installer. I had to apply the btx loader fix (a separate patch at the time; I don't know whether it is included by
default now) that retries a failed read, to get around BIOS bugs on that server, but after that the server booted
fine. It was only after a bit of use and a kernel update that things went south. I tried many different things at
the time, but the only approach that worked for me was to steal 2 of the 4 swap partitions, which I had initially
placed on every disk, and build a mirrored boot zpool from those. The loader had no problem loading the kernel from
that pool, and once the kernel took over, it had no problem using the original root pool (which the boot loader
hadn't been able to find/load). Hence my conclusion that the 2nd stage boot loader has a problem (probably due to
yet another BIOS bug on that server) loading blocks beyond a certain limit, which could be 2TB or 4TB.
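
To illustrate where a 2TB boundary could come from: if a sector number is truncated to 32 bits somewhere in the
BIOS read path and sectors are 512 bytes, nothing at or beyond 2 TiB can be addressed. This is only a sketch of
that arithmetic. The 32-bit width and the 512-byte sector size are assumptions, not something I verified on that
server, and the function names are made up for the example.

```python
# Illustration only: both constants are assumptions, not measured values.
SECTOR_SIZE = 512   # bytes per logical sector (assumed)
LBA_BITS = 32       # width of a truncated sector number (assumed)

TIB = 1 << 40


def addressable_limit(sector_size=SECTOR_SIZE, lba_bits=LBA_BITS):
    """Return the first byte offset that can no longer be addressed."""
    return sector_size * (1 << lba_bits)


def readable(byte_offset, length, limit=None):
    """True if the whole range [byte_offset, byte_offset + length) is below the limit."""
    if limit is None:
        limit = addressable_limit()
    return byte_offset + length <= limit


if __name__ == "__main__":
    print(addressable_limit() // TIB)       # limit in TiB
    print(readable(1 * TIB, 128 * 1024))    # a block 1 TiB into the disk
    print(readable(3 * TIB, 128 * 1024))    # a block 3 TiB into the disk
```

With these assumptions, a kernel written below the boundary would load, while one rewritten further out on an
8TB disk would not, which would match what I saw after the update.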

> What does a "zpool status" look like when the pool is imported?

$ zpool status
  pool: zboot
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Wed Mar 21 03:58:36 2018
config:

        NAME               STATE     READ WRITE CKSUM
        zboot              ONLINE       0     0     0
          mirror-0         ONLINE       0     0     0
            gpt/zfs-boot0  ONLINE       0     0     0
            gpt/zfs-boot1  ONLINE       0     0     0

errors: No known data errors

  pool: zroot
 state: ONLINE
  scan: scrub repaired 0 in 6h49m with 0 errors on Sat Mar 10 10:17:49 2018
config:

        NAME          STATE     READ WRITE CKSUM
        zroot         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            gpt/zfs0  ONLINE       0     0     0
            gpt/zfs1  ONLINE       0     0     0
          mirror-1    ONLINE       0     0     0
            gpt/zfs2  ONLINE       0     0     0
            gpt/zfs3  ONLINE       0     0     0

errors: No known data errors

Please note: this server is now in use at a customer site, and it's working fine with this workaround. I just
brought it up to offer a possible explanation for the problem observed by the original poster: it _might_ have
nothing to do with a newer version of the current kernel, but rather be due to the updated kernel being written to
a new location on disk that the boot loader can't read properly.

Cheers,
Markus