From owner-freebsd-questions@freebsd.org  Tue Jul 16 00:22:58 2019
Return-Path: <owner-freebsd-questions@freebsd.org>
Delivered-To: freebsd-questions@mailman.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.nyi.freebsd.org (Postfix) with ESMTP id F19CBC76DD
 for <freebsd-questions@mailman.nyi.freebsd.org>;
 Tue, 16 Jul 2019 00:22:58 +0000 (UTC)
 (envelope-from lee@adminart.net)
Received: from mo6-p00-ob.smtp.rzone.de (mo6-p00-ob.smtp.rzone.de
 [IPv6:2a01:238:20a:202:5300::8])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 server-signature RSA-PSS (4096 bits)
 client-signature RSA-PSS (2048 bits) client-digest SHA256)
 (Client CN "*.smtp.rzone.de",
 Issuer "TeleSec ServerPass Class 2 CA" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 3595387363
 for <freebsd-questions@freebsd.org>; Tue, 16 Jul 2019 00:22:58 +0000 (UTC)
 (envelope-from lee@adminart.net)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; t=1563236575;
 s=strato-dkim-0002; d=adminart.net;
 h=References:Message-ID:Date:In-Reply-To:Subject:Cc:To:From:
 X-RZG-CLASS-ID:X-RZG-AUTH:From:Subject:Sender;
 bh=340JAxiumE8iQoBVf7HDrYCoilV9UzCqeHfIz8Bewzg=;
 b=MkdpLpcZ0Ni5vuabtksW/ApQH6hAwFWXEAiQTSmJXHT639xqMyFF7VKykiXDvzFWp6
 v3JgkVVnUSO2+CRe+53Nuwkj7WOvMzgsHrTTLSYhJipiSwM0fiJWGxixn4d72HKl71JN
 cPbswKB5h2h0QhzhoT4qb8ZLgaB7lMS7CO01DMc+7R0Iu6I+PussfZN9HEwSKnVtL3Lo
 kkEe0sOGJcmwpoFZ07dfRWbO7b0vjwFktSkATG+CtioFN4CR7s9bPzCrh0uv9WE7ISar
 4YYnXkaMVlcT8DGfd4fXbshInLGRyJIkM+lj9mXr+v/5W4Nv32RBssOnutqnmFAjrlZG
 GkLQ==
X-RZG-AUTH: ":O2kGeEG7b/pS1FS4THaxjVF9w0vVgfQ9xGcjwO5WMRo5c+h5ceMqQWZ3yrBp+ARdaXvxIDf7nlw="
X-RZG-CLASS-ID: mo00
Received: from himinbjorg.adminart.net
 by smtp.strato.de (RZmta 44.24 DYNA|AUTH)
 with ESMTPSA id e0059dv6G0MtVwN
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (curve secp521r1 with
 521 ECDH bits, eq. 15360 bits RSA))
 (Client did not present a certificate);
 Tue, 16 Jul 2019 02:22:55 +0200 (CEST)
Received: from toy.adminart.net ([192.168.3.55])
 by himinbjorg.adminart.net with esmtps
 (TLSv1.2:ECDHE-RSA-AES256-GCM-SHA384:256) (Exim 4.92)
 (envelope-from <lee@adminart.net>)
 id 1hnBF8-000139-FL; Tue, 16 Jul 2019 02:22:54 +0200
Received: from lee by toy.adminart.net with local (Exim 4.92)
 (envelope-from <lee@toy.adminart.net>)
 id 1hnBF7-0000bC-TO; Tue, 16 Jul 2019 02:22:54 +0200
From: hw <hw@adminart.net>
To: "Kevin P. Neal" <kpn@neutralgood.org>
Cc: freebsd-questions@freebsd.org
Subject: Re: dead slow update servers
In-Reply-To: <20190715175108.GC31450@neutralgood.org> (Kevin P. Neal's message
 of "Mon, 15 Jul 2019 13:51:08 -0400")
Date: Tue, 16 Jul 2019 02:19:47 +0200
Organization: my virtual residence
Message-ID: <87k1cij0f0.fsf@toy.adminart.net>
References: <CAGLDxTW8zw2d+aBGOmBgEhipjq6ocn536fH_NScMiDD7hD=eSw@mail.gmail.com>
 <874l3qfvqw.fsf@toy.adminart.net>
 <20190714011303.GA25317@neutralgood.org>
 <87v9w58apd.fsf@toy.adminart.net>
 <f7d8acd6-6adb-2b4b-38ef-dc988d7d96a7@denninger.net>
 <87v9w4qjy8.fsf@toy.adminart.net>
 <20190715014129.GA62729@neutralgood.org>
 <87ftn8otem.fsf@toy.adminart.net>
 <20190715151621.GB31450@neutralgood.org>
 <87blxvjn4a.fsf@toy.adminart.net>
 <20190715175108.GC31450@neutralgood.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: User questions <freebsd-questions.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-questions>, 
 <mailto:freebsd-questions-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-questions/>
List-Post: <mailto:freebsd-questions@freebsd.org>
List-Help: <mailto:freebsd-questions-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
 <mailto:freebsd-questions-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Jul 2019 00:22:59 -0000

"Kevin P. Neal" <kpn@neutralgood.org> writes:

> On Mon, Jul 15, 2019 at 06:09:25PM +0200, hw wrote:
>> "Kevin P. Neal" <kpn@neutralgood.org> writes:
>> 
>> > On Mon, Jul 15, 2019 at 05:42:25AM +0200, hw wrote:
>> >> "Kevin P. Neal" <kpn@neutralgood.org> writes:
>> >> > Oh, and my Dell machines are old enough that I'm stuck with the hardware
>> >> > RAID controller. I use ZFS and have raid0 arrays configured with single
>> >> > drives in each. I _hate_ it. When a drive fails the machine reboots and
>> >> > the controller hangs the boot until I drive out there and dump the card's
>> >> > cache. It's just awful.
>> >> 
>> >> That doesn't sound like a good setup.  Usually, nothing reboots when a
>> >> drive fails.
>> >> 
>> >> Would it be a disadvantage to put all drives into a single RAID10 (or
>> >> each half of them into one) and put ZFS on it (or them) if you want to
>> >> keep ZFS?
>> >
>> > Well, it still leaves me with the overhead of dealing with creating arrays
>> > in the hardware.
>> 
>> Didn't you need to create the RAID0s having a single disk, too?
>
> Yes, and that wouldn't change if I took your suggestion. Which is why I
> phrased it as "it still leaves" which says that the problem already exists
> and would continue to exist.

So it's no more work than before, and maybe even less, because you have
only half the number of logical drives, or fewer.

>> > And it costs me loss of the scrubbing/verification of the end-to-end
>> > checksumming. So I'm less safe there with no less work.
>> 
>> If you're worried about the controller giving results that lead to the
>> correct check sums and data ending up on the disk not matching these
>> check sums when the controller reads it later, what difference does it
>> make which kind of RAID you use?  You can always run a scrub to verify
>> the check sums, and if errors are being found, you may need to replace
>> the controller.
>
> ZFS can correct checksum errors so long as the array is still valid. Is
> there a hardware RAID card that does that?

I'm not sure.  Some controllers do what they call surface checking in the
background.  It's the job of the controller to make sure the data on the
disk is fine, and I have no reason to assume that it doesn't do that.

>> > Maybe I should just go ahead and change it. I've got a drive about to
>> > fail on me. It's a three way mirror so I'm not worried about it. It would
>> > be, uh, _nice_ if it didn't bring down the machine, though.
>> 
>> If you were using two or more disks each in a RAID1 or RAID10 to create
>> one disk exposed to ZFS, you wouldn't have a problem when one disk
>> becomes unresponsive.  If there's someone around who is used to quickly
>
> True, but if I run a scrub I want to verify _all_ the data on the disks,
> not just the data exposed by the RAID controller card.

Why?  Data that is never exposed is irrelevant for this and doesn't need
to be verified.

The logical volumes you're using now already expose only what the
controller card chooses to expose, so what difference does it make how
many disks a volume is made of?  You don't get any more or less data
exposed, because the controller does its thing regardless.

Besides, how do you propose to get all data exposed?  Do you have
special firmware that exposes all data, both on your disks and on
everything else that is involved?

> That means ZFS needs to see _all_ the disks. Otherwise I would be
> vulnerable to loss of data due to checksum errors that may only be
> seen after a drive dies. At that point it's too late to correct.

I don't understand how it would make a difference whether ZFS can see
all physical disks or not.  One way or another, there is merely a
storage device ZFS is using and can see, and if the device is
erroneous, it shouldn't matter what the device is physically made of.

When a disk fails and is replaced, the hardware RAID is being rebuilt,
and when ZFS detects checksum errors on that storage device, why
shouldn't it be able to correct the errors like with any other storage
device?  It's not even the storage device that failed, only a disk.

What if you had hybrid disks consisting of some flash memory and
magnetic disks?  You would have to conclude that they can never be used
with ZFS because they could introduce hidden checksum errors.  What
about the cache of your disks, do you turn it off because hidden
checksum errors could be introduced when there is a mismatch between the
data on the disk and what is being exposed by the cache?  IIRC, one of
the advantages of ZFS is that you don't need to disable the cache.

RAID controllers and such aren't designed to create hidden checksum
errors ZFS is unable to correct.  ZFS was designed to detect errors and
to correct them, using checksums.  There's no need to be afraid of RAID
controllers and such.
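
To make "detect and correct using checksums" concrete, here is a toy
sketch in Python.  It is not ZFS code and not how ZFS is implemented.
The names (Block, write_block, read_with_repair) and the choice of
SHA-256 are made up for illustration.  The only idea it shows is that a
checksum kept apart from the data detects a bad copy, and that a copy
which still matches the checksum can be used to rewrite the bad one.
How many copies the checker gets to see is the point under discussion
in this thread.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


def checksum(data: bytes) -> str:
    """Hypothetical stand-in for a filesystem block checksum."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class Block:
    expected: str            # checksum, stored apart from the data
    copies: List[bytearray]  # one entry per copy the checker can see


def write_block(data: bytes, n_copies: int) -> Block:
    """Store n_copies identical copies plus one checksum."""
    return Block(checksum(data), [bytearray(data) for _ in range(n_copies)])


def read_with_repair(block: Block) -> Optional[bytes]:
    """Return verified data, rewriting any copy that fails the checksum.

    Returns None if no visible copy matches the stored checksum, in
    which case the error is detected but cannot be corrected here.
    """
    good = None
    for copy in block.copies:
        if checksum(bytes(copy)) == block.expected:
            good = bytes(copy)
            break
    if good is None:
        return None
    for copy in block.copies:
        if checksum(bytes(copy)) != block.expected:
            copy[:] = good  # overwrite the damaged copy in place
    return good


# Two visible copies; silently damage the first byte of one of them.
mirror = write_block(b"payload", 2)
mirror.copies[0][0] ^= 0xFF
data = read_with_repair(mirror)
```

After the call, data holds the original bytes again and the damaged
copy has been rewritten from the intact one.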

>> Hardware RAID does have advantages, so why not use them when you're
>> stuck with it anyway?
>
> Because the downsides outweigh the upside.

How so?  The whole setup seems very questionable, and the machine goes
out of service right away when a disk becomes unresponsive.  One of the
purposes of RAID is to prevent just that.