From: Wiktor Niesiobedzki
Date: Thu, 21 Sep 2017 22:00:08 +0200
Subject: ZVOL with volblocksize=64k+ results in data corruption [was Re: Resolving errors with ZVOL-s]
To: freebsd-fs
List-Id: Filesystems

Hi,

I've conducted additional tests.
It looks like when I create volumes with the following commands:

# zfs create -V50g -o volmode=dev -o volblocksize=64k -o compression=off -o com.sun:auto-snapshot=false tank/test
# zfs create -V50g -o volmode=dev -o volblocksize=128k -o compression=off -o com.sun:auto-snapshot=false tank/test

I'm able to get checksum errors quite reliably within 2-12h of normal work on the volume.

I also tested other volblocksizes:

# zfs create -V50g -o volmode=dev -o volblocksize=32k -o compression=off -o com.sun:auto-snapshot=false tank/test
# zfs create -V50g -o volmode=dev -o volblocksize=8k -o compression=off -o com.sun:auto-snapshot=false tank/test
# zfs create -V50g -o volmode=dev -o volblocksize=4k -o compression=off -o com.sun:auto-snapshot=false tank/test

and gave them more than 24h of work with no apparent errors (I also moved my other volumes to 4k and they have not shown any checksum errors for more than 2 weeks).

I had been running with volblocksize=128k since January this year; the problem only started to appear after I updated from 11.0 to 11.1.

Should I file a bug report for this? What additional information should I gather?

Cheers,

Wiktor Niesiobędzki

PS. I found a way to resolve the errors reported in zpool status. It turns out that they disappear after a scrub, but only if the scrub is run substantially later than the zfs destroy. Maybe some references in the ZIL prevent these errors from being removed earlier? Is this a bug?
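PPS. In case it helps with reproduction: my real workload is bhyve/docker I/O on the volume, but a synthetic exerciser roughly like the sketch below writes through the same zvol device node. It is only a sketch, assuming the 64k tank/test volume created above (volmode=dev exposes it as /dev/zvol/tank/test); the scratch file names, pass count and sizes are arbitrary examples, not my actual workload.

#!/bin/sh
# Rough sketch: write known data through the zvol device node, read it back,
# compare checksums, then scrub the pool and watch the CKSUM counters.

dd if=/dev/random of=/tmp/pattern bs=1m count=1024        # 1 GiB of random data
sha256 -q /tmp/pattern > /tmp/pattern.sum

pass=1
while [ $pass -le 10 ]; do
    off=$(( (pass - 1) * 1024 ))                          # offset in 1m blocks
    dd if=/tmp/pattern of=/dev/zvol/tank/test bs=1m seek=$off
    dd if=/dev/zvol/tank/test of=/tmp/readback bs=1m skip=$off count=1024
    [ "$(sha256 -q /tmp/readback)" = "$(cat /tmp/pattern.sum)" ] \
        || echo "checksum mismatch on pass $pass"
    pass=$((pass + 1))
done

# The read-back may be served from cache, so the scrub is the real check;
# it runs in the background -- watch zpool status until it completes.
zpool scrub tank
zpool status -v tank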
2017-09-04 19:12 GMT+02:00 Wiktor Niesiobedzki:
> Hi,
>
> I can follow up on my issue - the same problem just happened on the second
> ZVOL that I've created:
> # zpool status -v
>   pool: tank
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the
>         entire pool from backup.
>    see: http://illumos.org/msg/ZFS-8000-8A
>   scan: scrub repaired 0 in 5h27m with 0 errors on Sat Sep  2 15:30:59 2017
> config:
>
>         NAME               STATE     READ WRITE CKSUM
>         tank               ONLINE       0     0    14
>           mirror-0         ONLINE       0     0    28
>             gpt/tank1.eli  ONLINE       0     0    28
>             gpt/tank2.eli  ONLINE       0     0    28
>
> errors: Permanent errors have been detected in the following files:
>
>         tank/docker-big:<0x1>
>         <0x5095>:<0x1>
>
> I suspect that these errors might be related to my recent upgrade to 11.1.
> Until 19 August I was running 11.0. I am considering rolling back to 11.0
> right now.
>
> For reference:
> # zfs get all tank/docker-big
> NAME             PROPERTY               VALUE                  SOURCE
> tank/docker-big  type                   volume                 -
> tank/docker-big  creation               Sat Sep  2 10:09 2017  -
> tank/docker-big  used                   100G                   -
> tank/docker-big  available              747G                   -
> tank/docker-big  referenced             10.5G                  -
> tank/docker-big  compressratio          4.58x                  -
> tank/docker-big  reservation            none                   default
> tank/docker-big  volsize                100G                   local
> tank/docker-big  volblocksize           128K                   -
> tank/docker-big  checksum               skein                  inherited from tank
> tank/docker-big  compression            lz4                    inherited from tank
> tank/docker-big  readonly               off                    default
> tank/docker-big  copies                 1                      default
> tank/docker-big  refreservation         100G                   local
> tank/docker-big  primarycache           all                    default
> tank/docker-big  secondarycache         all                    default
> tank/docker-big  usedbysnapshots        0                      -
> tank/docker-big  usedbydataset          10.5G                  -
> tank/docker-big  usedbychildren         0                      -
> tank/docker-big  usedbyrefreservation   89.7G                  -
> tank/docker-big  logbias                latency                default
> tank/docker-big  dedup                  off                    default
> tank/docker-big  mlslabel                                      -
> tank/docker-big  sync                   standard               default
> tank/docker-big  refcompressratio       4.58x                  -
> tank/docker-big  written                10.5G                  -
> tank/docker-big  logicalused            47.8G                  -
> tank/docker-big  logicalreferenced      47.8G                  -
> tank/docker-big  volmode                dev                    local
> tank/docker-big  snapshot_limit         none                   default
> tank/docker-big  snapshot_count         none                   default
> tank/docker-big  redundant_metadata     all                    default
> tank/docker-big  com.sun:auto-snapshot  false                  local
>
> Any ideas what I should try before rolling back?
>
> Cheers,
>
> Wiktor
>
> 2017-09-02 19:17 GMT+02:00 Wiktor Niesiobedzki:
>
>> Hi,
>>
>> I have recently encountered errors on my ZFS pool on 11.1-R:
>> $ uname -a
>> FreeBSD kadlubek 11.1-RELEASE-p1 FreeBSD 11.1-RELEASE-p1 #0: Wed Aug  9
>> 11:55:48 UTC 2017
>> root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64
>>
>> # zpool status -v tank
>>   pool: tank
>>  state: ONLINE
>> status: One or more devices has experienced an error resulting in data
>>         corruption. Applications may be affected.
>> action: Restore the file in question if possible. Otherwise restore the
>>         entire pool from backup.
>>    see: http://illumos.org/msg/ZFS-8000-8A
>>   scan: scrub repaired 0 in 5h27m with 0 errors on Sat Sep  2 15:30:59 2017
>> config:
>>
>>         NAME               STATE     READ WRITE CKSUM
>>         tank               ONLINE       0     0    98
>>           mirror-0         ONLINE       0     0   196
>>             gpt/tank1.eli  ONLINE       0     0   196
>>             gpt/tank2.eli  ONLINE       0     0   196
>>
>> errors: Permanent errors have been detected in the following files:
>>
>>         dkr-test:<0x1>
>>
>> dkr-test is a ZVOL that I use within bhyve, and indeed, within bhyve I have
>> noticed I/O errors on this volume. This ZVOL did not have any snapshots.
>>
>> Following the advice mentioned in the action field, I tried to remove the
>> offending ZVOL:
>> # zfs destroy tank/dkr-test
>>
>> But the errors are still mentioned in zpool status:
>> errors: Permanent errors have been detected in the following files:
>>
>>         <0x5095>:<0x1>
>>
>> I can't find any reference to this dataset in zdb (20629 is 0x5095 in
>> decimal):
>> # zdb -d tank | grep 5095
>> # zdb -d tank | grep 20629
>>
>> I also tried getting statistics about metadata in this pool:
>> # zdb -b tank
>>
>> Traversing all blocks to verify nothing leaked ...
>>
>> loading space map for vdev 0 of 1, metaslab 159 of 174 ...
>> No leaks (block sum matches space maps exactly)
>>
>>         bp count:          24426601
>>         ganged count:             0
>>         bp logical:     1983127334912    avg:  81187
>>         bp physical:    1817897247232    avg:  74422    compression:  1.09
>>         bp allocated:   1820446928896    avg:  74527    compression:  1.09
>>         bp deduped:                 0    ref>1:    0    deduplication:  1.00
>>         SPA allocated:  1820446928896    used: 60.90%
>>
>>         additional, non-pointer bps of type 0:  57981
>>         Dittoed blocks on same vdev: 296490
>>
>> And then zdb got stuck using 100% CPU.
>>
>> And now to my questions:
>>
>> 1. Do I interpret this correctly, that the situation is probably due to an
>> error during a write, and that both copies of the block got checksums
>> mismatching their data? And if it is a hardware problem, it is probably
>> something other than the disks? (No, I don't use ECC RAM.)
>>
>> 2. Is there any way to remove the offending dataset and clean the pool of
>> these errors?
>>
>> 3. Is my metadata OK? Or should I restore the entire pool from backup?
>>
>> 4. I also tried running zdb -bc tank, but this resulted in a kernel panic.
>> I might try to get the stack trace once I get physical access to the machine
>> next week. Also, checksum verification slows the process down from 1000MB/s
>> to less than 1MB/s. Is this expected?
>>
>> 5. When I work with zdb (as above), should I try to limit writes to the
>> pool (e.g. by unmounting the datasets)?
>>
>> Cheers,
>>
>> Wiktor Niesiobedzki
>>
>> PS. Sorry for previous partial message.