From: Wiktor Niesiobedzki
Date: Thu, 21 Sep 2017 22:00:08 +0200
Subject: ZVOL with volblocksize=64k+ results in data corruption [was Re: Resolving errors with ZVOL-s]
To: freebsd-fs
List-Id: Filesystems

Hi,

I've conducted additional tests.
It looks like when I create volumes with the following commands:

# zfs create -V50g -o volmode=dev -o volblocksize=64k -o compression=off -o com.sun:auto-snapshot=false tank/test
# zfs create -V50g -o volmode=dev -o volblocksize=128k -o compression=off -o com.sun:auto-snapshot=false tank/test

I'm able to get checksum errors quite reliably within 2-12h of normal work on the volume.

I also tested other volblocksizes:

# zfs create -V50g -o volmode=dev -o volblocksize=32k -o compression=off -o com.sun:auto-snapshot=false tank/test
# zfs create -V50g -o volmode=dev -o volblocksize=8k -o compression=off -o com.sun:auto-snapshot=false tank/test
# zfs create -V50g -o volmode=dev -o volblocksize=4k -o compression=off -o com.sun:auto-snapshot=false tank/test

and gave them more than 24h of work with no apparent errors (I also moved my other volumes to 4k and they have not shown any checksum errors for more than 2 weeks).

I had been running with volblocksize=128k since January this year; the problem only started to appear after I updated from 11.0 to 11.1.

Should I file a bug report for this? What additional information should I gather?

Cheers,

Wiktor Niesiobędzki

PS. I found a way to resolve the errors reported in zpool status. It turns out that they disappear after a scrub, but only if the scrub is run substantially later than the zfs destroy. Maybe some references in the ZIL prevent these errors from being removed earlier? Is this a bug?
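PPS. In case it helps with reproduction: my real workload is bhyve/docker I/O on the volume, but a synthetic exerciser roughly like the sketch below writes through the same zvol device node. It is only a sketch, assuming the 64k tank/test volume created above (volmode=dev exposes it as /dev/zvol/tank/test); the scratch file names, pass count and sizes are arbitrary examples, not my actual workload.

#!/bin/sh
# Rough sketch: write known data through the zvol device node, read it back,
# compare checksums, then scrub the pool and watch the CKSUM counters.

dd if=/dev/random of=/tmp/pattern bs=1m count=1024        # 1 GiB of random data
sha256 -q /tmp/pattern > /tmp/pattern.sum

pass=1
while [ $pass -le 10 ]; do
    off=$(( (pass - 1) * 1024 ))                          # offset in 1m blocks
    dd if=/tmp/pattern of=/dev/zvol/tank/test bs=1m seek=$off
    dd if=/dev/zvol/tank/test of=/tmp/readback bs=1m skip=$off count=1024
    [ "$(sha256 -q /tmp/readback)" = "$(cat /tmp/pattern.sum)" ] \
        || echo "checksum mismatch on pass $pass"
    pass=$((pass + 1))
done

# The read-back may be served from cache, so the scrub is the real check;
# it runs in the background -- watch zpool status until it completes.
zpool scrub tank
zpool status -v tank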
2017-09-04 19:12 GMT+02:00 Wiktor Niesiobedzki:
> Hi,
>
> I can follow up on my issue - the same problem just happened on the second
> ZVOL that I've created:
> # zpool status -v
>   pool: tank
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the
>         entire pool from backup.
>    see: http://illumos.org/msg/ZFS-8000-8A
>   scan: scrub repaired 0 in 5h27m with 0 errors on Sat Sep  2 15:30:59 2017
> config:
>
>         NAME               STATE     READ WRITE CKSUM
>         tank               ONLINE       0     0    14
>           mirror-0         ONLINE       0     0    28
>             gpt/tank1.eli  ONLINE       0     0    28
>             gpt/tank2.eli  ONLINE       0     0    28
>
> errors: Permanent errors have been detected in the following files:
>
>         tank/docker-big:<0x1>
>         <0x5095>:<0x1>
>
> I suspect that these errors might be related to my recent upgrade to 11.1.
> Until 19 August I was running 11.0. I am considering rolling back to 11.0
> right now.
>
> For reference:
> # zfs get all tank/docker-big
> NAME             PROPERTY               VALUE                  SOURCE
> tank/docker-big  type                   volume                 -
> tank/docker-big  creation               Sat Sep  2 10:09 2017  -
> tank/docker-big  used                   100G                   -
> tank/docker-big  available              747G                   -
> tank/docker-big  referenced             10.5G                  -
> tank/docker-big  compressratio          4.58x                  -
> tank/docker-big  reservation            none                   default
> tank/docker-big  volsize                100G                   local
> tank/docker-big  volblocksize           128K                   -
> tank/docker-big  checksum               skein                  inherited from tank
> tank/docker-big  compression            lz4                    inherited from tank
> tank/docker-big  readonly               off                    default
> tank/docker-big  copies                 1                      default
> tank/docker-big  refreservation         100G                   local
> tank/docker-big  primarycache           all                    default
> tank/docker-big  secondarycache         all                    default
> tank/docker-big  usedbysnapshots        0                      -
> tank/docker-big  usedbydataset          10.5G                  -
> tank/docker-big  usedbychildren         0                      -
> tank/docker-big  usedbyrefreservation   89.7G                  -
> tank/docker-big  logbias                latency                default
> tank/docker-big  dedup                  off                    default
> tank/docker-big  mlslabel                                      -
> tank/docker-big  sync                   standard               default
> tank/docker-big  refcompressratio       4.58x                  -
> tank/docker-big  written                10.5G                  -
> tank/docker-big  logicalused            47.8G                  -
> tank/docker-big  logicalreferenced      47.8G                  -
> tank/docker-big  volmode                dev                    local
> tank/docker-big  snapshot_limit         none                   default
> tank/docker-big  snapshot_count         none                   default
> tank/docker-big  redundant_metadata     all                    default
> tank/docker-big  com.sun:auto-snapshot  false                  local
>
> Any ideas what I should try before rolling back?
>
> Cheers,
>
> Wiktor
>
> 2017-09-02 19:17 GMT+02:00 Wiktor Niesiobedzki:
>
>> Hi,
>>
>> I have recently encountered errors on my ZFS pool on 11.1-R:
>> $ uname -a
>> FreeBSD kadlubek 11.1-RELEASE-p1 FreeBSD 11.1-RELEASE-p1 #0: Wed Aug  9
>> 11:55:48 UTC 2017
>> root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64
>>
>> # zpool status -v tank
>>   pool: tank
>>  state: ONLINE
>> status: One or more devices has experienced an error resulting in data
>>         corruption. Applications may be affected.
>> action: Restore the file in question if possible. Otherwise restore the
>>         entire pool from backup.
>>    see: http://illumos.org/msg/ZFS-8000-8A
>>   scan: scrub repaired 0 in 5h27m with 0 errors on Sat Sep  2 15:30:59 2017
>> config:
>>
>>         NAME               STATE     READ WRITE CKSUM
>>         tank               ONLINE       0     0    98
>>           mirror-0         ONLINE       0     0   196
>>             gpt/tank1.eli  ONLINE       0     0   196
>>             gpt/tank2.eli  ONLINE       0     0   196
>>
>> errors: Permanent errors have been detected in the following files:
>>
>>         dkr-test:<0x1>
>>
>> dkr-test is a ZVOL that I use within bhyve, and indeed, within bhyve I have
>> noticed I/O errors on this volume. This ZVOL did not have any snapshots.
>>
>> Following the advice mentioned in the action field, I tried to remove the
>> offending ZVOL:
>> # zfs destroy tank/dkr-test
>>
>> But the errors are still mentioned in zpool status:
>> errors: Permanent errors have been detected in the following files:
>>
>>         <0x5095>:<0x1>
>>
>> I can't find any reference to this dataset in zdb (20629 is 0x5095 in
>> decimal):
>> # zdb -d tank | grep 5095
>> # zdb -d tank | grep 20629
>>
>> I also tried getting statistics about metadata in this pool:
>> # zdb -b tank
>>
>> Traversing all blocks to verify nothing leaked ...
>>
>> loading space map for vdev 0 of 1, metaslab 159 of 174 ...
>> No leaks (block sum matches space maps exactly)
>>
>>         bp count:          24426601
>>         ganged count:             0
>>         bp logical:     1983127334912    avg:  81187
>>         bp physical:    1817897247232    avg:  74422    compression:  1.09
>>         bp allocated:   1820446928896    avg:  74527    compression:  1.09
>>         bp deduped:                 0    ref>1:    0    deduplication:  1.00
>>         SPA allocated:  1820446928896    used: 60.90%
>>
>>         additional, non-pointer bps of type 0:  57981
>>         Dittoed blocks on same vdev: 296490
>>
>> And then zdb got stuck using 100% CPU.
>>
>> And now to my questions:
>>
>> 1. Do I interpret this correctly, that the situation is probably due to an
>> error during a write, and that both copies of the block got checksums
>> mismatching their data? And if it is a hardware problem, it is probably
>> something other than the disks? (No, I don't use ECC RAM.)
>>
>> 2. Is there any way to remove the offending dataset and clean the pool of
>> these errors?
>>
>> 3. Is my metadata OK? Or should I restore the entire pool from backup?
>>
>> 4. I also tried running zdb -bc tank, but this resulted in a kernel panic.
>> I might try to get the stack trace once I get physical access to the machine
>> next week. Also, checksum verification slows the process down from 1000MB/s
>> to less than 1MB/s. Is this expected?
>>
>> 5. When I work with zdb (as above), should I try to limit writes to the
>> pool (e.g. by unmounting the datasets)?
>>
>> Cheers,
>>
>> Wiktor Niesiobedzki
>>
>> PS. Sorry for previous partial message.