From: Mike Clarke
To: FreeBSD questions
Subject: Confused by zfs errors
Date: Tue, 25 Jun 2019 16:53:56 +0100

I'm using zfs on FreeBSD 12.0-RELEASE-p4 GENERIC amd64 on a desktop system which is shut down each night and rebooted each morning. My daily periodic scripts are reporting some filesystem errors which I am unable to fix and which are somewhat confusing.

/etc/periodic/security/100.chksetuid is reporting:

---------------------------------------------------
Checking setuid files and devices:
find: /home/liz/Maildir/cur/1342434798.M711754P2579.curlew.lan,S=82312,W=83431:2,S: Unknown error: 122
find: /home/mike/Maildir/cur/1354984767.M156539P5390.curlew.lan,S=217133,W=220003:2,RS: Unknown error: 122
find: /home/mike/Maildir/cur/1387550678.M716573P2948.curlew.lan,S=99139,W=101030:2,S: Unknown error: 122
find: /home/mike/mp3/tapes/Wind Music of Holst & Vaughan Williams: Unknown error: 122
---------------------------------------------------

And /etc/periodic/daily/404.status-zfs is reporting:

---------------------------------------------------
Checking status of zfs pools:
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
ssd     119G  48.1G  70.9G        -         -    32%    40%  1.00x  ONLINE  -
sys     460G   314G   146G        -         -    48%    68%  1.00x  ONLINE  -

  pool: sys
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 0 days 01:28:12 with 0 errors on Wed Jun  5 11:01:47 2019
config:

        NAME          STATE     READ WRITE CKSUM
        sys           ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            gpt/sys2  ONLINE       0     0     0
            gpt/sys1  ONLINE       0     0     0

errors: 1 data errors, use '-v' for a list
---------------------------------------------------

I've checked the memory with Memtest86, which reports some errors in the hammer test. I imagine these could be the cause of the filesystem corruption, and I'm in the process of raising a ticket to replace the memory modules under warranty, but in the meantime I need to try to fix the errors in the filesystem.

The first problem is that I can't fix the "Unknown error: 122" message for the mp3 directory and the 3 mail files: if I try to delete them or copy my backup copies over them, I just get another 122 error.

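For anyone wanting to reproduce the list of affected paths outside the periodic run, here is a minimal Python sketch. It walks a tree and collects every path whose lstat fails with a given error number. The function name paths_failing_with and its injectable stat argument are my own illustration, not part of any FreeBSD tool, and 122 is simply the number that find printed above.

```python
import os


def paths_failing_with(root, target_errno=122, stat=os.lstat):
    """Return paths under root whose stat call fails with target_errno.

    Hypothetical helper: the name and the injectable `stat` argument are
    illustrative only. 122 is the number find(1) printed in the report.
    """
    failing = []

    def on_walk_error(err):
        # os.walk calls this when it cannot list a directory.
        if err.errno == target_errno:
            failing.append(err.filename)

    for dirpath, dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                stat(path)
            except OSError as err:
                # Record only the error number we are looking for.
                if err.errno == target_errno:
                    failing.append(path)
    # A directory can be reported twice (stat and listing), so deduplicate.
    return sorted(set(failing))
```

Calling paths_failing_with("/home") should then give one list that can be compared against the backup, assuming the same error number is returned to Python as to find.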
When I run zpool status -v sys I get the following:

---------------------------------------------------
  pool: sys
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 0 days 01:28:12 with 0 errors on Wed Jun  5 11:01:47 2019
config:

        NAME          STATE     READ WRITE CKSUM
        sys           ONLINE       0     0    15
          mirror-0    ONLINE       0     0    60
            gpt/sys2  ONLINE       0     0    60
            gpt/sys1  ONLINE       0     0    60

errors: Permanent errors have been detected in the following files:

        sys/DATA/home:<0x0>
---------------------------------------------------

Can I resolve the sys/DATA/home:<0x0> issue without destroying the entire pool and restoring from backup? Yes, I do have a full backup which is free from these errors, but I'd prefer to avoid deleting everything unless I really have to.

The zpool status output above, run from the command line, reports CKSUM errors, whereas the periodic script reports all zeros. I've checked this over a number of days: the script always reports zeros, while checks from the command line always give a number of CKSUM errors which varies (up and down) from day to day.

I also see that if I run zpool status without the -v option as a normal user it reports "errors: 2 data errors, use '-v' for a list", but when I run it as root it only reports 1 data error.

The errors first occurred before I ran zpool scrub on June 5, but the scrub was not able to repair them.

-- 
Mike Clarke
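
To keep a record of how the counters differ between the periodic report and the command line, something like the following Python sketch could be used. It only parses the text of the config table as shown above; the function names parse_error_counters and changed_counters are hypothetical, and the parser assumes every device row has the five columns NAME, STATE, READ, WRITE and CKSUM.

```python
def parse_error_counters(status_text):
    """Map each device name in the config table to (READ, WRITE, CKSUM).

    Hypothetical helper. Counters are kept as strings exactly as printed.
    Parsing stops at the first line after the header with fewer than
    five fields, which is the blank line that ends the table.
    """
    counters = {}
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if not in_table:
            # The table starts right after its header row.
            in_table = fields == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]
            continue
        if len(fields) < 5:
            break
        name, _state, read, write, cksum = fields[:5]
        counters[name] = (read, write, cksum)
    return counters


def changed_counters(before, after):
    """Return {device: (old, new)} for devices whose counters differ."""
    return {
        name: (before.get(name), after[name])
        for name in after
        if before.get(name) != after[name]
    }
```

The status text itself could be captured with subprocess.run(["zpool", "status", "sys"], capture_output=True, text=True).stdout, once from the periodic environment and once from an interactive shell, and the two results passed to changed_counters.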