From: "Eugene M. Zheganin"
Date: Thu, 29 Jun 2017 17:04:16 +0500
Subject: Re: redundant zfs pool, system traps and tonns of corrupted files
To: freebsd-stable@freebsd.org
Message-ID: <3c4044c5-9016-80ce-1302-2546c76f0dd4@norma.perm.ru>

Hi,

On 29.06.2017 16:37, Eugene M. Zheganin wrote:
> Hi.
>
>
> Say I have a server that traps more and more often (different
> panics: zfs panics, GPFs, fatal traps while in kernel mode etc), and
> then I realize it has tons of permanent errors on all of its pools
> that scrub is unable to heal. Does this situation mean it's a bad
> memory case? Unfortunately I switched the hardware to an identical
> server before discovering that the zpools have errors, so I'm not sure
> when they appeared. Right now I'm about to run a memtest on the old
> hardware.
>
>
> So, what do you say - does it point at the memory as the root problem?
>

I also don't quite understand how there can be errors at the vdev level but 0 errors on the underlying devices (could someone please explain this):

  pool: esx
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: resilvered 3,74G in 0h5m with 0 errors on Tue Dec 27 05:14:32 2016
config:

        NAME        STATE     READ WRITE CKSUM
        esx         ONLINE       0     0 99,0K
          raidz1-0  ONLINE       0     0  113K
            da0     ONLINE       0     0     0
            da1     ONLINE       0     0     0
            da2     ONLINE       0     0     2
            da3     ONLINE       0     0     0
            da5     ONLINE       0     0     0
          raidz1-1  ONLINE       0     0 84,7K
            da12    ONLINE       0     0     0
            da13    ONLINE       0     0     1
            da14    ONLINE       0     0     0
            da15    ONLINE       0     0     0
            da16    ONLINE       0     0     0

errors: 25 data errors, use '-v' for a list

  pool: gamestop
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub in progress since Thu Jun 29 12:30:21 2017
        1,67T scanned out of 4,58T at 1002M/s, 0h50m to go
        0 repaired, 36,44% done
config:

        NAME        STATE     READ WRITE CKSUM
        gamestop    ONLINE       0     0     1
          raidz1-0  ONLINE       0     0     2
            da6     ONLINE       0     0     0
            da7     ONLINE       0     0     0
            da8     ONLINE       0     0     0
            da9     ONLINE       0     0     0
            da11    ONLINE       0     0     0

errors: 10 data errors, use '-v' for a list

P.S. This is FreeBSD 11.1-BETA2 r320056M (the M stands for CTL_MAX_PORTS = 1024), with ECC memory.

Thanks.
Eugene.
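
To put numbers on the vdev-versus-disk discrepancy, here is a minimal sketch that sums the per-disk CKSUM counters under each raidz vdev and sets the sum beside the vdev's own counter. The helper names parse_count and cksum_gaps are hypothetical, and the parsing assumes the column layout, the "K" suffix and the comma decimal separator seen in the esx pool output; suffixed counters are already rounded by zpool, so the figures are approximate.

```python
import re

# Multipliers for the abbreviated counters (assumed decimal, as displayed).
SUFFIXES = {"": 1, "K": 10**3, "M": 10**6, "G": 10**9}


def parse_count(text):
    """Convert a counter such as '2', '113K' or '84,7K' to an integer."""
    m = re.fullmatch(r"(\d+(?:[.,]\d+)?)([KMG]?)", text)
    if m is None:
        raise ValueError(f"unrecognised counter: {text!r}")
    number = float(m.group(1).replace(",", "."))
    return round(number * SUFFIXES[m.group(2)])


def cksum_gaps(config_lines):
    """Return (vdev, vdev_cksum, sum_of_disk_cksum) for each raidz vdev."""
    results = []
    current = None
    for line in config_lines:
        fields = line.split()
        # Only rows of the form NAME STATE READ WRITE CKSUM are of interest.
        if len(fields) != 5 or fields[0] == "NAME":
            continue
        name, cksum = fields[0], parse_count(fields[4])
        if name.startswith("raidz"):
            current = [name, cksum, 0]
            results.append(current)
        elif current is not None and name.startswith("da"):
            current[2] += cksum
    return [tuple(r) for r in results]


esx_config = """
NAME        STATE     READ WRITE CKSUM
esx         ONLINE       0     0 99,0K
  raidz1-0  ONLINE       0     0  113K
    da0     ONLINE       0     0     0
    da1     ONLINE       0     0     0
    da2     ONLINE       0     0     2
    da3     ONLINE       0     0     0
    da5     ONLINE       0     0     0
  raidz1-1  ONLINE       0     0 84,7K
    da12    ONLINE       0     0     0
    da13    ONLINE       0     0     1
    da14    ONLINE       0     0     0
    da15    ONLINE       0     0     0
    da16    ONLINE       0     0     0
""".splitlines()

for vdev, vdev_cksum, disk_cksum in cksum_gaps(esx_config):
    print(f"{vdev}: vdev CKSUM={vdev_cksum}, sum over disks={disk_cksum}")
```

The same function can be fed the config section of the gamestop pool; only rows whose name starts with "raidz" or "da" are considered, so the pool-level row is ignored.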