From: Borja Marcos
To: Ben RUBSON
Cc: freebsd-fs@freebsd.org
Subject: Re: HAST + ZFS + NFS + CARP
Date: Thu, 11 Aug 2016 11:43:41 +0200
Message-Id: <93B4257C-5EFC-4304-A7F9-5E8BFA7792FC@sarenet.es>
In-Reply-To: <226B5D47-72AF-4325-9A7D-9D6356C4D463@gmail.com>

> On 11 Aug 2016, at 11:39, Ben RUBSON wrote:
>
>> On 11 Aug 2016, at 11:24, Borja Marcos wrote:
>>
>> Although, frankly, ZFS is extremely resilient. One of mine even survived
>> a SAS HBA problem that caused some silent corruption.
>
> Any link to this issue, Borja?
> Thank you!

It wasn’t a FreeBSD or ZFS bug, but a defective part (an HBA). Once in a while
we saw some errors in /var/log/messages, and a ZFS scrub revealed some
corruption that ZFS fixed without issues.

Determining the cause wasn’t easy (at first it looked like a defective
backplane), and IBM, who are no longer welcome here thanks to their totally
fabulous support and warranty policy, didn’t help much. So we took the system
offline, using the replicated server instead, and spent some time running
tests (during which we caused more silent corruption, which ZFS fixed without
problems) to determine that it was indeed the HBA.

Finally we replaced the HBA and the system is back at work. But not a single
bit was lost.

Borja.