From owner-freebsd-scsi@FreeBSD.ORG Thu Jul 10 10:38:58 2014
Return-Path:
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
    (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
    (No client certificate requested)
    by hub.freebsd.org (Postfix) with ESMTPS id 2FEADDBC
    for ; Thu, 10 Jul 2014 10:38:58 +0000 (UTC)
Received: from gw.zefyris.com (sabik.zefyris.com [IPv6:2001:7a8:3c67:2::254])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (Client did not present a certificate)
    by mx1.freebsd.org (Postfix) with ESMTPS id B558B2B6A
    for ; Thu, 10 Jul 2014 10:38:57 +0000 (UTC)
Received: from sekishi.zefyris.com (sekishi.zefyris.com [IPv6:2001:7a8:3c67:2::12])
    by gw.zefyris.com (8.14.5/8.14.5) with ESMTP id s6AAcreR010577;
    Thu, 10 Jul 2014 12:38:53 +0200 (CEST)
Date: Thu, 10 Jul 2014 12:38:53 +0200
From: Francois Tigeot
To: Steven Hartland
Subject: Re: Data corruption with the mfi(4) driver
Message-ID: <20140710103853.GC1206@sekishi.zefyris.com>
References: <20140710092251.GA1206@sekishi.zefyris.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.23 (2014-03-12)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3
    (gw.zefyris.com [IPv6:2001:7a8:3c67:2::254]);
    Thu, 10 Jul 2014 12:38:53 +0200 (CEST)
Cc: FreeBSD-scsi
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 10 Jul 2014 10:38:58 -0000

On Thu, Jul 10, 2014 at 11:20:38AM +0100, Steven Hartland wrote:
> I cant see any information on the actual corruption or cause in that linked
> thread do you have any actual details?
>
> There was known corruption issues but these where fixed long ago so would
> be good to confirm the details of what you where running and the HW when you
> had the issue.
>
> As a point of reference we have mfi backed DB machines here and have not
> had any issues with corruption and they have been in production for over
> 1 1/2 years.

It is only visible with recent adapters like the Thunderbolt series, and
then only under relatively high disk load.

The whole Dell Rx20 generation of servers seems to be impacted; the previous
Rx10 generation is safe.

This bug report contains additional details as well as PCI IDs from two
different Dell machines that experienced filesystem destruction:
http://bugs.dragonflybsd.org/issues/2683

HAMMER CRC32 errors were reported on the console and the kernel eventually
crashed after some time; I didn't get crash dumps.

-- 
Francois Tigeot