From: Bob Friesenhahn
Date: Mon, 9 Nov 2015 13:25:53 -0600 (CST)
To: Tim Gustafson
cc: freebsd-fs@freebsd.org
Subject: Re: ZFS RAID 0+1 Throwing Checksum Errors

On Mon, 9 Nov 2015, Tim Gustafson wrote:
>
> I'm wondering if the problem is that the scrub is calculating the
> checksum for the data on gpt/zfs0, and while that's happening, some
> data is updated by Apache or MySQL, and then checksum for the data on
> gpt/zfs1 is calculated, which now doesn't match, and therefore the
> scrub is reporting an error.  Is that possible?

This is not possible.  ZFS uses Copy On Write (COW), so existing data
blocks and metadata are never overwritten in place.  Data is always
written to unused free space.  The writes are done as part of a
transaction group, and scrub will not see new data until the
transaction group is completed.

> If that's not it, could this be a bug?  Or should I be worried about
> my SSDs?  What additional data would be helpful for me to share to
> diagnose this?

It could be the SSDs, the controller, cables, or power supply.  The
problem might occur when the data is written, or when it is read back.
If this is occurring on all of the SSDs then look for some shared
component which might be causing the problem.

Bob
-- 
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
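
The copy-on-write point in the reply above can be illustrated with a
toy model.  This is only a sketch and not how ZFS is implemented: the
class ToyCowPool and its methods (write, commit, read, scrub, corrupt)
are hypothetical names invented for this example, and SHA-256 stands
in for whatever checksum a real pool uses.  It shows that a write in
an open transaction group is invisible to a scrub of committed data,
so a concurrent update cannot by itself produce a checksum mismatch,
while damage to a stored block does.

```python
import hashlib


class ToyCowPool:
    """Toy copy-on-write store (hypothetical, not ZFS internals)."""

    def __init__(self):
        self.blocks = {}      # block address -> (data, checksum); never overwritten by write()
        self.committed = {}   # name -> block address, the view readers and scrub see
        self.pending = {}     # name -> block address, the open transaction group
        self.next_addr = 0

    def write(self, name, data):
        # New data always goes to a fresh address; the old block is untouched.
        addr = self.next_addr
        self.next_addr += 1
        self.blocks[addr] = (data, hashlib.sha256(data).hexdigest())
        self.pending[name] = addr
        return addr

    def commit(self):
        # Publish the whole transaction group at once.
        self.committed = {**self.committed, **self.pending}
        self.pending = {}

    def read(self, name):
        return self.blocks[self.committed[name]][0]

    def scrub(self):
        # Verify only committed blocks; return the number of checksum errors.
        errors = 0
        for addr in self.committed.values():
            data, checksum = self.blocks[addr]
            if hashlib.sha256(data).hexdigest() != checksum:
                errors += 1
        return errors

    def corrupt(self, addr, garbage):
        # Simulate a hardware fault: stored data changes, checksum does not.
        _, checksum = self.blocks[addr]
        self.blocks[addr] = (garbage, checksum)


pool = ToyCowPool()
first = pool.write("page", b"old contents")
pool.commit()

pool.write("page", b"new contents")      # open transaction group, not yet visible
errors_during_write = pool.scrub()       # scrub still sees the old, intact block
seen_before_commit = pool.read("page")

pool.commit()
seen_after_commit = pool.read("page")

pool.corrupt(pool.committed["page"], b"bit rot")
errors_after_fault = pool.scrub()
```

In this model the only way to get a non-zero scrub count is for a
stored block to stop matching its recorded checksum, which corresponds
to the hardware causes listed in the reply rather than to application
writes racing with the scrub.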