Date: Mon, 21 Jan 2013 12:12:45 +0100 (CET)
From: Wojciech Puchar <wojtek@wojtek.tensor.gdynia.pl>
To: Zaphod Beeblebrox
Cc: freebsd-fs, FreeBSD Hackers
Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again.

> Please don't misinterpret this post: ZFS's ability to recover from fairly
> catastrophic failures is pretty stellar, but I'm wondering if there can be

From my testing it is exactly the opposite. You have to see the difference
between marketing and reality.

> a little room for improvement.
>
> I use RAID pretty much everywhere. I don't like to lose data and disks
> are cheap. I have a fair amount of experience with all flavors ... and ZFS

Just like me. And because I want performance and, as you described, disks
are cheap, I use RAID-1 (gmirror).

> has become a go-to filesystem for most of my applications.

My applications don't tolerate low performance, overcomplexity, or a high
risk of data loss. That's why I use properly tuned UFS and gmirror, and
prefer multiple filesystems over gstripe.

> One of the best recommendations I can give for ZFS is its
> crash-recoverability.

Which is marketing, not truth. If you want bullet-proof recoverability, UFS
beats everything I've ever seen. If you want FAST crash recovery, use
softupdates+journal, available in FreeBSD 9.

> As a counterexample, if you have most hardware RAID setups or a software
> whole-disk RAID going, after a crash it will generally declare one disk as
> good and the other disk as "to be repaired" ... after which a full surface
> scan of the affected disks (reading one and writing the other) ensues.

True. gmirror does this too, but you can defer the mirror rebuild, which is
what I do. I have a script that sends me a mail when gmirror is degraded,
and then, after finding the cause of the problem and possibly replacing the
disk, I run the rebuild after work hours, so no slowdown is experienced.
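For illustration, a minimal sketch of such a watchdog script, suitable for
cron; the mirror name gm0 and the recipient root are assumptions, not my
actual setup:

  #!/bin/sh
  # Mail a warning when the gmirror array is no longer COMPLETE.
  MIRROR=gm0                       # adjust to your mirror name
  if gmirror status "$MIRROR" | grep -q DEGRADED; then
      gmirror status "$MIRROR" | mail -s "gmirror $MIRROR is DEGRADED" root
  fi

The deferred rebuild itself can be done by turning off autosynchronization
(gmirror configure -n gm0) and later running gmirror rebuild by hand.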
> ZFS is smart on this point: it will recover on reboot with a minimum amount
> of fuss. Even if you dislodge a drive ... so that it's missing the last
> 'n' transactions, ZFS seems to figure this out (which I thought was extra
> kudos).

Yes, this is marketing. Practice is somewhat different, as you discovered
yourself.

> MY PROBLEM comes from problems that scrub can fix.
>
> Let's talk, specifically, about my home array. It has 9x 1.5T and 8x 2T in
> a RAID-Z configuration (2 sets, obviously).

While RAID-Z is already the king of bad performance, I assume you mean two
POOLS, not two RAID-Z sets. If you mixed two different RAID-Z sets you would
spread the load unevenly and make performance even worse.

> A full scrub of my drives weighs in at 36 hours or so.

Which is funny, as ZFS is marketed as doing this efficiently (e.g. checking
only used space). dd if=/dev/disk of=/dev/null bs=2m would take no more than
a few hours per disk, and you may do them all in parallel (see the sketch at
the end of this message).

> vr2/cvs:<0x1c1>
>
> Now ... this is just an example: after each scrub, the hex number was

Seems like scrub simply does not do its work right.

> before the old error was cleared. Then this new error gets similarly
> cleared by the next scrub. It seems that if the scrub returned to this
> newly found error after fixing the "known" errors, this could save whole
> new scrub runs from being required.

Even better: use UFS, for both bullet-proof recoverability and performance.
If you need help with tuning you may ask me privately.
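For reference, the parallel raw read mentioned above, as a minimal sh
sketch; the device names are examples only, substitute your own disks:

  #!/bin/sh
  # Read every disk end to end, in parallel, to time a raw surface read.
  for d in ada0 ada1 ada2 ada3; do        # example device names
      dd if=/dev/$d of=/dev/null bs=2m &
  done
  wait                                    # wait for all dd processes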