From: "Lawrence K. Chen, P.Eng."
Organization: Kansas State University - ITS/Enterprise Server Technologies
To: freebsd-fs@FreeBSD.org
Date: Thu, 28 Aug 2014 13:07:56 -0500
Subject: Resilver ZIL & Sequential Resilvering?
Message-ID: <53FF6FFC.2090407@ksu.edu>

As I'm suffering through the process of watching ZFS resilver, I wondered whether there are any updates to ZFS coming down that would help in these areas.

The first is resilvering a ZIL:

  scan: resilver in progress since Thu Aug 28 08:35:38 2014
        996G scanned out of 1.15T at 75.7M/s, 0h41m to go
        0 resilvered, 84.33% done
  -----------
  scan: resilver in progress since Thu Aug 28 08:37:12 2014
        3.80T scanned out of 4.92T at 298M/s, 1h5m to go
        0 resilvered, 77.24% done
  -----------
  scan: resilver in progress since Thu Aug 28 08:37:49 2014
        7.83T scanned out of 9.41T at 616M/s, 0h44m to go
        0 resilvered, 83.23% done

I replaced one of the SSDs in my mirrored ZIL this morning. Why does it have to scan all of the storage in the main pool to resilver the ZIL? For that matter, why does it have to resilver the ZIL at all? Granted, it goes a whole lot faster than replacing a disk in one of the pools (which I recently did, twice, for the top pool... when that started, the estimate was 400+ hours, though in reality it took about 60 hours).

Which brings me to the second item... sequential resilvering. We were way behind on updates for our 7420, and I happened to spot this listed as a new feature. The gist is that all the random I/O, especially at the beginning, is what makes resilvering slow, so this enhancement splits the resilver into two steps: the first step scans all the blocks that need to be copied and sorts them into LBA order, so the actual copying can then proceed more or less sequentially.

In the meantime, more slow resilvering is on my horizon. Since I replaced one SSD with a larger SSD, I'm going to want to replace the other one too. Plus I'm thinking of migrating to a new root pool, to see if that will rid me of the "ZFS: i/o error - all block copies unavailable" messages during boot, and to make some layout changes. I also need to see about getting the first pool to expand to its new size... and someday I'll need to figure something out for the other two pools (there's one drive about to go in the second pool...).
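For the record, the swap itself was just the usual zpool replace against the log vdev, and as I understand it, autoexpand plus online -e is what will eventually let the first pool grow into the bigger drives. Something along these lines (pool and device names are just placeholders, not my actual labels):

  # swap the old log SSD for the new one in the mirrored ZIL
  zpool replace tank gpt/slog0 gpt/slog0-new
  zpool status tank            # watch the resilver progress

  # once every disk in a vdev has been replaced with a larger one,
  # let the pool expand into the new space
  zpool set autoexpand=on tank
  zpool online -e tank gpt/disk0 gpt/disk1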
All the hard drives are 512-byte-sector drives, but only the first pool (a pair of ST31500341AS drives) was made with ashift=12. So, when one drive reported imminent failure with the sudden relocation of 3000+ sectors, which grew to 4000+ the next day, it got replaced the day after that... on a Sunday. But it wasn't that big a problem, as I had a pair of 3TB WD Reds that were supposed to go somewhere else.

The second pool is a raidz of similar 1.5TB drives, and the third pool is a raidz of Hitachi 2TB drives... so far it's only been the 1.5TB drives that keep leaving me. I suspect at the time I expected the drives in the first pool to fail and that I would be upgrading to bigger 4K drives, while the other two pools... well, the pool of 1.5TB drives was meant to be temporary, only it isn't anymore... plus I still have a few spares left over from other failed arrays. And the 2TB pool was probably left at the default ashift because I knew it was going to hold lots of tiny files, etc.

The overhead of 4K versus 512 is pretty huge. I originally created my /poudriere space on the second pool, where a ports directory was about 450MB. After moving things over to the first pool, the same directories are now about 840MB. I guess things could've been worse... I had noticed similar behavior when I changed the page size of an sqlite file from 1k to 4k. (Quick ashift cheat-sheet below my sig, for anyone curious.)

-- 
Who: Lawrence K. Chen, P.Eng. - W0LKC - Sr. Unix Systems Administrator
For: Enterprise Server Technologies (EST) -- & SafeZone Ally
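P.S. For anyone wanting to check which pools ended up with which ashift, something along these lines should do it. Pool and device names are again just examples, and the sysctl only exists on reasonably recent FreeBSD:

  # show the ashift recorded for each top-level vdev
  zdb -C tank | grep ashift

  # on newer FreeBSD, force at least 4K alignment for newly created vdevs
  sysctl vfs.zfs.min_auto_ashift=12

  # the old gnop trick for creating an ashift=12 pool on 512-byte drives
  gnop create -S 4096 /dev/ada0
  zpool create tank mirror ada0.nop ada1
  zpool export tank
  gnop destroy ada0.nop
  zpool import tank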