From: Warren Block
To: Gezeala M. Bacuño II
Cc: FreeBSD Filesystems
Date: Wed, 17 Jul 2013 11:35:19 -0600 (MDT)
Subject: Re: Slow resilvering with mirrored ZIL

On Wed, 17 Jul 2013, Gezeala M. Bacuño II wrote:

> If ZFS goes on a bare drive, it will be aligned by default. If ZFS is
> going in a partition, yes, align that partition to 4K boundaries or
> larger multiples of 4K, like 1M.
>
> Your statement is enlightening and concise, exactly what I need.
> Thanks.
>
> The gnop/ashift workaround is just to get ZFS to use the right block
> size. So if you don't take care to get partition alignment right, you
> might end up using the right block size but misaligned.
>
> And yes, it will be nice to be able to just explicitly tell ZFS the
> block size to use.
>
> We do add the entire drive (no partitions) to ZFS, perform gnop/ashift
> and other necessary steps and then verify ashift=12 through zdb.
>
> The gpart/gnop/ashift steps, if I understand correctly (do correct me
> if I'm stating this incorrectly), is needed for further SSD
> performance tuning. Taking into consideration leaving a certain chunk
> for wear leveling and also if the SSD has a size that may be too big
> for L2ARC.

Well, there are several things going on.

Partitions can be used for a couple of things. One is limiting the
space available to ZFS, leaving an unallocated part of the drive for
wear leveling. Note that ZFS on FreeBSD now has TRIM, which should make
leaving unused space on SSDs unnecessary.

Aligning partitions preserves performance. If a partition is
misaligned, writes can slow down to half speed. For example, a 4K
filesystem block written to an aligned partition writes a single disk
block. If the partition is misaligned, that 4K write is split over two
disk blocks. Each block has to be read, partly modified, then written,
taking roughly twice as long.

Finally, ZFS's ashift controls the minimum size of block ZFS uses.
ashift=12 (12 bits) sets that to 4K blocks (2^12=4096). Again, a
performance thing, matching the filesystem block size to the device
block size.

It would be interesting to see a benchmark of ZFS on a 4K drive with
different ashift values.
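
To put numbers on the alignment and ashift points above, here is a
minimal Python sketch. It is only arithmetic, not anything ZFS or gpart
actually runs. The function names are made up for illustration, and the
63-sector partition start is just an assumed example of a misaligned
layout.

```python
def block_size(ashift):
    """Minimum block size in bytes implied by an ashift value (2**ashift)."""
    return 1 << ashift


def is_aligned(partition_offset, sector_size=4096):
    """True if the partition starts on a physical sector boundary."""
    return partition_offset % sector_size == 0


def sectors_touched(partition_offset, write_offset, length, sector_size=4096):
    """Count the physical sectors covered by one write.

    All values are in bytes. partition_offset is where the partition
    starts on the disk; write_offset is relative to the partition start.
    """
    start = partition_offset + write_offset
    end = start + length - 1
    return end // sector_size - start // sector_size + 1


# Assumed example offsets: a 1M-aligned start and a 63 x 512-byte start.
ALIGNED_START = 1024 * 1024
MISALIGNED_START = 63 * 512

if __name__ == "__main__":
    print("ashift=12 block size:", block_size(12))
    for name, start in (("aligned", ALIGNED_START),
                        ("misaligned", MISALIGNED_START)):
        print(name, is_aligned(start),
              sectors_touched(start, 0, block_size(12)))
```

With the aligned start, one 4K write covers exactly one 4K disk block.
With the misaligned start, the same write straddles two, which is the
read-modify-write case described above.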