From: Aristedes Maniatis
Date: Thu, 11 Sep 2014 10:45:40 +1000
To: Stefan Esser, freebsd-stable
Subject: Re: getting to 4K disk blocks in ZFS
Message-ID: <5410F0B4.9040808@ish.com.au>
In-Reply-To: <54100258.2000505@freebsd.org>

Thanks Stefan and Peter for the highly informative posts.

On 10/09/2014 5:48pm, Stefan Esser wrote:
> ZFS uses variable block sizes by breaking down large blocks to smaller
> fragments as suitable for the data to be stored. The largest block to
> be used is configurable (128 KByte by default) and the smallest fragment
> is the sector size (i.e. 512 or 4096 bytes), as configured by "ashift".

So this means that the ZFS developers would need to effectively (re)fragment the entire pool if they wanted to develop a way to increase the ashift size. This sounds like something that isn't going to be solved in the near future (less than three years) if it is a similar technical problem to inserting another disk into an existing vdev.

And that means that as it becomes harder to buy older 512 byte disks, everyone with a ZFS pool is going to be stuck with managing quite a lot of downtime as they upgrade. And even more pain if they boot off that pool.

On 10/09/2014 4:51pm, Peter Wemm wrote:
> For what its worth, in the freebsd.org cluster we automatically align
> everything to a minimum of 4k, no matter what the actual drive is.
>
> We set: sysctl vfs.zfs.min_auto_ashift=12
> (this saves a lot of messing around with gnop etc)
>
> and ensure all the gpt slices are 4k or better aligned.

Should the FreeBSD project change this minimum in the next release? There seems to be no downside and a huge amount of pain for people who stumble along with the defaults not knowing what a mess they are creating to solve later.

Cheers
Ari

-- 
-------------------------->
Aristedes Maniatis
ish
http://www.ish.com.au
Level 1, 30 Wilson Street Newtown 2042 Australia
phone +61 2 9550 5001   fax +61 2 9550 4001
GPG fingerprint CBFB 84B4 738D 4E87 5E5C 5EFA EF6A 7D2E 3E49 102A
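
For readers following the thread, the arithmetic behind "ashift" and 4k alignment discussed above can be sketched roughly as follows. This is only an illustration, not ZFS code: the function names are made up for this example, and it assumes that ashift is the base-2 logarithm of the sector size (512 bytes gives 9, 4096 bytes gives 12) and that vfs.zfs.min_auto_ashift acts as a lower bound on the ashift chosen for newly created vdevs.

```python
# Illustrative sketch only; these helpers are hypothetical and not part of ZFS.

def ashift_for_sector_size(sector_bytes: int) -> int:
    """Return log2 of a power-of-two sector size (512 -> 9, 4096 -> 12)."""
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError("sector size must be a positive power of two")
    return sector_bytes.bit_length() - 1


def effective_ashift(sector_bytes: int, min_auto_ashift: int = 12) -> int:
    """Assumed behaviour: the minimum setting is a floor on the reported value."""
    return max(ashift_for_sector_size(sector_bytes), min_auto_ashift)


def is_aligned(offset_bytes: int, ashift: int = 12) -> bool:
    """True if a partition's starting byte offset is a multiple of 2**ashift."""
    return offset_bytes % (1 << ashift) == 0


if __name__ == "__main__":
    print(ashift_for_sector_size(512))    # 512-byte sectors
    print(ashift_for_sector_size(4096))   # 4K sectors
    print(effective_ashift(512))          # 512-byte disk with the minimum set to 12
    print(is_aligned(34 * 512))           # partition starting at LBA 34
    print(is_aligned(40 * 512))           # partition starting at LBA 40
```

Under these assumptions, a disk reporting 512-byte sectors would still get ashift 12 when the minimum is 12, and a partition starting at sector 34 is not 4k aligned while one starting at sector 40 is.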