From: Nathan Whitehorn
Date: Sun, 01 Jun 2014 09:07:51 -0700
To: Steven Hartland, freebsd-hackers@freebsd.org, freebsd-fs@freebsd.org
Subject: Re: fdisk(8) vs gpart(8), and gnop
Message-ID: <538B4FD7.4090000@freebsd.org>
In-Reply-To: <1DB2D63312CE439A96B23EAADFA9436E@multiplay.co.uk>
List-Id: Technical Discussions relating to FreeBSD

On 06/01/14 09:00, Steven Hartland wrote:
>
> ----- Original Message ----- From: "Nathan Whitehorn"
> Sent: Sunday, June 01, 2014 4:55 PM
> Subject: Re: fdisk(8) vs gpart(8), and gnop
>
>
>> On 06/01/14 08:52, Steven Hartland wrote:
>>> ----- Original Message ----- From: "Mark Felder"
>>>
>>>> On May 31, 2014, at 20:57, Freddie Cash wrote:
>>>>
>>>>> There's a sysctl where you can set the minimum ashift for zfs.
>>>>> Then you never need to use gnop.
>>>>>
>>>>> I believe it's part of 10.0?
>>>>
>>>> I've not seen this yet. What we need is to port the ability to set
>>>> ashift at pool creation time:
>>>>
>>>> $ zpool create -o ashift=12 tank mirror disk1 disk2 mirror disk3 disk4
>>>>
>>>> I believe the Linux zfs port has this functionality now, but we
>>>> still do not.
>>>
>>> We don't have that direct option yet but you can achieve the
>>> same thing by setting: vfs.zfs.min_auto_ashift=12
>>>
>> Does anyone have any objections to me changing this default, right
>> now, today?
>> -Nathan
>
> I think you will get some objections to that, as it can have quite an
> impact on performance for 512-byte-sector disks, due to the increased
> overhead of transferring 4k when only 512 bytes are really required.
> This has a more dramatic impact on RAIDZx too.
>
> Personally we run a custom kernel on our machines which has just this
> change in it to ensure compatibility with future disks, so I can confirm
> it does indeed have the desired effect :)

So the discussion here is about what to do with the installer. Its
current ZFS component unconditionally creates gnops all over the place
to force ashift to 12 (4k sectors). That's worse across the board: it
has exactly the performance impact of changing the default of this
sysctl (whatever that is), it can't easily be overridden (which the
sysctl can), and it's a horrible hack to boot.

There are a few options:

1. Change the default of vfs.zfs.min_auto_ashift
2. Have the same effect but in a vastly worse way by adjusting the
   installer to create gnops
3. Have ZFS choose by itself and decide to do that permanently.

Our ATA code is good about reporting block sizes now, so (3) isn't a big
issue except for the mixed-pool case, which is a huge PITA.

We need to choose one of these. I favor (1).
-Nathan
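
For reference, a minimal sketch of the arithmetic behind option (1). It assumes that ashift is the base-2 logarithm of the sector size and that vfs.zfs.min_auto_ashift acts as a floor on the value chosen for a new vdev. The helper names ashift_for and effective_ashift are hypothetical, made up for illustration; this is not ZFS's actual code.

```sh
#!/bin/sh
# Illustrative only: ashift_for and effective_ashift are hypothetical helpers.

# ashift_for SIZE: print log2 of a power-of-two sector size in bytes.
ashift_for() {
    size=$1
    shift_val=0
    while [ "$size" -gt 1 ]; do
        size=$((size / 2))
        shift_val=$((shift_val + 1))
    done
    echo "$shift_val"
}

# effective_ashift SIZE MIN: assume the minimum acts as a floor on the
# ashift derived from the sector size the device reports.
effective_ashift() {
    dev=$(ashift_for "$1")
    min=$2
    if [ "$dev" -lt "$min" ]; then
        echo "$min"
    else
        echo "$dev"
    fi
}

# On a FreeBSD system that has the sysctl named in this thread, the floor
# would be raised before creating a pool with:
#   sysctl vfs.zfs.min_auto_ashift=12
```

Under these assumptions, a disk reporting 512-byte sectors gets ashift 12 once the floor is 12, which is the same result the gnop trick produces, while a disk reporting 4096-byte sectors gets ashift 12 whatever the floor is.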