Date:      Mon, 02 Jun 2014 08:02:31 -0700
From:      Nathan Whitehorn <nwhitehorn@freebsd.org>
To:        Matthew Ahrens <mahrens@delphix.com>
Cc:        freebsd-fs <freebsd-fs@freebsd.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject:   Re: fdisk(8) vs gpart(8), and gnop
Message-ID:  <538C9207.9040806@freebsd.org>
In-Reply-To: <CAJjvXiFAX7N-30g0OZ6idqLnyJww5dsyhGfLj6nYwKs9Xp--1g@mail.gmail.com>
References:  <20140601004242.GA97224@bewilderbeast.blackhelicopters.org> <CAOjFWZ5N9FGwgSz0_YFNQjavzdJDitRn52VKn4ipW1ddj6-weQ@mail.gmail.com> <BCA9F5D6-3925-4E7E-9082-128652508305@FreeBSD.org> <3D6974D83AE9495E890D9F3CA654FA94@multiplay.co.uk> <538B4CEF.2030801@freebsd.org> <1DB2D63312CE439A96B23EAADFA9436E@multiplay.co.uk> <538B4FD7.4090000@freebsd.org> <CAJjvXiFAX7N-30g0OZ6idqLnyJww5dsyhGfLj6nYwKs9Xp--1g@mail.gmail.com>

On 06/01/14 14:27, Matthew Ahrens wrote:
>
>>> I think you will get some objections to that, as it can have quite an
>>> impact on the performance of disks with 512-byte sectors, due to the
>>> increased overhead of transferring 4k when only 512 is really required.
>>> This has an even more dramatic impact on RAIDZx.
>>>
>>> Personally, we run a custom kernel on our machines which has just this
>>> change in it to ensure compatibility with future disks, so I can confirm
>>> it does indeed have the desired effect :)
>>>
>> So the discussion here is related to what to do about the installer. The
>> current ZFS component unconditionally creates gnops all over the place to
>> set ashift to 4k. That's across the board worse: it has exactly the
>> performance impact of changing the default of this sysctl (whatever that
>> is), it can't easily be overridden (which the sysctl can), and it's a
>> horrible hack to boot. There are a few options:
>>
>> 1. Change the default of vfs.zfs.min_auto_ashift
>>
> This is probably a bad idea -- as others have mentioned, it can drastically
> impact space usage and performance on 512B disks, especially when using
> small ZFS blocks (e.g. for databases or VDI) and/or RAID-Z.  That said, it
> could be a reasonable default for specialized distros that are not used for
> these workloads (maybe FreeNAS or PCBSD?).
>
>> 2. Have the same effect but in a vastly worse way by adjusting the
>> installer to create gnops
>> 3. Have ZFS choose by itself and decide to do that permanently.
>>
> If the device reports a 512B sector size, it would be great for ZFS to
> assume the device could be lying, and automatically determine the minimum
> ashift which gives good performance.  I think this could be done reasonably
> well for the common case by doing the following when each 512B-sector
> device is added:
>
> 1. do random 4KB writes to the disk to determine wIOPS@4K
> 2. do random 3.5KB writes to the disk to determine wIOPS@3.5K
>
> If wIOPS@4K > wIOPS@3.5K, assume 4KB sectors, otherwise assume 512B
> sectors.  (Note: I haven't tried this in practice; we will need to test it
> out and perhaps make some tweaks.)
>
> I don't have the time or hardware to implement and test this, but I'd be
> happy to mentor or code review.
>
> --matt
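
The decision rule Matthew proposes above can be sketched in a few lines of
Python. This is only an illustration of the comparison step: the function
name guess_ashift is hypothetical, the two write-IOPS figures are assumed to
have been measured elsewhere (measuring them means doing destructive random
writes to the device), and, as Matthew notes, the heuristic itself is
untested.

```python
def guess_ashift(wiops_4k: float, wiops_3_5k: float) -> int:
    """Pick an ashift for a device that reports 512-byte sectors.

    wiops_4k   -- measured random-write IOPS using 4 KB writes
    wiops_3_5k -- measured random-write IOPS using 3.5 KB writes

    Returns 12 (assume 4 KB sectors) if 4 KB writes are faster than
    3.5 KB writes, otherwise 9 (assume 512-byte sectors).
    """
    if wiops_4k <= 0 or wiops_3_5k <= 0:
        raise ValueError("IOPS measurements must be positive")
    # A 3.5 KB write is not a multiple of 4 KB, so a disk with 4 KB physical
    # sectors is expected to handle it more slowly than an aligned 4 KB write.
    return 12 if wiops_4k > wiops_3_5k else 9


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    print(guess_ashift(wiops_4k=180.0, wiops_3_5k=95.0))
    print(guess_ashift(wiops_4k=200.0, wiops_3_5k=210.0))
```

A real implementation would probably need a noise margin rather than a strict
comparison, which is the kind of tweak the proposal leaves open.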

I think we basically don't have any lying disks anymore. The ATA code
does a very good job of handling this -- most disks tell the truth, but
in an odd way, and that gets reported up the stack. ada(4) has a quirks
table for the ones that do not. If this is the only concern, then we
should just stop telling people to worry about it.

My bigger concern is the pool upgrade case -- what if someone adds a
4K disk to the pool in the future?
-Nathan
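
To make the pool upgrade concern concrete, here is a minimal Python sketch of
the arithmetic involved. ashift is the base-2 logarithm of the sector size,
so 512-byte sectors give 9 and 4 KB sectors give 12. The helper names
ashift_for and can_join are hypothetical, and the compatibility rule encoded
in can_join (a disk cannot join a vdev whose ashift is smaller than the one
the disk needs) is stated here as an assumption about ZFS behaviour, not
quoted from its source.

```python
def ashift_for(sector_size: int) -> int:
    """Return log2(sector_size) for a power-of-two sector size >= 512."""
    if sector_size < 512 or sector_size & (sector_size - 1):
        raise ValueError("sector size must be a power of two >= 512")
    return sector_size.bit_length() - 1


def can_join(vdev_ashift: int, disk_sector_size: int) -> bool:
    """Assumed rule: the disk's required ashift must not exceed the vdev's."""
    return ashift_for(disk_sector_size) <= vdev_ashift


if __name__ == "__main__":
    old_vdev = ashift_for(512)        # vdev created on 512-byte disks
    print(old_vdev)                   # 9
    print(can_join(old_vdev, 4096))   # False: a 4K disk does not fit
    print(can_join(12, 512))          # True: a 512-byte disk fits an ashift=12 vdev
```

Under that assumption, a vdev created with ashift=9 cannot later take a
4K-native replacement disk, while one created with ashift=12 accepts either.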


