Date: Fri, 11 Sep 2015 10:16:46 -0400
From: "Chad J. Milios" <milios@ccsys.com>
To: "William A. Mahaffey III" <wam@hiwaay.net>
Cc: FreeBSD Questions !!!! <freebsd-questions@freebsd.org>
Subject: Re: followup storage question
Message-ID: <3B589E85-4C75-4021-9B37-E022BC33AFA4@ccsys.com>
In-Reply-To: <55F2D086.6060509@hiwaay.net>
References: <55F2D086.6060509@hiwaay.net>
> On Sep 11, 2015, at 8:59 AM, William A. Mahaffey III <wam@hiwaay.net> wrote:
>
> The Wiki page https://wiki.freebsd.org/RootOnZFS/GPTZFSBoot/9.0-RELEASE illustrates using gnop to enforce 4K alignment of gpt partitions for subsequent use by ZFS. However the gpart commands also use the '-a 4k' arguments, aligning partitions on 4k boundaries as I understand things. Is the gnop command also necessary ? TIA & have a nice weekend.
>
> --
> William A. Mahaffey III

Yes, it is necessary to handle both facets of the same underlying issue separately. Those facets are the partition's alignment on the outer device and the block size that the partition's device node reports to ZFS.

The latter can be done a different way: later versions of FreeBSD have a sysctl, vfs.zfs.min_auto_ashift, which you can set to 12 for 4096-byte blocks or 9 for the default 512 bytes. (The ashift value is the exponent of 2 that gives the number of bytes in a block.)

The old gnop way still works just fine, so I still use that method, personally. This definitely only has to be done when vdev(s) are added/created/replaced* on the pool, not on every mount/import. By then ZFS listens to the formatting metadata it stamped on the vdev instead of what the ioctls of the device node say, and so will always write larger and correctly aligned blocks. (I'm not sure whether that holds true without gnop every time in the reverse direction, which is not a typical use, and I know min_auto_ashift won't help there. That is the case where for some reason you intend gnop to simulate smaller blocks to ZFS than the device node's larger blocks, say because you wanted to allow a certain amount of write amplification in exchange for more efficiently storing lots of small files/directories/metadata. In that case you may need to enable the gnop every time. I'm not sure because I don't run any pools that may need it, but I know you can if you want for that reason, space overhead. It'd take some testing and actual measurement for me to confidently decide gnop can be skipped after the vdev initialization if going in that opposite direction was your goal. Maybe someone chimes in here to let us know for sure. At any rate, gnop is by its nature just about the fastest and lightest geom class under the sun and I believe you can keep running thousands of instances busily in production and see no noticeable overhead.)

*Yes, mind the gnop or sysctl for ashift whenever replacing as well. It's a vdev property, not copied as part of the data resilvering; ZFS decides it for each vdev independently, even though having mixed pools seems totally unintuitive. I've seen where it's been forgotten at replace time. Then when you do use it, it's sort of a pain to get gnop/ZFS to relinquish the vdev if you do an online replace and then want to try to clear off the gnop mode. I'd just leave it on there and upon reboot it'll disappear and ZFS will pick up the real vdev and properly do what you want with it. There should be no problem with years of uptime in the meantime and then coming up slightly differently on next boot, bypassing gnop and with all correct ashift.
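
To make the two facets concrete, here is a minimal Python sketch of the arithmetic only. It is not part of any FreeBSD or ZFS tool; the function names and the example partition offsets are hypothetical, chosen purely for illustration. It shows that ashift is the base-2 exponent of the block size, and that alignment is a separate question about where the partition starts on the disk.

```python
def block_size(ashift: int) -> int:
    """Bytes per block for a given ashift (block size = 2**ashift)."""
    return 1 << ashift


def ashift_for(block_bytes: int) -> int:
    """ashift for a power-of-two block size, e.g. 4096 -> 12."""
    if block_bytes <= 0 or block_bytes & (block_bytes - 1):
        raise ValueError("block size must be a positive power of two")
    return block_bytes.bit_length() - 1


def is_aligned(offset_bytes: int, ashift: int) -> bool:
    """True if a partition's starting byte offset sits on a block boundary."""
    return offset_bytes % block_size(ashift) == 0


if __name__ == "__main__":
    # Facet 1: the block size ZFS is told about (ashift).
    print(block_size(9), block_size(12))   # 512 4096
    print(ashift_for(4096))                # 12

    # Facet 2: where the partition starts on the outer device.
    # Hypothetical start sectors, counted in 512-byte units.
    print(is_aligned(40 * 512, 12))        # True  (20480 is a multiple of 4096)
    print(is_aligned(34 * 512, 12))        # False (17408 leaves a remainder of 1024)
```

A partition can pass the alignment check and still be given 512-byte blocks by ZFS, or the other way around, which is why the two are handled separately.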