Date: Fri, 11 Sep 2015 09:29:53 -0453.75
From: "William A. Mahaffey III" <wam@hiwaay.net>
Cc: FreeBSD Questions !!!! <freebsd-questions@freebsd.org>
Subject: Re: followup storage question
Message-ID: <55F2E417.7040704@hiwaay.net>
In-Reply-To: <3B589E85-4C75-4021-9B37-E022BC33AFA4@ccsys.com>
References: <55F2D086.6060509@hiwaay.net> <3B589E85-4C75-4021-9B37-E022BC33AFA4@ccsys.com>

On 09/11/15 09:23, Chad J. Milios wrote:
>> On Sep 11, 2015, at 8:59 AM, William A. Mahaffey III <wam@hiwaay.net> wrote:
>>
>> The Wiki page https://wiki.freebsd.org/RootOnZFS/GPTZFSBoot/9.0-RELEASE
>> illustrates using gnop to enforce 4K alignment of gpt partitions for
>> subsequent use by ZFS. However the gpart commands also use the '-a 4k'
>> arguments, aligning partitions on 4k boundaries as I understand things.
>> Is the gnop command also necessary ? TIA & have a nice weekend.
>>
>> --
>>
>> William A. Mahaffey III
>
> Yes, handling both facets of the same underlying issue separately is
> necessary. Those facets are the partition's alignment upon the outer
> device and the partition's block size that the device node reports to
> ZFS.
>
> The latter can effectively be done a different way: in later versions
> of FreeBSD there is a sysctl, vfs.zfs.min_auto_ashift, which you can set
> to 12 for 4096 byte blocks or 9 for the default 512 bytes. (The ashift
> value is the exponent over the number 2 to get the number of bytes in a
> block.)
>
> The old gnop way still works just fine so I still use that method,
> personally. This definitely only has to be done when vdev(s) are
> added/created/replaced* on the pool, not on every mount/import. By then
> ZFS clearly listens to the formatting metadata it stamped on the vdev
> instead of what the ioctls of the device node say and so will always
> write larger and correctly aligned blocks. (I'm not sure about the
> reverse direction, which is not a typical use, or whether it holds true
> without gnop every time, and I know the min_auto_ashift won't help
> there. That is the case where for some reason you intend gnop for
> simulating smaller blocks to ZFS from larger device node blocks, say you
> wanted to allow a certain amount of write amplification for more
> efficiently storing lots of small files/directories/metadata. In that
> case you may need to enable the gnop every time.
> I'm not sure because I don't run any pools that may but I know you can
> if you want for that reason, space overhead. It'd take some testing and
> actual measurement for me to confidently decide gnop can be subsequently
> skipped after the vdev initialization if going in that opposite
> direction was your goal. Maybe someone chimes in here to let us know for
> sure. At any rate, gnop is by its nature just about the fastest and
> lightest geom class under the sun and I believe you can keep running
> thousands of instances busily in production and see no noticeable
> overhead.)
>
> *Yes, mind the gnop or sysctl for ashift whenever replacing as well.
> It's a vdev property not copied as part of the data resilvering; it's
> decided by ZFS for each vdev independently even though having mixed
> pools seems totally unintuitive. I've seen where it's been forgotten at
> replace time. Then when you do use it, it's sort of a pain to get
> gnop/ZFS to relinquish the vdev if you do an online replace and then
> want to try to clear off the gnop mode. I'd just leave it on there and
> upon reboot it'll disappear and ZFS will pick up the real vdev and
> properly do what you want with it. There should be no problem with years
> of uptime in the meantime and then coming up slightly differently on
> next boot, bypassing gnop and with the correct ashift.

Excellent, clear as a bell :-). Thanks.

--
William A. Mahaffey III
----------------------------------------------------------------------
"The M1 Garand is without doubt the finest implement of war ever
devised by man."
    -- Gen. George S. Patton Jr.
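
The two facets discussed above (the block size implied by ashift, and whether a partition starts on a 4k boundary) come down to simple arithmetic. The following is only a rough sketch of that arithmetic in Python. The function names are hypothetical, the 512-byte sector size is an assumption, and nothing here talks to gpart, gnop or ZFS.

```python
# Illustrative helpers only; names and defaults are assumptions, not part
# of any FreeBSD or ZFS tool.

def block_size(ashift: int) -> int:
    """Bytes per block for a given ashift (block size = 2 ** ashift)."""
    return 1 << ashift


def ashift_for(block_bytes: int) -> int:
    """Inverse of block_size(); rejects sizes that are not powers of two."""
    if block_bytes <= 0 or block_bytes & (block_bytes - 1):
        raise ValueError("block size must be a positive power of two")
    return block_bytes.bit_length() - 1


def is_aligned(start_sector: int, sector_bytes: int = 512,
               boundary: int = 4096) -> bool:
    """True if a partition starting at start_sector begins on a boundary."""
    return (start_sector * sector_bytes) % boundary == 0


print(block_size(12))    # 4096
print(block_size(9))     # 512
print(ashift_for(4096))  # 12
print(is_aligned(40))    # True:  40 * 512 = 20480 = 5 * 4096
print(is_aligned(34))    # False: 34 * 512 = 17408 is not a multiple of 4096
```

A partition can pass the alignment check and still end up in a vdev with ashift 9 if the device node reports 512-byte blocks, which is why the reply treats the two as separate steps.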
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?55F2E417.7040704>