Date: Tue, 18 Sep 2012 16:40:11 +0300
From: Daniel Kalchev <daniel@digsys.bg>
To: Volodymyr Kostyrko <c.kworr@gmail.com>
Cc: freebsd-fs@freebsd.org
Subject: Re: AW: AW: AW: AW: AW: ZFS: Corrupted pool metadata after adding vdev to a pool - no opportunity to rescue data from healthy vdevs? Remove a vdev? Rewrite metadata?
Message-ID: <505879BB.3000806@digsys.bg>
In-Reply-To: <505874E6.2050109@gmail.com>
References: <001a01cd900d$bcfcc870$36f65950$@goelli.de> <504F282D.8030808@gmail.com> <000a01cd90aa$0a277310$1e765930$@goelli.de> <5050461A.9050608@gmail.com> <000001cd9239$ed734c80$c859e580$@goelli.de> <5052EC5D.4060403@gmail.com> <000a01cd9274$0aa0bba0$1fe232e0$@goelli.de> <505322C9.70200@gmail.com> <000001cd9377$e9e9b010$bdbd1030$@goelli.de> <50559CD8.1070700@gmail.com> <000001cd94f1$a4157030$ec405090$@goelli.de> <50581033.4040102@gmail.com> <50584CC1.3030300@digsys.bg> <505874E6.2050109@gmail.com>
On 18.09.12 16:19, Volodymyr Kostyrko wrote:
> 18.09.2012 13:28, Daniel Kalchev wrote:
>>
>> The problem is that ZFS writes these records (even 128K) aligned to the
>> sector size. So, once you write some data that is under 4k, your pool
>> will become misaligned.
>
> Not exactly. https://blogs.oracle.com/bonwick/entry/space_maps

There is no statement in that post that contradicts what I already said. I may not have been precise enough -- the misalignment can happen within a metaslab, not across the whole zpool.

ZFS clearly does not write larger blocks than necessary, the smallest being the sector size. The sector size is represented by the ashift value: sector size = 2^ashift. The ashift value is set per vdev and is derived from the largest sector size among that vdev's members. So if you create a mirror vdev from two drives that report 512-byte sectors to the OS, the resulting vdev will have ashift=9. If you create a mirror vdev from one drive that reports 512-byte sectors and another that reports 4096-byte sectors, you will get ashift=12. The vdevs in a zpool do not all need to have the same ashift value (and thus the same sector size).

>
>>> 2. For older drives each drive should be partitioned with respect to
>>> 4k sectors. This is what -a option of gpart does: it aligns created
>>> partitions to 4k sector bounds. But half a year ago I already found
>>> some drives that can auto-shift all disk transactions to optimize read
>>> and write performance. Courtesy of Microsoft Windows, OS that does not
>>> care about anything not written in license terms, same as the users
>>> do, so using this drives would be more straightforward and would not
>>> cause decent pain to IT stuff about realigning partitions the way it
>>> would just work.
>>>
>>
>> This is only hype. There is no way any disk firmware can shift any
>> transactions.
>
> How about Seagate Smart Align? It's documented to do so. I haven't
> touched any Seagate drives as I don't like them anyway...
>

I have a lot of Seagate drives with 4k sectors in use with ZFS. Despite these claims, performance is far worse when writes are not aligned to 4k. It is also awful with UFS if you do not take care to align the partitions. This is just marketing. Their rewrite implementation might be better than others', but it is still better avoided.

Daniel
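
P.S. In case it helps anyone following the thread: you can verify what ashift a pool actually got, and you can force ashift=12 at creation time with the usual gnop trick on FreeBSD. A rough sketch only -- the pool name "tank" and the devices ada0/ada1 are placeholders, adjust for your own setup:

    # show the ashift of each top-level vdev in the cached pool config
    zdb -C tank | grep ashift

    # put a temporary 4096-byte-sector provider on top of one member,
    # build the pool on it so the vdev gets ashift=12, then re-import
    # the pool without the gnop layer
    gnop create -S 4096 /dev/ada0
    zpool create tank mirror /dev/ada0.nop /dev/ada1
    zpool export tank
    gnop destroy /dev/ada0.nop
    zpool import tank

Because ashift is fixed per vdev at creation time, doing this once when the vdev is added is the only chance to get it right; there is no way to change it afterwards short of recreating the vdev.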