FreeBSD Mail Archives

Date:      Tue, 18 Sep 2012 16:19:34 +0300
From:      Volodymyr Kostyrko <c.kworr@gmail.com>
To:        Daniel Kalchev <daniel@digsys.bg>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: AW: AW: AW: AW: AW: ZFS: Corrupted pool metadata after adding vdev to a pool - no opportunity to rescue data from healthy vdevs? Remove a vdev? Rewrite metadata?
Message-ID:  <505874E6.2050109@gmail.com>
In-Reply-To: <50584CC1.3030300@digsys.bg>
References:  <001a01cd900d$bcfcc870$36f65950$@goelli.de> <504F282D.8030808@gmail.com> <000a01cd90aa$0a277310$1e765930$@goelli.de> <5050461A.9050608@gmail.com> <000001cd9239$ed734c80$c859e580$@goelli.de> <5052EC5D.4060403@gmail.com> <000a01cd9274$0aa0bba0$1fe232e0$@goelli.de> <505322C9.70200@gmail.com> <000001cd9377$e9e9b010$bdbd1030$@goelli.de> <50559CD8.1070700@gmail.com> <000001cd94f1$a4157030$ec405090$@goelli.de> <50581033.4040102@gmail.com> <50584CC1.3030300@digsys.bg>


18.09.2012 13:28, Daniel Kalchev wrote:
>> From my point of view all hype about moving to 4k sectors is highly
>> irrelevant to ZFS and current products on the market.
>>
>> 1. ZFS tends to use big recordsize for storing any data. This means
>> most files on your drives are already stored in 128k sectors. Storing
>> small tails in 512b or 4k sectors shouldn't give big difference.
>
> Truth is, ZFS will write blocks of size from your media sector size up
> to 128K.
>
> The problem is that ZFS writes these records (even 128K) aligned to the
> sector size. So, once you write some data that is under 4k, your pool
> will become misaligned.

Not exactly. https://blogs.oracle.com/bonwick/entry/space_maps

1. ZFS divides the space on each virtual device into a few hundred 
metaslabs.
2. Metaslabs are quite big, so it would be logical to align them to a
high ashift value (I have found no documentation on whether this is
true, but at least they should be divisible by 128k, as this is the
default recordsize).
3. In each metaslab all space allocation is done through space maps. I
have no documentation on this one either, but given the presence of gang
blocks in the ZFS specification, every new allocation should be aligned
to 128k if we are allocating a 128k block, to 64k if we are allocating
a 64k block, and so on (yet again, I have found no documentation on
whether this is true, but as far as I understand the Solaris way, it's
more practical to keep data aligned than to deal with misalignment
later).

I'm bad at reading code, so I can't really say how allocations are
aligned within ZFS metaslabs, but the function dealing with metaslab
allocation takes an 'align' variable.
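
To make the two positions above concrete, here is a toy sketch in
Python. It is not ZFS code: the simple bump allocator and the names
round_up and allocate are invented for illustration, and the
natural_align switch models only the unverified guess from point 3
(each block aligned to its own size). It shows where blocks would start
when allocations are rounded to 512b (ashift=9) or to 4k (ashift=12).

```python
def round_up(value, align):
    """Round value up to the next multiple of align (a power of two)."""
    return (value + align - 1) & ~(align - 1)


def allocate(sizes, ashift, natural_align=False):
    """Toy bump allocator (hypothetical, NOT the ZFS metaslab allocator).

    sizes         -- requested block sizes in bytes
    ashift        -- log2 of the smallest allocation unit (9 = 512b, 12 = 4k)
    natural_align -- if True, also align every block to the power of two
                     that covers its rounded size (the guess from point 3)
    Returns the start offset chosen for each request.
    """
    unit = 1 << ashift
    cursor = 0
    offsets = []
    for size in sizes:
        asize = round_up(size, unit)        # allocated size, multiple of unit
        align = unit
        if natural_align:
            align = 1 << (asize - 1).bit_length()
        start = round_up(cursor, align)     # skip forward to an aligned start
        offsets.append(start)
        cursor = start + asize
    return offsets


def misaligned(offsets, physical_sector=4096):
    """Offsets that do not start on a physical sector boundary."""
    return [off for off in offsets if off % physical_sector]


requests = [512, 131072, 1024, 131072]
print(misaligned(allocate(requests, ashift=9)))
print(misaligned(allocate(requests, ashift=12)))
print(misaligned(allocate(requests, ashift=9, natural_align=True)))
```

In this toy model a single 512b write ahead of a 128k record pushes the
record off the 4k boundary when ashift=9, which is the scenario
described in the quoted text. With ashift=12, or if blocks really are
aligned to their own size, every block of 4k or more starts on a 4k
boundary. Which of these behaviours the real allocator has is exactly
the open question here.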

>> 2. For older drives each drive should be partitioned with respect to
>> 4k sectors. This is what -a option of gpart does: it aligns created
>> partitions to 4k sector bounds. But half a year ago I already found
>> some drives that can auto-shift all disk transactions to optimize read
>> and write performance. Courtesy of Microsoft Windows, OS that does not
>> care about anything not written in license terms, same as the users
>> do, so using these drives would be more straightforward and would not
>> cause IT staff much pain over realigning partitions: it would just
>> work.
>>
>
> This is only hype. There is no way any disk firmware can shift any
> transactions.

How about Seagate Smart Align? It's documented to do so. I haven't 
touched any Seagate drives as I don't like them anyway...
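
For the partition-alignment point quoted above (the -a option of
gpart), the arithmetic is simple enough to sketch. The Python below is
only an illustration of rounding a partition start up to a 4k boundary
on a drive that reports 512b logical sectors; the function name
align_start_lba is made up, and this is not how gpart itself is
implemented.

```python
def align_start_lba(start_lba, logical_sector=512, alignment=4096):
    """Return the first LBA >= start_lba whose byte offset is a
    multiple of alignment (hypothetical helper, for illustration only)."""
    if alignment % logical_sector != 0:
        raise ValueError("alignment must be a multiple of the sector size")
    step = alignment // logical_sector      # 8 sectors per 4k boundary
    return -(-start_lba // step) * step     # ceiling division, then scale


for lba in (34, 63, 2048):
    print(lba, "->", align_start_lba(lba))
```

A partition that would start at LBA 63 is moved to LBA 64, and one that
would start at LBA 34 is moved to LBA 40; LBA 2048 is already aligned
and stays where it is.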

-- 
Sphinx of black quartz judge my vow.


Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?505874E6.2050109>
