Date:      Wed, 10 Sep 2014 20:26:35 -0600
From:      John Nielsen <lists@jnielsen.net>
To:        Aristedes Maniatis <ari@ish.com.au>
Cc:        freebsd-stable <freebsd-stable@freebsd.org>
Subject:   Re: getting to 4K disk blocks in ZFS
Message-ID:  <A6EEF325-B19F-4CDA-8285-A92BCED4BBEB@jnielsen.net>
In-Reply-To: <540FF3C4.6010305@ish.com.au>
References:  <540FF3C4.6010305@ish.com.au>

> On Sep 10, 2014, at 12:46 AM, Aristedes Maniatis <ari@ish.com.au> wrote:
>
> As we all know, it is important to ensure that modern disks are set up properly with the correct block size. Everything is good if all the disks and the pool are "ashift=9" (512-byte blocks). But as soon as one new drive requires 4K blocks, the performance of the entire pool drops through the floor.
>
>
> In order to upgrade, there appear to be two separate things that must be done for a ZFS pool.
>
> 1. Create partitions on 4K boundaries. This is simple with the "-a 4k" option in gpart, and it isn't hard to remove disks one at a time from a pool, reformat them on the right boundaries and put them back. Hopefully you've left a few spare bytes on the disk to ensure that your partition doesn't get smaller when you reinsert it into the pool.
>
> 2. Create a brand new pool which has ashift=12 and zfs send|receive all the data over.
>
>
> I guess I don't understand enough about zpool to know why the pool itself has a block size, since I understood ZFS to have variable stripe widths.
>
> The problem with step 2 is that you need enough spare hard disks to create a whole new pool and throw away the old disks. Plus a disk controller with lots of spare ports. Plus the ability to take the system offline for hours or days while the migration happens.
>
> One way to reduce this slightly is to create a new pool with reduced redundancy. For example, create a RAIDZ2 with two fake disks, then offline those disks.
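[The reduced-redundancy trick described above might be sketched roughly as follows. This is an illustration, not from the original message: the device names (ada1, gpt/newdisk*), sizes, and file paths are all placeholders, and the min_auto_ashift sysctl is only available on newer FreeBSD releases (older systems used a gnop(8) 4K provider instead).]

```shell
# Step 1: align the new partition to a 4K boundary; ada1 is a placeholder device.
gpart create -s gpt ada1
gpart add -t freebsd-zfs -a 4k -l newdisk0 ada1

# Create sparse backing files to stand in for the two disks we don't have,
# and attach them as memory disks.
truncate -s 1T /tmp/fake0 /tmp/fake1
mdconfig -a -t vnode -f /tmp/fake0 -u 0
mdconfig -a -t vnode -f /tmp/fake1 -u 1

# Ask ZFS for at least 4K sectors so the new pool gets ashift=12 (newer
# FreeBSD only), then build the RAIDZ2 from real disks plus the fakes.
sysctl vfs.zfs.min_auto_ashift=12
zpool create newpool raidz2 gpt/newdisk0 gpt/newdisk1 md0 md1

# Immediately offline the fake disks. RAIDZ2 tolerates two missing members,
# so the pool runs degraded (no redundancy) but remains usable.
zpool offline newpool md0
zpool offline newpool md1
```

[The sparse files never hold real data because they are offlined before anything is written; the pool regains full redundancy later by replacing md0/md1 with the old physical disks via zpool replace.]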

Lots of good info in other responses; I just wanted to address this part of your message.

It should be a given that good backups are a requirement before you start any of this. _Especially_ if you have to destroy the old pool in order to provide redundancy for the new pool.

I have done this ashift conversion and it was a bit of a nail-biting experience, as you've anticipated. The one suggestion I have for improving on the above is to use snapshots to minimize the downtime. Get an initial clone of the pool during off-peak hours (if any), then you only need to take the system down to send a "final" differential snapshot.
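[The snapshot approach might look like this. The pool and snapshot names are placeholders, not from the original message:]

```shell
# Initial full copy while the system is still live (run off-peak).
zfs snapshot -r oldpool@migrate1
zfs send -R oldpool@migrate1 | zfs receive -Fdu newpool

# Later: stop writers (downtime starts here), take a final snapshot,
# and send only the changes made since the first one.
zfs snapshot -r oldpool@migrate2
zfs send -R -i @migrate1 oldpool@migrate2 | zfs receive -Fdu newpool
```

[The second send transfers only the delta between the two snapshots, so the window where the system must be offline shrinks from hours to roughly however long it takes to replicate one day's (or one hour's) worth of changes.]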

JN


