Date:      Tue, 24 Jan 2012 12:18:35 +0200
From:      Daniel Kalchev <daniel@digsys.bg>
To:        Willem Jan Withagen <wjw@digiware.nl>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: Question about  ZFS with log and cache on SSD with GPT
Message-ID:  <B07F552A-1DD8-4FEC-AB0B-4F24D2140C84@digsys.bg>
In-Reply-To: <4F1C3597.4040009@digiware.nl>
References:  <4F193D90.9020703@digiware.nl> <20120121162906.0000518c@unknown> <4F1B0177.8080909@digiware.nl> <20120121230616.00006267@unknown> <4F1BC493.10304@brockmann-consult.de> <4F1C3597.4040009@digiware.nl>

On Jan 22, 2012, at 6:13 PM, Willem Jan Withagen wrote:

> On 22-1-2012 9:10, Peter Maloney wrote:
>>
>
>> In my testing, it made no difference. But as Daniel mentioned:
>>
>>> With ZFS, the 'alignment' is per-vdev -- therefore you will need to
>>> recreate the mirror vdevs again using gnop to make them 4k aligned.
>> But I just resilvered to add my aligned disks and remove the old. If
>> that applies to erase boundaries, then it might have hurt my test.
>
> I'm not really fluent in ZFS lingo, but the vdev is what makes up my
> zfsdata pool? And the alignment in there carries over to the caches
> underneath?
>
> So what is the consequence if ashift = 9, and the partitions are
> nicely aligned even on the erase boundary…

A ZFS zpool is made up of a number of "vdevs". These are the pieces of
storage that ZFS uses to store your data, and ZFS spreads writes across
all vdevs available at the time of writing. Each vdev may have
different properties, the 'sector size' (the smallest unit for reading
and writing the vdev) being one of them. In ZFS this is stored in the
'ashift' property. It is a bit-shift value, so ashift=9 means 2^9 (512)
bytes and ashift=12 means 2^12 (4096) bytes.
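
You can check the ashift of an existing pool's vdevs with zdb; a quick
example, where 'tank' is just a placeholder pool name:

    # zdb -C tank | grep ashift
            ashift: 12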

When you create a vdev, with either "zpool create" or "zpool add", ZFS
checks the sector size reported by each "drive" (which may be a file, a
disk drive, SAN storage -- any block device, in fact) and uses the
largest one as the vdev's ashift. This is done so that the large-sector
participants in a vdev are not penalized.
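
The gnop trick mentioned earlier in the thread exploits exactly this at
creation time. A minimal sketch, assuming a mirror of two disks ada0
and ada1 (device and pool names are just placeholders):

    # gnop create -S 4096 /dev/ada0     # expose ada0 as a 4k-sector device
    # zpool create tank mirror ada0.nop ada1
    # zpool export tank
    # gnop destroy /dev/ada0.nop        # the .nop is only needed at creation
    # zpool import tank                 # ZFS finds its label on ada0, ashift stays 12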

If you add or replace a device within an existing vdev, the ashift does
not change. I am not aware of any way to change ashift on the fly,
short of recreating the vdev, and since in current ZFS you cannot
remove a vdev, that means recreating the zpool.
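
If you do end up recreating the pool, the usual way to carry the data
over is zfs send/receive into the new pool; a rough sketch, where
'tank', 'newtank' and the snapshot name are just placeholders:

    # zfs snapshot -r tank@migrate
    # zfs send -R tank@migrate | zfs receive -F newtank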

Today, it is probably a good idea to create all new zpools with an
ashift of at least 12 (4096 bytes), or perhaps even larger. Current
drives are so huge that the wasted space will not be significant, and
performance will be better.

This is even more important for SSDs used as ZFS storage (and perhaps
also for SLOG/ZIL and cache devices), because it will both make the
drive last longer and significantly improve write performance.

I have not experimented with gnop-ing the ZIL or cache devices, then
removing the gnop and re-importing the pool, but there is no reason why
it should not work.
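
A minimal sketch of that procedure, assuming a log partition labelled
gpt/log0 (the label is a placeholder; a cache device would be handled
the same way with "zpool add tank cache ..."):

    # gnop create -S 4096 /dev/gpt/log0
    # zpool add tank log gpt/log0.nop
    # zpool export tank
    # gnop destroy /dev/gpt/log0.nop
    # zpool import tank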

Daniel



