From: Daniel Kalchev <daniel@digsys.bg>
Date: Tue, 24 Jan 2012 12:18:35 +0200
To: Willem Jan Withagen
Cc: freebsd-fs@freebsd.org
Subject: Re: Question about ZFS with log and cache on SSD with GPT

On Jan 22, 2012, at 6:13 PM, Willem Jan Withagen wrote:

> On 22-1-2012 9:10, Peter Maloney wrote:
>
>> In my testing, it made no difference. But as Daniel mentioned:
>>
>>> With ZFS, the 'alignment' is per-vdev -- therefore you will need to
>>> recreate the mirror vdevs again using gnop to make them 4k aligned.
>>
>> But I just resilvered to add my aligned disks and remove the old. If
>> that applies to erase boundaries, then it might have hurt my test.
>
> I'm not really fluent in ZFS lingo, but the vdev is what makes up my
> zfsdata pool? And the alignment in there carries over to the caches
> underneath?
>
> So what is the consequence if ashift = 9, and the partitions are nicely
> aligned even on the erase boundary…

A ZFS zpool can have a number of "vdevs". These are the pieces of storage that ZFS uses to store your data. ZFS spreads writes across all vdevs available at the time of writing. Each vdev may have different properties, the 'sector size' (the smallest unit for reading from or writing to the vdev) being one of them. In ZFS this is stored in the 'ashift' property. It is really a bit-shift value, so ashift=9 means 2^9 (512) bytes and ashift=12 means 2^12 (4096) bytes.

When you create a vdev, with either "zpool create" or "zpool add", ZFS checks the sector size reported by each "drive" (which may be a file, a disk drive, SAN storage -- any block device, in fact) and uses the largest one as the vdev's ashift. This is done so that large-sector participants in a vdev are not penalized.

If you add or replace a device within an existing vdev, the ashift does not change. I am not aware of any way to change ashift on the fly, short of recreating the vdev. Since in current ZFS you cannot remove a vdev, that means you will have to recreate the zpool.
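For illustration, the usual gnop recipe for forcing ashift=12 at pool creation time looks roughly like this (the pool name "data" and the ada0/ada1 devices are placeholders -- substitute your own):

   # gnop create -S 4096 /dev/ada0
   # gnop create -S 4096 /dev/ada1
   # zpool create data mirror /dev/ada0.nop /dev/ada1.nop
   # zdb -C data | grep ashift
           ashift: 12
   # zpool export data
   # gnop destroy /dev/ada0.nop
   # gnop destroy /dev/ada1.nop
   # zpool import data

The .nop providers only need to exist while the vdev is created; ashift is recorded in the vdev label, so after the export/destroy/import dance the pool runs on the raw devices and keeps ashift=12.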
Today, it is probably a good idea to create all new zpools with an ashift of at least 12 (4096 bytes), or perhaps even larger. Current drives are so huge that the wasted space will not be significant, but the performance will be better.

This is even more important for SSDs used as ZFS storage (and perhaps also for SLOG/ZIL and cache), because a larger ashift will both make the drive live longer and significantly improve write performance.

I have not experimented with gnop-ing a ZIL or cache device, then removing the gnop and re-importing the pool, but there is no reason why it should not work.

Daniel
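P.S. Untested, as said, but for a log and cache device the sequence should be the same dance. A rough sketch, with "data" as the pool and gpt/zil0 and gpt/l2arc0 as made-up GPT labels for the SSD partitions:

   # gnop create -S 4096 /dev/gpt/zil0
   # gnop create -S 4096 /dev/gpt/l2arc0
   # zpool add data log /dev/gpt/zil0.nop
   # zpool add data cache /dev/gpt/l2arc0.nop
   # zpool export data
   # gnop destroy /dev/gpt/zil0.nop
   # gnop destroy /dev/gpt/l2arc0.nop
   # zpool import data

Unlike a normal data vdev, a log or cache device can be removed again with "zpool remove", so one that was already added with ashift=9 can simply be removed and re-added through the .nop provider.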