Date:      Tue, 18 Jan 2022 17:51:49 +0100
From:      Stefan Esser <se@FreeBSD.org>
To:        Florent Rivoire <florent@rivoire.fr>
Cc:        Rich <rincebrain@gmail.com>, freebsd-fs <freebsd-fs@freebsd.org>, Alan Somers <asomers@freebsd.org>
Subject:   Re: [zfs] recordsize: unexpected increase of disk usage when increasing it
Message-ID:  <c8db3f82-72e2-1c7b-2e77-055edb5043e4@FreeBSD.org>
In-Reply-To: <CADzRhsF+s1NHudToY0J7Wn90D8gwaM16Ym43XXopoaWVQGS8CA@mail.gmail.com>
References:  <CADzRhsEsZMGE-SoeWLMG9NTtkwhhy6OGQQ046m9AxGFbp5h_kQ@mail.gmail.com> <CAOeNLuopaY3j7P030KO4LMwU3BOU5tXiu6gRsSKsDrFEuGKuaA@mail.gmail.com> <CAOtMX2h=miZt=6__oAhPVzsK9ReShy6nG+aTiudvK_jp2sQKJQ@mail.gmail.com> <CADzRhsF+s1NHudToY0J7Wn90D8gwaM16Ym43XXopoaWVQGS8CA@mail.gmail.com>

On 18.01.22 at 17:07, Florent Rivoire wrote:
> On Tue, Jan 18, 2022 at 3:23 PM Alan Somers <asomers@freebsd.org> wrote:
>> However, I would suggest that you don't bother.  With a 128kB recsize,
>> ZFS has something like a 1000:1 ratio of data:metadata.  In other
>> words, increasing your recsize can save you at most 0.1% of disk
>> space.  Basically, it doesn't matter.  What it _does_ matter for is
>> the tradeoff between write amplification and RAM usage.  1000:1 is
>> comparable to the disk:ram of many computers.  And performance is more
>> sensitive to metadata access times than data access times.  So
>> increasing your recsize can help you keep a greater fraction of your
>> metadata in ARC.  OTOH, as you remarked increasing your recsize will
>> also increase write amplification.
>
> In the attached zdb files (for 128K recordsize), we can see that the
> "L0 ZFS plain file" objects are using 99.89% in my test zpool. So the
> ratio in my case is exactly 1000:1 like you said.
> I had that rule-of-thumb in mind, but thanks for reminding me !
>
> As quickly mentioned in my first email, the context is that I'm
> considering using a mirror of SSDs as "special devices" for a new
> zpool which will still be mainly made of magnetic HDDs (raidz2 of
> 5x3TB).

Why don't you use the SSDs as L2ARC and let the system place frequently
used data and metadata in that cache?

I'm using 1/2 TB of a 1 TB SSD as L2ARC for a pool of comparable
size, and with the persistence offered by OpenZFS it is already
filled with relevant data when the system boots.

With zstd-2 compression I get a compression ratio of typically
2 to 3 for my data, which makes the SSD effectively hold more
than 1 TB, since the L2ARC stores compressed blocks. (BTW: The
L2ARC meta-data is kept in the ARC, therefore too big an L2ARC
can reduce the amount of RAM available for other data in the ARC
and for programs more than it helps ...)
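
A rough sketch of that capacity estimate, assuming every cached block
compresses at the dataset's average ratio (real blocks will vary). The
function name is made up for illustration; the 0.5 TB device size and the
ratios of 2 and 3 are the figures quoted above.

```python
def effective_l2arc_capacity_tb(device_tb: float, compression_ratio: float) -> float:
    """Logical data held by an L2ARC device that stores compressed blocks.

    Assumes a uniform compression ratio across all cached blocks.
    """
    return device_tb * compression_ratio


# 0.5 TB of L2ARC with zstd-2 ratios between 2 and 3
low = effective_l2arc_capacity_tb(0.5, 2.0)
high = effective_l2arc_capacity_tb(0.5, 3.0)
print(f"effective L2ARC capacity: {low:.1f} to {high:.1f} TB")
```

Under these assumptions the cache holds between 1.0 and 1.5 TB of
uncompressed data.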

After 1.5 years, smartctl reports 14 TB written, which is a small
fraction of the specified write endurance of my SSD (less than 3%).
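
To put those numbers in perspective, here is a small back-of-the-envelope
sketch. It only uses the figures above (14 TB in 1.5 years, less than 3%
of the rating) and assumes the write rate stays constant; the actual
rated endurance of the drive is not stated here, so the results are
lower bounds, not specifications.

```python
written_tb = 14.0          # total written so far, as reported above
years = 1.5                # time in service
max_used_fraction = 0.03   # "less than 3%" of the rated endurance

# Average write rate, assuming it was constant over the period
tb_per_year = written_tb / years

# If 14 TB is less than 3% of the rating, the rating exceeds this value
min_rated_tb = written_tb / max_used_fraction

# Time to reach that lower bound at the same write rate
min_years_to_rating = min_rated_tb / tb_per_year

print(f"write rate: {tb_per_year:.1f} TB/year")
print(f"implied rating: more than {min_rated_tb:.0f} TB written")
print(f"years to reach it at this rate: more than {min_years_to_rating:.0f}")
```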

In my workloads I see an ARC efficiency of about 98.5% and an
L2ARC efficiency of 80%, resulting in 1/5 of 1.5% = 0.3% of disk
reads actually being served by the rotating disks.
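
The same calculation as a short sketch, assuming "efficiency" means hit
ratio and that the L2ARC is only consulted for reads that miss the ARC.
The function name is illustrative only.

```python
def disk_read_fraction(arc_hit_ratio: float, l2arc_hit_ratio: float) -> float:
    """Fraction of reads that miss both the ARC and the L2ARC."""
    arc_miss = 1.0 - arc_hit_ratio      # reads not served from RAM
    l2arc_miss = 1.0 - l2arc_hit_ratio  # of those, reads not served from SSD
    return arc_miss * l2arc_miss


fraction = disk_read_fraction(0.985, 0.80)
print(f"{fraction:.1%} of reads reach the rotating disks")
```

With the hit ratios above this gives 0.3%, matching the figure in the
text.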

Regards, Stefan
