Date:      Tue, 15 Sep 2020 22:11:17 -0400
From:      Allan Jude <allanjude@freebsd.org>
To:        status-updates@freebsdfoundation.org, freebsd-fs <freebsd-fs@freebsd.org>, openzfs-developer <developer@open-zfs.org>
Subject:   Re: ZSTD Project Weekly Status Update
Message-ID:  <761f6571-87ae-679c-a3e3-316dbb16200b@freebsd.org>
In-Reply-To: <9f4ff5f0-9b6c-7299-98ee-988964a11ade@freebsd.org>
References:  <7b8842ad-d520-c575-22ee-2cd77244f2c6@freebsd.org> <708ec9f2-3c5c-6452-f6e6-bfb11a7f7eb2@freebsd.org> <bebcc0bb-7590-a04b-09ae-fa04e22d27dc@freebsd.org> <528ca743-7889-d1fd-ca95-a17cd430725b@freebsd.org> <9d77cb73-c8e8-cca0-b4b8-28e6790268d6@freebsd.org> <327f4b10-9727-331e-2dc9-641dad96dd2a@freebsd.org> <db71835b-9bb7-2722-fd02-194b97f1564e@freebsd.org> <e9597d9b-88e0-334f-d266-6cbbaf746855@freebsd.org> <738e1ca9-05b6-bc1f-468c-b5eee03643ab@freebsd.org> <ce721076-962a-ddf4-6886-0eafbbb418b1@freebsd.org> <9f4ff5f0-9b6c-7299-98ee-988964a11ade@freebsd.org>

This is another weekly status report on my FreeBSD Foundation sponsored
project to complete the integration of ZSTD compression into OpenZFS.

The first batch of benchmarks is complete, although getting good data
took longer than expected.

I am still not entirely pleased with the data, as in some cases I am
running up against limitations of my device-under-test rather than the
performance limits of ZFS.

Here is what I have so far:

https://docs.google.com/spreadsheets/d/1TvCAIDzFsjuLuea7124q-1UtMd0C9amTgnXm2yPtiUQ/edit?usp=sharing

A number of these tests were initially done on both FreeBSD and Linux on
the same machine, and the results were consistent within a 2% margin of
error, so I've taken to doing most of the tests only on FreeBSD, because
it is easier. I've struggled to get a good ramdisk solution on Ubuntu etc.

To walk you through the different tabs in the spreadsheet so far:

#1: fio SSD
This is a random write test to my pool made of 4 SSDs. This ran into the
performance limitations of the SSDs when testing the very fast
algorithms. Since the data generated by fio is completely
incompressible, there is no gain from the higher compression levels.
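For reference, the shape of that test looks something like the following;
the pool and dataset names, job size, and fio parameters here are my
illustrative guesses, not the exact values used for the spreadsheet:

```shell
# Hypothetical approximation of the random-write test; names, sizes,
# and job counts are assumptions, not the actual benchmark settings.
zfs create -o compression=zstd-3 -o recordsize=128k tank/fiotest
fio --name=randwrite --directory=/tank/fiotest \
    --rw=randwrite --bs=128k --size=16g --numjobs=4 \
    --ioengine=psync --group_reporting
```

Repeating the run with the dataset's compression property set to each
algorithm in turn gives the comparison.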

#2: fio to ramdisk
To overcome the limitations of the first test, I did it again with a
ramdisk. Obviously this had to be a smaller dataset, since there is
limited memory available, but it does a much better job of showing how
the zstd-fast levels scale, and how they outperform LZ4, although you
cannot compare compression ratios, because the data is incompressible.
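On FreeBSD the ramdisk can be a memory-backed md(4) device; a minimal
sketch, with the size and pool name assumed for illustration:

```shell
# Create an 8 GB swap-backed memory disk and build a pool on it.
mdconfig -a -t swap -s 8g -u 0    # attaches /dev/md0
zpool create ramtank md0
zfs create -o compression=zstd-fast-10 ramtank/fiotest
```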

#3: zfs recv to SSD
For this test, I created a dataset by extracting the FreeBSD src.txz
file 8 times (each to a different directory), then created a snapshot of
that, and sent it to a file on a tmpfs.

I then timed zfs recv < /tmpfs/snapshot.zfs with each compression
algorithm. This lets you compare the compression gained against the time
spent. It again ran into the throughput limitations of the SSDs, so it
says less about the performance of the higher zstd-fast levels, but the
compression trade-off is clear.
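A sketch of that procedure, with hypothetical pool and dataset names:

```shell
# Build the source dataset once: 8 extractions of src.txz.
zfs create tank/src
for i in 1 2 3 4 5 6 7 8; do
    mkdir /tank/src/dir$i
    tar -xf src.txz -C /tank/src/dir$i
done
zfs snapshot tank/src@bench
zfs send tank/src@bench > /tmpfs/snapshot.zfs

# Per algorithm: recreate the target dataset and time the receive.
zfs create -o compression=zstd-9 tank/recv
time zfs recv tank/recv/src < /tmpfs/snapshot.zfs
```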

I need to reconfigure my setup to re-do this benchmark using a ramdisk.

#4: large image file 128k
For this, I created an approximately 20 GB tar file by decompressing the
FreeBSD 12.1 src.txz and concatenating the result 16 times. This
provides the best possible case for compression.
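One way to build such a file (the paths here are placeholders):

```shell
# Decompress src.txz once, then concatenate the tar 16 times to get
# roughly 20 GB of highly compressible, repetitive data.
unxz -k -c src.txz > src.tar
for i in $(seq 16); do cat src.tar; done > /tank/test/image.tar
```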

One of the major advantages of ZSTD is that the decompression throughput
stays relatively the same even as the compression level is increased. So
while writing a zstd-19 compressed block takes a lot longer than a
zstd-3 compressed block, both decompress at nearly the same speed.

This time I measured fio random read performance. Putting the
limitations of the SSDs to good use, this test shows the read
performance gains from reading compressed. Even though the disks top out
around 1.5 GB/sec, zstd-compressed data can be read at an effective rate
of over 5 GB/sec.

#5: large image file 1m
This is the same test, but done with zfs recordsize=1m.

The larger record size unlocks higher compression ratios, and achieves
throughputs in excess of 6 GB/sec.
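recordsize is a per-dataset property that only applies to newly written
blocks, so each run needs the dataset (re)created before the image is
written; for example (dataset name assumed):

```shell
# Existing blocks keep the record size they were written with, so set
# recordsize before copying the image in.
zfs create -o recordsize=1m -o compression=zstd-3 tank/rs1m
```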

#6: large image file 16k
This is again the same test, but with zfs recordsize=16k.
This is an approximation of reading from a large database with a 16k
page size.
The lower record size provides much less compression, and the smaller
blocks result in more overhead, but there are still large performance
gains from compression, although they are much less drastic.

I would be interested to hear what other tests people would like to see
before I finish wearing these SSDs out.


Thanks again to the FreeBSD Foundation for sponsoring this work.


On 2020-08-31 23:21, Allan Jude wrote:
> This is the eleventh weekly status report on my FreeBSD Foundation
> sponsored project to complete the integration of ZSTD compression into
> OpenZFS.
>
> As I continue to work on the future-proofing issue, I have also been
> lending a hand to the integration of OpenZFS into FreeBSD, and doing a
> bunch of reviewing and testing there.
>
> I have also been doing some benchmarking of the ZSTD feature.
>
> So far I have tried 4 different approaches with varying results.
>
> The test rig:
> A single socket Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (10 cores,
> 20 threads)
> 32 GB RAM
> ZFS ARC max = 4 GB
> 4x Samsung 860 EVO SSDs
>
>
> 1) Using fio. This gives slightly more useful output, both bandwidth
> and IOPS, but also has more detail about changes over time as well as
> latency etc.
>
> The downside here is that it doesn't really benchmark compression. By
> default fio uses fully random data that does not compress at all. This
> is a somewhat useful metric, and the differing results seen when
> varying blocksize are interesting.
>
> fio has an option, --buffer_compress_percentage=, to select how
> compressible you want the generated data to be. However, this just
> switches between random data and a repeating pattern (by default null
> bytes). So different levels of zstd compression all give the same
> compression ratio (the level you ask fio to generate). This doesn't
> really reproduce the real-world use case of having a trade-off where
> spending more time on compression results in greater space savings.
>
> 2) I also used 'zfs recv' to create more repeatable writes. I
> generated a large dataset, 8 copies of the FreeBSD 12.1 source code,
> that rounds out to around 48 GB of uncompressed data, snapshotted it,
> and created a zfs send stream, stored on a tmpfs. Then I measured the
> time taken to zfs recv that stream, at different compression levels. I
> later also redid these experiments at different record sizes.
>
> The reason I chose to use 8 copies of the data was to make the runs
> long enough at the lower compression levels to get more consistent
> readings.
>
> The issue with this was also a limitation of my test setup, 4x striped
> SSDs, which tends to top out around 1.3 GB/sec of writes. So the
> difference between compression=off, lz4, and zstd-1 was minimal.
>
> 3) I then redid the zfs recv based testing, but with only 1 copy of
> the source code (1.3 GB) and with the pool backed by a file on a
> tmpfs, removing the SSDs from the equation. The raw write speed to the
> tmpfs was around 3.2 GB/sec.
>
> 4) I also redid the fio based testing with a pool backed by a file on
> tmpfs.
>
>
> I am not really satisfied with the quality of the results so far.
>
> Does Linux have something equivalent to FreeBSD's mdconfig, where I
> can create an arbitrary number of arbitrarily sized memory-backed
> devices that I could use to back the pool? A file-based vdev on a
> tmpfs just doesn't seem to provide the same type of results as I was
> expecting.
>
> Any other suggestions would be welcome.
>
>
>
> In the end the results will all be relative, which is mostly what we
> are looking to capture: how much faster/slower is zstd at different
> levels compared to lz4 and gzip, and how much more compression do you
> get in exchange for that trade-off.
>
> Hopefully next week there will be some pretty graphs.
>
> Thanks again to the FreeBSD Foundation for sponsoring this work.
>
>

-- 
Allan Jude




