Date: Tue, 15 Sep 2020 22:11:17 -0400
From: Allan Jude <allanjude@freebsd.org>
To: status-updates@freebsdfoundation.org,
    freebsd-fs <freebsd-fs@freebsd.org>,
    openzfs-developer <developer@open-zfs.org>
Subject: Re: ZSTD Project Weekly Status Update
Message-ID: <761f6571-87ae-679c-a3e3-316dbb16200b@freebsd.org>
In-Reply-To: <9f4ff5f0-9b6c-7299-98ee-988964a11ade@freebsd.org>
References: <7b8842ad-d520-c575-22ee-2cd77244f2c6@freebsd.org>
 <708ec9f2-3c5c-6452-f6e6-bfb11a7f7eb2@freebsd.org>
 <bebcc0bb-7590-a04b-09ae-fa04e22d27dc@freebsd.org>
 <528ca743-7889-d1fd-ca95-a17cd430725b@freebsd.org>
 <9d77cb73-c8e8-cca0-b4b8-28e6790268d6@freebsd.org>
 <327f4b10-9727-331e-2dc9-641dad96dd2a@freebsd.org>
 <db71835b-9bb7-2722-fd02-194b97f1564e@freebsd.org>
 <e9597d9b-88e0-334f-d266-6cbbaf746855@freebsd.org>
 <738e1ca9-05b6-bc1f-468c-b5eee03643ab@freebsd.org>
 <ce721076-962a-ddf4-6886-0eafbbb418b1@freebsd.org>
 <9f4ff5f0-9b6c-7299-98ee-988964a11ade@freebsd.org>
This is another weekly status report on my FreeBSD Foundation sponsored
project to complete the integration of ZSTD compression into OpenZFS.

The first batch of benchmarks is complete, although it took longer than
expected to get good data. I am still not entirely pleased with the
results, as in some cases I am running up against the limitations of my
device under test rather than the performance limits of ZFS.

Here is what I have so far:
https://docs.google.com/spreadsheets/d/1TvCAIDzFsjuLuea7124q-1UtMd0C9amTgnXm2yPtiUQ/edit?usp=sharing

A number of these tests were initially done on both FreeBSD and Linux on
the same machine, and the results were consistent within a 2% margin of
error, so I've taken to doing most of the tests only on FreeBSD, because
it is easier. I've struggled to get a good ramdisk solution on Ubuntu etc.

To walk you through the different tabs in the spreadsheet so far:

#1: fio SSD
This is a random write test to my pool made of 4 SSDs. It ran into the
performance limitations of the SSDs when testing the very fast
algorithms. Since the data generated by fio is completely
incompressible, there is no gain from the higher compression levels.

#2: fio to ramdisk
To overcome the limitations of the first test, I repeated it with a
ramdisk. This obviously had to use a smaller dataset, since there is
limited memory available, but it does a much better job of showing how
the zstd-fast levels scale and how they outperform LZ4, although you
cannot compare the compression, because the data is incompressible.

#3: zfs recv to SSD
For this test, I created a dataset by extracting the FreeBSD src.txz
file 8 times (each to a different directory), created a snapshot of
that, and sent it to a file on a tmpfs. I then timed
'zfs recv < /tmpfs/snapshot.zfs' with each compression algorithm. This
lets you compare the compression gained for the time spent, but it again
ran into the throughput limitations of the SSDs, so it says a bit less
about the performance of the higher zstd-fast levels; you can still see
the compression trade-off. I need to reconfigure my setup to redo this
benchmark using a ramdisk.
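In case anyone wants to reproduce something along these lines, the
receive benchmark boils down to roughly the following sketch. The pool
and dataset names, paths, and the list of compression settings below are
placeholders rather than my exact script:

  #!/bin/sh
  # Build the source dataset: 8 copies of the extracted FreeBSD source tree.
  zfs create tank/src
  for i in 1 2 3 4 5 6 7 8; do
      mkdir -p /tank/src/copy$i
      tar -xf /path/to/src.txz -C /tank/src/copy$i
  done

  # Snapshot it and stage the send stream on a tmpfs, so reading the
  # stream is not the bottleneck.
  zfs snapshot tank/src@bench
  zfs send tank/src@bench > /tmpfs/snapshot.zfs

  # Time the receive once per compression setting; the received dataset
  # inherits its compression property from the freshly created parent.
  for comp in off lz4 gzip zstd-1 zstd-3 zstd-9 zstd-19 zstd-fast-1; do
      zfs destroy -r tank/recv 2>/dev/null
      zfs create -o compression=$comp tank/recv
      time zfs recv tank/recv/src < /tmpfs/snapshot.zfs
  done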
#4: large image file 128k
For this, I created an approximately 20 GB tar file by decompressing the
FreeBSD 12.1 src.txz and concatenating the result 16 times. This
provides the best possible case for compression. One of the major
advantages of ZSTD is that decompression throughput stays roughly the
same even as the compression level is increased, so while writing a
zstd-19 compressed block takes a lot longer than a zstd-3 compressed
block, both decompress at nearly the same speed. This time I measured
fio random read performance. Putting the limitations of the SSDs to good
use, this test shows the read performance gained by reading compressed
data: even though the disks top out around 1.5 GB/sec, zstd-compressed
data can be read at an effective rate of over 5 GB/sec.

#5: large image file 1m
This is the same test, but done with zfs recordsize=1m. The larger
record size unlocks higher compression ratios and achieves throughput in
excess of 6 GB/sec.

#6: large image file 16k
This is again the same test, but with zfs recordsize=16k, as an
approximation of reading from a large database with a 16k page size. The
smaller record size provides much less compression, and the smaller
blocks result in more overhead, but there are still large performance
gains to be had from the compression, although they are much less
dramatic.
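For the curious, the read tests (#4 through #6) amount to roughly the
following. Again, the pool/dataset names, file paths, and fio job
parameters here are illustrative placeholders, not the exact job file I
used:

  #!/bin/sh
  # Dataset under test; repeat with recordsize=1m and recordsize=16k,
  # and with each compression setting being compared.
  zfs create -o compression=zstd-3 -o recordsize=128k tank/img

  # Build the ~20 GB input: the decompressed src.txz concatenated 16 times.
  xz -dc /path/to/src.txz > /tmp/src.tar
  for i in $(seq 1 16); do cat /tmp/src.tar; done > /tank/img/bigfile.tar

  # Random reads over the compressed file; with the ARC capped at 4 GB,
  # most reads have to come from disk and be decompressed, which is how
  # the effective read rate can exceed the raw speed of the SSDs.
  # (Adjust --bs to match the 1m / 16k recordsize runs as desired.)
  fio --name=randread --filename=/tank/img/bigfile.tar \
      --rw=randread --bs=128k --ioengine=psync --numjobs=4 \
      --time_based --runtime=60 --group_reporting

Note that the file has to be written after the compression and
recordsize properties are set, since ZFS only applies them to newly
written blocks.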
I would be interested to hear what other tests people would like to see
before I finish wearing these SSDs out.

Thanks again to the FreeBSD Foundation for sponsoring this work.

On 2020-08-31 23:21, Allan Jude wrote:
> This is the eleventh weekly status report on my FreeBSD Foundation
> sponsored project to complete the integration of ZSTD compression into
> OpenZFS.
>
> As I continue to work on the future-proofing issue, I have also been
> lending a hand to the integration of OpenZFS into FreeBSD, and doing a
> bunch of reviewing and testing there.
>
> I have also been doing some benchmarking of the ZSTD feature.
>
> So far I have tried 4 different approaches, with varying results.
>
> The test rig:
> A single socket Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (10 cores,
> 20 threads)
> 32 GB RAM
> ZFS ARC max = 4 GB
> 4x Samsung 860 EVO SSDs
>
> 1) Using fio. This gives slightly more useful output, both bandwidth
> and IOPS, but also has more detail about changes over time as well as
> latency etc.
>
> The downside here is that it doesn't really benchmark compression. By
> default fio uses fully random data that does not compress at all. This
> is a somewhat useful metric, and the differing results seen when
> varying blocksize are interesting.
>
> fio has an option, --buffer_compress_percentage=, to select how
> compressible you want the generated data to be. However, this just
> switches between random data and a repeating pattern (by default null
> bytes). So different levels of zstd compression all give the same
> compression ratio (the level you ask fio to generate). This doesn't
> really capture the real-world use case of having a trade-off where
> spending more time on compression results in greater space savings.
>
> 2) I also used 'zfs recv' to create more repeatable writes. I generated
> a large dataset, 8 copies of the FreeBSD 12.1 source code, which rounds
> out to around 48 GB of uncompressed data, snapshotted it, and created a
> zfs send stream, stored on a tmpfs. Then I measured the time taken to
> zfs recv that stream at different compression levels. I later also
> redid these experiments at different record sizes.
>
> The reason I chose to use 8 copies of the data was to make the runs
> long enough at the lower compression levels to get more consistent
> readings.
>
> The issue with this was also a limitation of my test setup, 4x striped
> SSDs, which tend to top out around 1.3 GB/sec of writes. So the
> difference between compression=off, lz4, and zstd-1 was minimal.
>
> 3) I then redid the zfs recv based testing, but with only 1 copy of the
> source code (1.3 GB) and with the pool backed by a file on a tmpfs,
> removing the SSDs from the equation. The raw write speed to the tmpfs
> was around 3.2 GB/sec.
>
> 4) I also redid the fio based testing with a pool backed by a file on
> tmpfs.
>
> I am not really satisfied with the quality of the results so far.
>
> Does Linux have something equivalent to FreeBSD's mdconfig, where I can
> create an arbitrary number of arbitrarily sized memory-backed devices
> that I could use to back the pool? A file-based vdev on a tmpfs just
> doesn't seem to provide the same kind of results I was expecting.
>
> Any other suggestions would be welcome.
>
> In the end the results will all be relative, which is mostly what we
> are looking to capture: how much faster/slower is zstd at different
> levels compared to lz4 and gzip, and how much more compression do you
> get in exchange for that trade-off.
>
> Hopefully next week there will be some pretty graphs.
>
> Thanks again to the FreeBSD Foundation for sponsoring this work.

--
Allan Jude