Date: Mon, 31 Aug 2020 23:21:23 -0400
From: Allan Jude <allanjude@freebsd.org>
To: status-updates@freebsdfoundation.org, freebsd-fs <freebsd-fs@freebsd.org>, openzfs-developer <developer@open-zfs.org>
Subject: Re: ZSTD Project Weekly Status Update
Message-ID: <9f4ff5f0-9b6c-7299-98ee-988964a11ade@freebsd.org>
In-Reply-To: <ce721076-962a-ddf4-6886-0eafbbb418b1@freebsd.org>
References: <7b8842ad-d520-c575-22ee-2cd77244f2c6@freebsd.org>
 <708ec9f2-3c5c-6452-f6e6-bfb11a7f7eb2@freebsd.org>
 <bebcc0bb-7590-a04b-09ae-fa04e22d27dc@freebsd.org>
 <528ca743-7889-d1fd-ca95-a17cd430725b@freebsd.org>
 <9d77cb73-c8e8-cca0-b4b8-28e6790268d6@freebsd.org>
 <327f4b10-9727-331e-2dc9-641dad96dd2a@freebsd.org>
 <db71835b-9bb7-2722-fd02-194b97f1564e@freebsd.org>
 <e9597d9b-88e0-334f-d266-6cbbaf746855@freebsd.org>
 <738e1ca9-05b6-bc1f-468c-b5eee03643ab@freebsd.org>
 <ce721076-962a-ddf4-6886-0eafbbb418b1@freebsd.org>
This is the eleventh weekly status report on my FreeBSD Foundation
sponsored project to complete the integration of ZSTD compression into
OpenZFS.

As I continue to work on the future-proofing issue, I have also been
lending a hand to the integration of OpenZFS into FreeBSD, and doing a
bunch of reviewing and testing there.

I have also been doing some benchmarking of the ZSTD feature. So far I
have tried 4 different approaches with varying results.

The test rig:
A single socket Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (10 cores, 20 threads)
32 GB RAM
ZFS ARC max = 4GB
4x Samsung 860 EVO SSDs

1) Using fio. This gives slightly more useful output: both bandwidth
and IOPS, plus more detail about changes over time, latency, etc.
The downside here is that it doesn't really benchmark compression. By
default fio uses fully random data that does not compress at all. This
is a somewhat useful metric, and the differing results seen when varying
the block size are interesting. fio has an option,
--buffer_compress_percentage=, to select how compressible you want the
generated data to be. However, this just switches between random data
and a repeating pattern (by default null bytes). So different levels of
zstd compression all give the same compression ratio (the level you ask
fio to generate). This doesn't really provide the real-world use case of
having a tradeoff where spending more time on compression results in
greater space savings.

2) I also used 'zfs recv' to create more repeatable writes. I generated
a large dataset, 8 copies of the FreeBSD 12.1 source code, which rounds
out to around 48 GB of uncompressed data, snapshotted it, and created a
zfs send stream, stored on a tmpfs. Then I measured the time taken to
zfs recv that stream at different compression levels. I later also redid
these experiments at different record sizes.

The reason I chose to use 8 copies of the data was to make the runs long
enough at the lower compression levels to get more consistent readings.

The issue with this was also a limitation of my test setup: the 4x
striped SSDs tend to top out around 1.3 GB/sec of writes. So the
difference between compression=off, lz4, and zstd-1 was minimal.

3) I then redid the zfs recv based testing, but with only 1 copy of the
source code (1.3 GB) and with the pool backed by a file on a tmpfs,
removing the SSDs from the equation. The raw write speed to the tmpfs
was around 3.2 GB/sec.

4) I also redid the fio based testing with a pool backed by a file on
tmpfs.

I am not really satisfied with the quality of the results so far.
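
The fio limitation described under approach 1 can be illustrated
outside of ZFS. What follows is only a minimal Python sketch under
stated assumptions, not fio's actual buffer generator: make_buffer and
ratios are hypothetical helper names, the 4096-byte chunking is an
assumption, and zlib stands in for zstd purely because it ships with
Python's standard library. It builds a buffer that is part random data
and part null bytes, then compresses it at several levels so the
resulting ratios can be compared.

```python
import os
import zlib


def make_buffer(size: int, compress_percentage: int, chunk: int = 4096) -> bytes:
    """Return `size` bytes made of repeating chunks, each holding random
    bytes followed by null bytes. Roughly `compress_percentage` percent of
    every chunk is nulls. This only imitates the behaviour described in the
    text; it is not fio's real algorithm."""
    zero_len = chunk * compress_percentage // 100
    rand_len = chunk - zero_len
    out = bytearray()
    while len(out) < size:
        out += os.urandom(rand_len)  # incompressible part
        out += bytes(zero_len)       # trivially compressible part
    return bytes(out[:size])


def ratios(data: bytes, levels=(1, 6, 9)) -> dict:
    """Map each zlib level to the compression ratio achieved on `data`
    (uncompressed size divided by compressed size)."""
    return {lvl: len(data) / len(zlib.compress(data, lvl)) for lvl in levels}


if __name__ == "__main__":
    data = make_buffer(1 << 20, 50)
    for lvl, ratio in ratios(data).items():
        print(f"zlib level {lvl}: ratio {ratio:.2f}")
```

Because the random part cannot be compressed and the null part
compresses almost completely at any level, the ratio is pinned near the
requested percentage regardless of effort, which is why this kind of
data cannot show a speed versus space tradeoff.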

Does Linux have something equivalent to FreeBSD's mdconfig, where I can
create an arbitrary number of arbitrarily sized memory-backed devices
that I could use to back the pool? A file-based vdev on a tmpfs just
doesn't seem to provide the type of results I was expecting.

Any other suggestions would be welcome.

In the end the results will all be relative, which is mostly what we are
looking to capture: how much faster or slower is zstd at different
levels compared to lz4 and gzip, and how much more compression do you
get in exchange for that trade-off?

Hopefully next week there will be some pretty graphs.

Thanks again to the FreeBSD Foundation for sponsoring this work.

On 2020-08-25 22:22, Allan Jude wrote:
> This is the tenth weekly status report on my FreeBSD Foundation
> sponsored project to complete the integration of ZSTD compression into
> OpenZFS.
>
> Late last week the main pull request was merged, and ZSTD support is now
> part of OpenZFS's trunk branch.
>
> Last night, OpenZFS with ZSTD was imported into FreeBSD's -current branch.
>
> I am continuing to work on a number of things related to ZSTD, including
> future-proofing support (so upgrading ZSTD won't cause problems with
> features like nopwrite), and improving the integration of ZSTD into
> FreeBSD, including enabling support for booting from ZSTD compressed
> datasets, and improving the performance of ZSTD on FreeBSD.
>
> I'll also be adding some additional tests to make sure we detect any
> issues when we do look at updating ZSTD. Additionally, I am working on a
> bunch of documentation around using ZSTD in ZFS.
>
> For my benchmarking of ZSTD, I have been using a zfs recv of a stream in
> a file on a tmpfs, and recording how long it takes to receive and sync
> the data. The test data is a copy of the FreeBSD 12.1 source code, since
> that is easily reproducible.
>
> Does anyone have experience or a better suggestion on how to get the
> most consistent and repeatable results when benchmarking like this?
>
>
> On 2020-08-18 18:51, Allan Jude wrote:
>> This is the ninth weekly status report on my FreeBSD Foundation
>> sponsored project to complete the integration of ZSTD compression into
>> OpenZFS.
>>
>> https://github.com/openzfs/zfs/pull/10693 - The L2ARC fixes (for when
>> compressed ARC is disabled) have been merged.
>>
>> https://github.com/openzfs/zfs/pull/10278/ - A number of other cleanups
>> and fixes for ZSTD have been integrated and squashed, and it looks
>> like the completed ZSTD feature will be merged very soon.
>>
>> This included a bunch of fixes for makefiles and runfiles to hook the
>> tests I added up to the ZFS test suite so they are run properly.
>>
>> It looks like this will mean that the ZSTD feature will be included in
>> OpenZFS 2.0. Thanks to everyone who has tested, reviewed, or
>> contributed to this effort, especially those who kept it alive while I
>> was working on other things.
>>
>> Post-merge, the remaining work is to develop future-proofing around ZSTD
>> so that we will be able to more seamlessly upgrade to newer releases of
>> ZSTD. Recompression of the same input resulting in the same on-disk
>> checksum is the main concern, as without this upgrading the compression
>> algorithm will break features like nop-write.
>>
>> This project is sponsored by the FreeBSD Foundation.
>>
>
>

--
Allan Jude