From: Allan Jude <allanjude@freebsd.org>
To: status-updates@freebsdfoundation.org, freebsd-fs, openzfs-developer
Subject: Re: ZSTD Project Weekly Status Update
Date: Tue, 15 Sep 2020 22:11:17 -0400
Message-ID: <761f6571-87ae-679c-a3e3-316dbb16200b@freebsd.org>

This is another weekly status report on my FreeBSD Foundation sponsored
project to complete the integration of ZSTD compression into OpenZFS.

The first batch of benchmarks is complete, although it took longer than
expected to get good data. I am still not entirely pleased with the
data, as in some cases I am running up against limitations of my
device-under-test rather than the performance limits of ZFS.

Here is what I have so far:
https://docs.google.com/spreadsheets/d/1TvCAIDzFsjuLuea7124q-1UtMd0C9amTgnXm2yPtiUQ/edit?usp=sharing

A number of these tests were initially done on both FreeBSD and Linux on
the same machine, and the results were consistent within a 2% margin of
error, so I've taken to doing most of the tests only on FreeBSD, because
it is easier. I've struggled to get a good ramdisk solution on Ubuntu
etc.

To walk you through the different tabs in the spreadsheet so far:

#1: fio SSD
This is a random write test to my pool made of 4 SSDs. This ran into the
performance limitations of the SSDs when testing the very fast
algorithms. Since the data generated by fio is completely
incompressible, there is no gain from the higher compression levels.

#2: fio to ramdisk
To overcome the limitations of the first test, I did it again with a
ramdisk.
Obviously this had to be a smaller dataset, since there is limited
memory available, but it does a much better job of showing how the
zstd-fast levels scale, and how they outperform LZ4, although you cannot
compare the compression, because the data is incompressible.

#3: zfs recv to SSD
For this test, I created a dataset by extracting the FreeBSD src.txz
file 8 times (each to a different directory), then created a snapshot of
that, and sent it to a file on a tmpfs. I then timed
zfs recv < /tmpfs/snapshot.zfs with each compression algorithm.
This allows you to compare the compression gain for the time trade-off,
but it again ran into the throughput limitations of the SSDs, so it
provides a bit less information about the performance of the higher
zstd-fast levels, although you can still see the compression tradeoff. I
need to reconfigure my setup to re-do this benchmark using a ramdisk.

#4: large image file 128k
For this, I created an approximately 20GB tar file by unxz'ing the
FreeBSD 12.1 src.txz and concatenating it 16 times. This provides the
best possible case for compression.
One of the major advantages of ZSTD is that the decompression throughput
stays roughly the same even as the compression level is increased. So
while writing a zstd-19 compressed block takes a lot longer than a
zstd-3 compressed block, both decompress at nearly the same speed.
This time I measured fio random read performance. Putting the
limitations of the SSDs to good use, this test shows the read
performance gains from reading compressed data. Even though the disks
top out around 1.5 GB/sec, zstd-compressed data can be read at an
effective rate of over 5 GB/sec.

#5: large image file 1m
This is the same test, but done with zfs recordsize=1m.
The larger record size unlocks higher compression ratios, and achieves
throughput in excess of 6 GB/sec.
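
The read figures in #4 and #5 can be sanity-checked with a little
arithmetic. What follows is a minimal Python sketch and not part of the
benchmark itself. It assumes a simplified model in which the disks
deliver compressed bytes at their raw rate, each compressed byte expands
by the compression ratio, and decompression speed is an optional cap.
The function names and the example ratio of 4.0 are illustrative
assumptions, and the model ignores ARC caching and per-block overhead.

```python
def effective_read_rate(disk_gb_per_s, compression_ratio,
                        decompress_gb_per_s=float("inf")):
    """Estimate logical (uncompressed) read throughput in GB/sec.

    Simplified model (an assumption, not measured behaviour): the disks
    deliver compressed data at disk_gb_per_s, every compressed byte
    expands to compression_ratio logical bytes, and the decompressor
    caps the result at decompress_gb_per_s.
    """
    if disk_gb_per_s <= 0 or compression_ratio <= 0:
        raise ValueError("disk rate and compression ratio must be positive")
    return min(disk_gb_per_s * compression_ratio, decompress_gb_per_s)


def ratio_needed(disk_gb_per_s, target_gb_per_s):
    """Compression ratio required to reach target_gb_per_s from the disks."""
    if disk_gb_per_s <= 0:
        raise ValueError("disk rate must be positive")
    return target_gb_per_s / disk_gb_per_s


if __name__ == "__main__":
    # 1.5 GB/sec is the approximate disk ceiling quoted above;
    # the ratio of 4.0 is a hypothetical value chosen for illustration.
    print(ratio_needed(1.5, 5.0))
    print(effective_read_rate(1.5, 4.0))
```

Under this model, reaching an effective 5 GB/sec from disks that top out
around 1.5 GB/sec needs a compression ratio of roughly 3.3 or better
(5 / 1.5), which is plausible for repeated copies of source code.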

#6: large image file 16k
This is again the same test, but with zfs recordsize=16k.
This is an approximation of reading from a large database with a 16k
page size. The smaller record size provides much less compression, and
the smaller blocks result in more overhead, but there are still large
performance gains to be had from the compression, although they are much
less drastic.

I would be interested in hearing what other tests people would like to
see before I finish wearing these SSDs out.

Thanks again to the FreeBSD Foundation for sponsoring this work.

On 2020-08-31 23:21, Allan Jude wrote:
> This is the eleventh weekly status report on my FreeBSD Foundation
> sponsored project to complete the integration of ZSTD compression into
> OpenZFS.
>
> As I continue to work on the future-proofing issue, I have also been
> lending a hand to the integration of OpenZFS into FreeBSD, and doing a
> bunch of reviewing and testing there.
>
> I have also been doing some benchmarking of the ZSTD feature.
>
> So far I have tried 4 different approaches with varying results.
>
> The test rig:
> A single socket Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (10 cores, 20
> threads)
> 32 GB ram
> ZFS ARC max = 4GB
> 4x Samsung 860 EVO SSDs
>
>
> 1) using fio. This gives slightly more useful output, both bandwidth and
> IOPS, but also has more detail about changes over time as well as
> latency etc.
>
> The downside here is that it doesn't really benchmark compression. By
> default fio uses fully random data that does not compress at all. This
> is a somewhat useful metric, and the differing results seen when varying
> blocksize are interesting.
>
> fio has an option, --buffer_compress_percentage=, to select how
> compressible you want the generated data to be. However, this just
> switches between random data, and a repeating pattern (by default null
> bytes).
> So different levels of zstd compression all give the same
> compression ratio (the level you ask fio to generate). This doesn't
> really provide the real-world use case of having a tradeoff where
> spending more time on compression results in a greater space savings.
>
> 2) I also used 'zfs recv' to create more repeatable writes. I generated
> a large dataset, 8 copies of the FreeBSD 12.1 source code, that rounds
> out to around 48 GB of uncompressed data, snapshotted it, and created a
> zfs send stream, stored on a tmpfs. Then I measured the time taken to
> zfs recv that stream, at different compression levels. I later also
> redid these experiments at different record sizes.
>
> The reason I chose to use 8 copies of the data was to make the runs long
> enough at the lower compression levels to get more consistent readings.
>
> The issue with this was also a limitation of my test setup, 4x striped
> SSDs, that tends to top out around 1.3 GB/sec of writes. So the
> difference between compression=off, lz4, and zstd-1 was minimal.
>
> 3) I then redid the zfs recv based testing, but with only 1 copy of the
> source code (1.3 GB) and with the pool backed by a file on a tmpfs,
> removing the SSDs from the equation. The raw write speed to the tmpfs
> was around 3.2GB/sec.
>
> 4) I also redid the fio based testing with a pool backed by a file on
> tmpfs.
>
>
> I am not really satisfied with the quality of the results so far.
>
> Does Linux have something equivalent to FreeBSD's mdconfig, where I can
> create an arbitrary number of arbitrarily sized memory-backed devices
> that I could use to back the pool? A file-based vdev on a tmpfs just
> doesn't seem to provide the type of results I was expecting.
>
> Any other suggestions would be welcome.
>
>
>
> In the end the results will all be relative, which is mostly what we are
> looking to capture.
> How much faster/slower is zstd at different levels
> compared to lz4 and gzip, and how much more compression do you get in
> exchange for that trade-off?
>
> Hopefully next week there will be some pretty graphs.
>
> Thanks again to the FreeBSD Foundation for sponsoring this work.
>
>

-- 
Allan Jude