Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 5 Aug 2020 22:49:44 -0400
From:      Allan Jude <allanjude@freebsd.org>
To:        status-updates@freebsdfoundation.org, freebsd-fs <freebsd-fs@freebsd.org>, openzfs-developer <developer@open-zfs.org>
Subject:   Re: ZSTD Project Weekly Status Update
Message-ID:  <db71835b-9bb7-2722-fd02-194b97f1564e@freebsd.org>
In-Reply-To: <327f4b10-9727-331e-2dc9-641dad96dd2a@freebsd.org>
References:  <7b8842ad-d520-c575-22ee-2cd77244f2c6@freebsd.org> <708ec9f2-3c5c-6452-f6e6-bfb11a7f7eb2@freebsd.org> <bebcc0bb-7590-a04b-09ae-fa04e22d27dc@freebsd.org> <528ca743-7889-d1fd-ca95-a17cd430725b@freebsd.org> <9d77cb73-c8e8-cca0-b4b8-28e6790268d6@freebsd.org> <327f4b10-9727-331e-2dc9-641dad96dd2a@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--R8vzuaKLhzyCjf7STmUzjL8AXKZS0tooK
Content-Type: multipart/mixed; boundary="UkN47Zpoz8FxP3sgMe7uV1jD4vlSCfYuM";
 protected-headers="v1"
From: Allan Jude <allanjude@freebsd.org>
To: status-updates@freebsdfoundation.org, freebsd-fs
 <freebsd-fs@freebsd.org>, openzfs-developer <developer@open-zfs.org>
Message-ID: <db71835b-9bb7-2722-fd02-194b97f1564e@freebsd.org>
Subject: Re: ZSTD Project Weekly Status Update
References: <7b8842ad-d520-c575-22ee-2cd77244f2c6@freebsd.org>
 <708ec9f2-3c5c-6452-f6e6-bfb11a7f7eb2@freebsd.org>
 <bebcc0bb-7590-a04b-09ae-fa04e22d27dc@freebsd.org>
 <528ca743-7889-d1fd-ca95-a17cd430725b@freebsd.org>
 <9d77cb73-c8e8-cca0-b4b8-28e6790268d6@freebsd.org>
 <327f4b10-9727-331e-2dc9-641dad96dd2a@freebsd.org>
In-Reply-To: <327f4b10-9727-331e-2dc9-641dad96dd2a@freebsd.org>

--UkN47Zpoz8FxP3sgMe7uV1jD4vlSCfYuM
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable

This is the seventh weekly status report on the project to integrate
ZSTD into OpenZFS.

The compatibility related changes I created last week were refined and
marged into the mainline branch.

Thanks to Brian Behlendorf for reviewing my proposed change for the zstd
feature flag activation, and pointing out a better approach. I have
reworked the patch based on his suggestion and prototype:

https://github.com/allanjude/zfs/commit/2508dafcec0a05d61afc5fbd5da356e20=
1afbe97
- Activate the per-dataset ZSTD feature flag as soon as the property is
set to ZSTD. Before, simply doing `zfs set compression=3Dzstd dataset`
would not activate the feature flag. The feature flag would be activated
when the first block that used ZSTD compression was written (see
dsl_dataset_block_born()). This meant that if you set the property,
exported the pool, the pool would import on systems with older versions
of ZFS that did not support ZSTD, but would crash their userspace tools,
because the property value was out of bounds.


https://github.com/allanjude/zfs/commit/b8bec3fd2a8feb3a4de572eb15515d376=
4f92a35
- I created a test that ensures that the feature flag is activated by
`zfs set compression=3Dzstd` and also ensures that the feature flag
reverts to the 'enabled' state once the last dataset using zstd is
destroyed.


The next step is ensuring that ZSTD compression inter-operates properly
with the L2ARC and Encryption etc.

I've also been discussing ideas with Brian about future-proofing, to
handle the case where a newer version of ZSTD might compression the same
input differently (better ratio), and how that would impact L2ARC,
nop-write, etc. One idea (originally from Pawel Dawidek) is to do
something similar to what encryption does, and split the checksum field.
Using half to checksum the original data, and half the compressed
version. This would allow ZFS to detect when the same content compressed
differently (combined with the ZSTD version header in the compressed
data), giving better compatibility as we upgrade ZSTD.


This project is sponsored by the FreeBSD Foundation.



On 2020-07-29 21:10, Allan Jude wrote:
> This is the sixth weekly status report on the project to integrate ZSTD=

> into OpenZFS.
>=20
> https://github.com/openzfs/zfs/pull/10631 - Improved the `zfs recv`
> error handling when it receives an out-of-bounds property value.
> Specifically, if a zfs send stream is created that supports a newer
> compression or checksum type, the property will fail to be set on the
> receiving system. This is fine, but `zfs recv` would abort() and create=

> a core file, rather than reporting the error, because it did not
> understand the EINVAL being returned for that property. In the case
> where the property is outside the accepted range, we now return the new=

> ZFS_ERR_BADPROP value, and the correct message is displayed to the user=
=2E
> I opted not to use ERANGE because that is used for 'this property value=

> should not be used on a root pool'. The idea is to get this fix merged
> into the 0.8.x branch for the next point release, to improve
> compatibility with streams generated by OpenZFS 2.0
>=20
>=20
> https://github.com/openzfs/zfs/pull/10632 - General improvement to erro=
r
> handling when the error code is EZFS_UNKNOWN.
>=20
>=20
> https://github.com/allanjude/zfs/commit/8f37c1ad8edaff20a550b3df07995da=
b80c06492
> - ZFS replication compatibility improvements. As discussed on the
> leadership call earlier this month, keep the compatibility simple. If
> the -c flag is given, send blocks compressed with any compression
> algorithm. The improved error handling will let the user know if their
> system can't handle ZSTD.
>=20
>=20
> https://github.com/allanjude/zfs/commit/0ffd80e281f79652973378599cd0332=
172f365bd
> - per-dataset feature activation. This switches the ZSTD feature flag
> from 'enabled' to 'active' as soon as the property is set, instead of
> when the first block is written. This ensures that the pool can't be
> imported on a system that does not support ZSTD that will cause the ZFS=

> cli tools to panic.
>=20
>=20
> I will be working on adding some tests for the feature activation.
>=20
> I've been looking at ways to add tests for the replication changes, but=

> it doesn't seem to be easy to test the results of a 'zfs recv' that doe=
s
> not know about ZSTD (where the values are outside of the valid range fo=
r
> the enum). If anyone has any ideas here, I'd be very interested.
>=20
>=20
> On 2020-07-20 23:40, Allan Jude wrote:
>> This is the fifth weekly status report on the project to integrate ZST=
D
>> into OpenZFS.
>>
>> https://github.com/c0d3z3r0/zfs/pull/14/commits/9807c99169e5931a754bb0=
df68267ffa2f289474
>> - Created a new test case to ensure that ZSTD compressed blocks surviv=
e
>> replication with the -c flag. We wanted to make sure the on-disk
>> compression header survived the trip.
>>
>> https://github.com/c0d3z3r0/zfs/pull/14/commits/94bef464fc304e9d6f5850=
391e41720c3955af11
>> - I split the zstd.c file into OS specific bits
>> (module/os/{linux,freebsd}/zstd_os.c) and also split the .h file into
>> zstd.h and zstd_impl.h. This was done so that FreeBSD can use the
>> existing kmem_cache mechanism, while Linux can use the vmem_alloc pool=

>> created in the earlier versions of this patch. I significantly changed=

>> the FreeBSD implementation from my earlier work, to reuse the power of=
 2
>> zio_data_buf_cache[]'s that already exist, only adding a few additiona=
l
>> kmem_caches for large blocks with high compression levels. This should=

>> avoid creating as many unnecessary kmem caches.
>>
>> https://github.com/c0d3z3r0/zfs/pull/14/commits/3d48243b77e6c8c3bf562c=
7a2315dd6cc571f28c
>> - Lastly, in my testing I was seeing a lot of hits on the new
>> compression failure kstat I added. This was caused by the ZFS "early
>> abort" feature, where we give the compressor an output buffer that is
>> smaller than the input, so it will fail if the block will not compress=

>> enough to be worth it. This helps avoid wasting CPU on uncompressible
>> blocks. However, it seems the 'one-file' version of zstd we are using
>> does not expose the ZSTD_ErrorCode enum. This needs to be investigated=

>> further to avoid issues if the value changes (although it is apparentl=
y
>> stable after version 1.3.1).
>>
>> I am still working on a solution for zfs send stream compatibility. I =
am
>> leaning towards creating a new flag, --zstd, to enable ZSTD compressed=

>> output. If the -e or -c flag are used without the --zstd flag, and the=

>> dataset has the zstd feature active, the idea would be to emit a warni=
ng
>> but send the blocks uncompressed, so that the stream remains compatibl=
e
>> with older versions of ZFS. I will be discussing this on the OpenZFS
>> Leadership call tomorrow, and am open to suggestions on how to best
>> handle this.
>>
>>
>> On 2020-07-14 22:26, Allan Jude wrote:
>>> In my continuing effort to complete the integration of ZSTD into
>>> OpenZFS, here is my fourth weekly status report:
>>>
>>> https://github.com/allanjude/zfs/commit/b0b1270d4e7835ecff41320830137=
5e3de2a4153
>>> - Create a new test case to make sure that the ZSTD header we write
>>> along with the data is correct. Verify that the physical size of the
>>> compressed data is less than the psize for the block pointer, and ver=
ify
>>> that the level matches. It uses a random level between 1 and 19 and t=
hen
>>> verifies with zdb that the block was compressed with that level.
>>>
>>> I am still working on a solution for setting the zstd feature flag to=

>>> 'active' as soon as it is set, rather than only once a block is born.=
 As
>>> well as fixing up compatibility around zfs send/recv with the embedde=
d
>>> block points flag.
>>>
>>> This project is sponsored by the FreeBSD Foundation.
>>>
>>>
>>
>=20
>=20


--=20
Allan Jude


--UkN47Zpoz8FxP3sgMe7uV1jD4vlSCfYuM--

--R8vzuaKLhzyCjf7STmUzjL8AXKZS0tooK
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (MingW32)

iQIcBAEBAgAGBQJfK2/NAAoJEBmVNT4SmAt+1NkQAIhyISCT+s2BErkniQp2h6vQ
x0ecSB4zIJt6dpK80GJ5lBYpgzOSFnSyvtfOItt4nXTbpa2nklyKiuNiyAP17P1W
zc/2A6M3hQBmCBBgZwEL8FA51RtEEZmosgMCF5XYI42WH0UFJZlmx1jogGZ9IOT8
AeK9UPGSwjOV+UZggJHqmnBkO4pAExnRZ4xBLM6154BT6vSBEG+eM7qO9L9uwCvZ
6NwSAroFclxA9TFjbVm1Tb+GLgipPsK64DI2yaPalML3C/uJ236LFWV7I4Ynvee6
GDQhR0Tf1XE1c7D+1SgXMQ0u3fPC703zJmJ3sUjZXhvFrhKsycaq3SzmGIJ9tchi
Qu36SXJ3W8BKOQb+a1VyYt8yuRZcpp1FLndDHZw+gmQiGBkQ4cQ9kzPYkHNA1uPv
XMOVVrF95jlYwJ8TOqd0S1uFReNvB3GwJHjm7aWYDbwhU2vReOWR5rYktS0moosG
BG5tDT20JyXq7LcutH979yivRM5SFltMw3VF66nPWPwDEeYWalCqFmt38YJz2iep
4Tx1g8+PDGTUt18kiPLy8gGvSyJdTG0s87rfDKSbiQEW9mnt4VM/MF2PKMInH3+Q
yyTlkXTrxv+19fdJvEybsRajrvO2gddG+Hg40FNPrwqBU0g24xxQFF/CIW4ijvNU
BNMSQjp9zGGXUeVqiBNr
=ryYQ
-----END PGP SIGNATURE-----

--R8vzuaKLhzyCjf7STmUzjL8AXKZS0tooK--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?db71835b-9bb7-2722-fd02-194b97f1564e>