Date:      Mon, 17 Apr 2023 14:28:40 +0200
From:      José Pérez <fbl@aoek.com>
To:        Pawel Jakub Dawidek <pjd@freebsd.org>
Cc:        freebsd-current@freebsd.org
Subject:   Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
Message-ID:  <a45ea1f22a59a88e65790b81ebce9c73@mail.yourbox.net>
In-Reply-To: <0164e42a-e7cd-a1e8-295c-21f414edf67b@dawidek.net>
References:  <20230413071032.18BFF31F@slippy.cwsent.com> <20230413063321.60344b1f@cschubert.com> <CAGudoHG3rCx93gyJTmzTBnSe4fQ9=m4mBESWbKVWtAGRxen_4w@mail.gmail.com> <20230413135635.6B62F354@slippy.cwsent.com> <c41f9ed6-e557-9255-5a46-1a22d4b32d66@dawidek.net> <319a267e-3f76-3647-954a-02178c260cea@dawidek.net> <b60807e9-f393-6e6d-3336-042652ddd03c@freebsd.org> <441db213-2abb-b37e-e5b3-481ed3e00f96@dawidek.net> <5ce72375-90db-6d30-9f3b-a741c320b1bf@freebsd.org> <99382FF7-765C-455F-A082-C47DB4D5E2C1@yahoo.com> <32cad878-726c-4562-0971-20d5049c28ad@freebsd.org> <ABC9F3DB-289E-455E-AF43-B3C13525CB2C@yahoo.com> <20230415115452.08911bb7@thor.intern.walstatt.dynvpn.de> <20230415143625.99388387@slippy.cwsent.com> <20230415175218.777d0a97@thor.intern.walstatt.dynvpn.de> <6792aded-6e2e-a118-259d-0df0f80c361c@smeets.xyz> <80ea8a67-9b64-c723-6d97-21cfa127ae43@dawidek.net> <b3d8b8f7a35312b1211b76b111c01242@mail.yourbox.net> <01430095-33a3-a949-3772-2ec90b4c3fe6@dawidek.net> <0164e42a-e7cd-a1e8-295c-21f414edf67b@dawidek.net>

On 2023-04-17 12:43, Pawel Jakub Dawidek wrote:
> On 4/17/23 18:15, Pawel Jakub Dawidek wrote:
>> There were three issues that I know of after the recent OpenZFS merge:
>> 
>> 1. Data corruption unrelated to block cloning, so it can happen even 
>> with block cloning disabled or not in use. This was the problematic 
>> commit:
>>  
>>     https://github.com/openzfs/zfs/commit/519851122b1703b8445ec17bc89b347cea965bb9
>> 
>> It was reverted in 63ee747febbf024be0aace61161241b53245449e.
>> 
>> 2. Data corruption with embedded blocks when block cloning is enabled. 
>> It can happen when compression is enabled and the block contains 
>> between 60 and 112 bytes (this might be hard to determine). Fix exists, 
>> it is merged to OpenZFS already, but isn't in FreeBSD yet.
>>      OpenZFS pull request: https://github.com/openzfs/zfs/pull/14739
>> 
>> 3. Panic on VERIFY(zil_replaying(zfsvfs->z_log, tx)). This is 
>> triggered when block cloning is enabled, the sync property is set to 
>> disabled and copy_file_range(2) is used. Easy fix exists, it is not 
>> yet merged to OpenZFS and not yet in FreeBSD HEAD.
>>      OpenZFS pull request: https://github.com/openzfs/zfs/pull/14758
>> 
>> Block cloning was disabled in 
>> 46ac8f2e7d9601311eb9b3cd2fed138ff4a11a66, so 2 and 3 should not occur.
> 
> As of 068913e4ba3dd9b3067056e832cefc5ed264b5cc all known issues are
> fixed, as far as I can tell.
> 
> Block cloning remains disabled for now just to be on the safe side,
> but can be enabled by setting sysctl vfs.zfs.bclone_enabled to 1.
> 
> Don't rely on this sysctl as it will be removed in 2-3 weeks.

Hi Pawel,
thank you for your reply and for the fixes.

I think there is a fourth issue that needs to be addressed: how do we
recover from the worst-case scenario, that is, a machine running a
kernel newer than 2a58b312b62f whose ZFS root pool was upgraded with
block cloning enabled?

In particular, is it safe to turn such a machine on in the first place, 
and what are the risks involved in doing so? Any potential data loss?

Would such a machine be able to fix itself by compiling a kernel, or 
would compilation fail and might data be corrupted in the process?
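
For anyone trying to tell whether a machine is in that worst-case
state, here is a minimal sketch in Python. It only reads two values:
the pool feature feature@block_cloning and the sysctl
vfs.zfs.bclone_enabled mentioned above. The pool name "zroot", the
function names and the category labels are my own assumptions, and my
reading of "active" as "the feature is in use on the pool" should be
checked against zpool-features(7). It does not detect corruption and
says nothing about whether booting the machine is safe.

```python
import subprocess


def classify(feature_state: str, bclone_sysctl: str) -> str:
    """Map the two raw values to a rough category (labels are hypothetical)."""
    state = feature_state.strip()
    if state == "active":
        # Assumption: "active" means the feature is in use on this pool.
        return "feature-in-use"
    if state == "enabled":
        # Pool was upgraded; the sysctl decides whether new clones may be made.
        if bclone_sysctl.strip() == "1":
            return "upgraded-cloning-allowed"
        return "upgraded-cloning-blocked"
    # "disabled", "-" or anything else: treat as not upgraded.
    return "not-upgraded"


def read_value(cmd: list[str]) -> str:
    """Run a command and return its standard output as text."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def report(pool: str = "zroot") -> str:
    """Query the pool feature and the sysctl, then classify the result."""
    feature = read_value(
        ["zpool", "get", "-H", "-o", "value", "feature@block_cloning", pool]
    )
    sysctl = read_value(["sysctl", "-n", "vfs.zfs.bclone_enabled"])
    return classify(feature, sysctl)
```

Calling print(report("zroot")) on the affected system, or on another
system where the pool has been imported, prints one of the four labels.
The sysctl will stop existing once it is removed, at which point
report() raises an error instead of returning a label.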

I have two poudriere builders powered off (I am not alone in this
situation) and I need to recover them, ideally with minimal data loss.
The builders also run current and are used to build kernels and worlds
for 13 and current. As of now all my production machines are stuck on
the 13 they run: I cannot update binaries or packages, and I would like
to be back online.

Whatever the recovery procedure turns out to be, it should be outlined
in the UPDATING document.

Thank you.

BR,

-- 
José Pérez


