Date: Mon, 17 Apr 2023 06:45:01 -0700
From: Mark Millard <marklmi@yahoo.com>
To: José Pérez <fbl@aoek.com>, Pawel Jakub Dawidek <pjd@freebsd.org>, Current FreeBSD <freebsd-current@freebsd.org>
Subject: Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
Message-ID: <61987FE2-BE4E-45C5-A731-C7C6EED4D875@yahoo.com>
References: <61987FE2-BE4E-45C5-A731-C7C6EED4D875.ref@yahoo.com>

José Pérez <fbl_at_aoek.com> wrote on
Date: Mon, 17 Apr 2023 12:28:40 UTC :

> On 2023-04-17 12:43, Pawel Jakub Dawidek wrote:
> > On 4/17/23 18:15, Pawel Jakub Dawidek wrote:
> >> There were three issues that I know of after the recent OpenZFS merge:
> >>
> >> 1. Data corruption unrelated to block cloning, so it can happen even
> >> with block cloning disabled or not in use. This was the problematic
> >> commit:
> >>
> >> https://github.com/openzfs/zfs/commit/519851122b1703b8445ec17bc89b347cea965bb9
> >>
> >> It was reverted in 63ee747febbf024be0aace61161241b53245449e.
> >>
> >> 2. Data corruption with embedded blocks when block cloning is enabled.
> >> It can happen when compression is enabled and the block contains
> >> between 60 and 112 bytes (this might be hard to determine). Fix exists,
> >> it is merged to OpenZFS already, but isn't in FreeBSD yet.
> >> OpenZFS pull request: https://github.com/openzfs/zfs/pull/14739
> >>
> >> 3. Panic on VERIFY(zil_replaying(zfsvfs->z_log, tx)). This is
> >> triggered when block cloning is enabled, the sync property is set to
> >> disabled and copy_file_range(2) is used. Easy fix exists, it is not
> >> yet merged to OpenZFS and not yet in FreeBSD HEAD.
> >> OpenZFS pull request: https://github.com/openzfs/zfs/pull/14758
> >>
> >> Block cloning was disabled in
> >> 46ac8f2e7d9601311eb9b3cd2fed138ff4a11a66, so 2 and 3 should not occur.
> >
> > As of 068913e4ba3dd9b3067056e832cefc5ed264b5cc all known issues are
> > fixed, as far as I can tell.
> >
> > Block cloning remains disabled for now just to be on the safe side,
> > but can be enabled by setting sysctl vfs.zfs.bclone_enabled to 1.
> >
> > Don't rely on this sysctl as it will be removed in 2-3 weeks.
>
> Hi Pawel,
> thank you for your reply and for the fixes.
>
> I think there is a 4th issue that needs to be addressed: how do we
> recover from the worst case scenario which is a machine with a
> kernel > 2a58b312b62f and ZFS root upgraded with block cloning enabled.
>
> In particular, is it safe to turn such a machine on in the first place,
> and what are the risks involved in doing so? Any potential data loss?
>
> Would such a machine be able to fix itself by compiling a kernel, or
> would compilation fail and might data be corrupted in the process?
>
> I have two poudriere builders powered off (I am not alone in this
> situation) and I need to recover them, ideally minimizing data loss. The
> builders are also hosting current and used to build kernels and worlds
> for 13 and current: as of now all my production machines are stuck on
> the 13 they run, I cannot update binaries nor packages and I would like
> to be back online.
>
> Whatever the fixing procedure, it shall be outlined in the UPDATING
> document.

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270811 is an
example issue where a FreeBSD powerpc package building server
cannot boot, even after patching it so that it no longer gets a
boot-time "panic: floating-point unavailable trap" (that jhibbits
patch is still not committed):

QUOTE from the description:
. . .
nda1: 953869MB (1953525168 512 byte sectors)
GEOM_MIRROR: Device mirror/swap0 launched (2/2).
Mounting from zfs:zroot failed with error 6; retrying for 3 more seconds
Mounting from zfs:zroot failed with error 6.

Loader variables:
  vfs.root.mountfrom=zfs:zroot

Manual root filesystem specification:
  <fstype>:<device> [options]
      Mount <device> using filesystem <fstype>
      and with the specified (optional) option list.

    eg. ufs:/dev/da0s1a
        zfs:zroot/ROOT/default
        cd9660:/dev/cd0 ro
          (which is equivalent to: mount -t cd9660 -o ro /dev/cd0 /)

  ?               List valid disk boot devices
  .               Yield 1 second (for background tasks)
  <empty line>    Abort manual input

mountroot>

This machine is part of the FreeBSD cluster for building PowerPC
packages, so we can build kernels to test anytime necessary.
END QUOTE

===
Mark Millard
marklmi at yahoo.com