Date:      Mon, 17 Apr 2023 06:45:01 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        José Pérez <fbl@aoek.com>, Pawel Jakub Dawidek <pjd@freebsd.org>, Current FreeBSD <freebsd-current@freebsd.org>
Subject:   Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
Message-ID:  <61987FE2-BE4E-45C5-A731-C7C6EED4D875@yahoo.com>
References:  <61987FE2-BE4E-45C5-A731-C7C6EED4D875.ref@yahoo.com>

José Pérez <fbl_at_aoek.com> wrote on
Date: Mon, 17 Apr 2023 12:28:40 UTC :

> On 2023-04-17 12:43, Pawel Jakub Dawidek wrote:
> > On 4/17/23 18:15, Pawel Jakub Dawidek wrote:
> >> There were three issues that I know of after the recent OpenZFS merge:
> >>
> >> 1. Data corruption unrelated to block cloning, so it can happen even
> >> with block cloning disabled or not in use. This was the problematic
> >> commit:
> >>
> >>     https://github.com/openzfs/zfs/commit/519851122b1703b8445ec17bc89b347cea965bb9
> >>
> >> It was reverted in 63ee747febbf024be0aace61161241b53245449e.
> >>
> >> 2. Data corruption with embedded blocks when block cloning is enabled.
> >> It can happen when compression is enabled and the block contains
> >> between 60 and 112 bytes (this might be hard to determine). Fix exists,
> >> it is merged to OpenZFS already, but isn't in FreeBSD yet.
> >>     OpenZFS pull request: https://github.com/openzfs/zfs/pull/14739
> >>
> >> 3. Panic on VERIFY(zil_replaying(zfsvfs->z_log, tx)). This is
> >> triggered when block cloning is enabled, the sync property is set to
> >> disabled and copy_file_range(2) is used. Easy fix exists, it is not
> >> yet merged to OpenZFS and not yet in FreeBSD HEAD.
> >>     OpenZFS pull request: https://github.com/openzfs/zfs/pull/14758
> >>
> >> Block cloning was disabled in
> >> 46ac8f2e7d9601311eb9b3cd2fed138ff4a11a66, so 2 and 3 should not occur.
> >
> > As of 068913e4ba3dd9b3067056e832cefc5ed264b5cc all known issues are
> > fixed, as far as I can tell.
> >
> > Block cloning remains disabled for now just to be on the safe side,
> > but can be enabled by setting sysctl vfs.zfs.bclone_enabled to 1.
> >
> > Don't rely on this sysctl as it will be removed in 2-3 weeks.
>
> Hi Pawel,
> thank you for your reply and for the fixes.
>
> I think there is a 4th issue that needs to be addressed: how do we
> recover from the worst case scenario which is a machine with a
> kernel > 2a58b312b62f and ZFS root upgraded with block cloning enabled.
>
> In particular, is it safe to turn such a machine on in the first place,
> and what are the risks involved in doing so? Any potential data loss?
>
> Would such a machine be able to fix itself by compiling a kernel, or
> would compilation fail and might data be corrupted in the process?
>
> I have two poudriere builders powered off (I am not alone in this
> situation) and I need to recover them, ideally minimizing data loss. The
> builders are also hosting current and used to build kernels and worlds
> for 13 and current: as of now all my production machines are stuck on
> the 13 they run, I cannot update binaries nor packages and I would like
> to be back online.
>
> Whatever the fixing procedure, it shall be outlined in the UPDATING
> document.
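
For reference, the trigger conditions quoted above for the three issues
can be restated as a small Python sketch. This is only an illustration of
the quoted text, not code from OpenZFS or FreeBSD: the class name Setup,
its field names, and the function possible_issues are hypothetical. Issue 2
additionally requires a block containing between 60 and 112 bytes, which
the quoted text says is hard to determine and which this sketch does not
try to check.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Setup:
    """Hypothetical description of a system; all field names are illustrative."""
    has_unreverted_519851122b17: bool  # kernel contains the problematic commit, not yet reverted
    block_cloning_enabled: bool
    compression_enabled: bool
    sync_disabled: bool                # the ZFS sync property is set to disabled
    uses_copy_file_range: bool         # some workload calls copy_file_range(2)


def possible_issues(s: Setup) -> List[int]:
    """Return the issue numbers (1-3, as numbered in the quoted mail)
    whose stated preconditions are met by the given setup."""
    issues = []
    # Issue 1: unrelated to block cloning; depends only on the problematic commit.
    if s.has_unreverted_519851122b17:
        issues.append(1)
    # Issue 2: embedded blocks, block cloning enabled, compression enabled.
    # (The 60-to-112-byte block size condition is not modelled here.)
    if s.block_cloning_enabled and s.compression_enabled:
        issues.append(2)
    # Issue 3: block cloning enabled, sync disabled, copy_file_range(2) in use.
    if s.block_cloning_enabled and s.sync_disabled and s.uses_copy_file_range:
        issues.append(3)
    return issues
```

With block cloning disabled, the sketch never reports issues 2 or 3,
matching the statement that they "should not occur" in that case. It says
nothing about the recovery question raised above.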

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270811 is an example
issue where a FreeBSD powerpc package-building server cannot boot, even
after patching so that it no longer gets a boot-time "panic: floating-point
unavailable trap" (that jhibbits patch is still not committed):

QUOTE from the description:
. . .
nda1: 953869MB (1953525168 512 byte sectors)
GEOM_MIRROR: Device mirror/swap0 launched (2/2).
Mounting from zfs:zroot failed with error 6; retrying for 3 more seconds
Mounting from zfs:zroot failed with error 6.

Loader variables:
vfs.root.mountfrom=zfs:zroot

Manual root filesystem specification:
<fstype>:<device> [options]
Mount <device> using filesystem <fstype>
and with the specified (optional) option list.

eg. ufs:/dev/da0s1a
zfs:zroot/ROOT/default
cd9660:/dev/cd0 ro
(which is equivalent to: mount -t cd9660 -o ro /dev/cd0 /)

? List valid disk boot devices
. Yield 1 second (for background tasks)
<empty line> Abort manual input

mountroot>

This machine is part of the FreeBSD cluster for building PowerPC packages,
so we can build kernels to test anytime necessary.
END  QUOTE

===
Mark Millard
marklmi at yahoo.com



