Date:      Fri, 10 Nov 2023 12:05:15 -0800
From:      Cy Schubert <Cy.Schubert@cschubert.com>
To:        Xin LI <delphij@gmail.com>
Cc:        Martin Matuska <mm@freebsd.org>, d@delphij.net, FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>, pjd@freebsd.org
Subject:   Re: Is 14.0 to released based on 0 for sysctl vfs.zfs.bclone_enabled ?
Message-ID:  <20231110120515.71f61f69@slippy>
In-Reply-To: <CAGMYy3tqwa_JQ01BKLVfSKt9N%2BWyK=M13j_i3kT=LJ_-MQyrQQ@mail.gmail.com>
References:  <2F81D978-7DBD-42CE-8ECF-C020B0CB5C29.ref@yahoo.com> <2F81D978-7DBD-42CE-8ECF-C020B0CB5C29@yahoo.com> <7a906956-6836-421e-b25e-ff701369e3ed@FreeBSD.org> <BBFDD30F-FB5D-44C8-ADA7-5B5AF859D86A@karels.net> <830CD3A8-DB62-418D-A7F7-8DA6CB46B1F5@yahoo.com> <05b493bc-94a5-4c78-bebf-5581addc5b7b@FreeBSD.org> <47c5b902-eea6-4194-b84a-99a6343f6bd0@delphij.net> <ba2e7bdc-68ba-4093-816a-2f0ea5bb6a07@FreeBSD.org> <CAGMYy3tqwa_JQ01BKLVfSKt9N%2BWyK=M13j_i3kT=LJ_-MQyrQQ@mail.gmail.com>

Just a thought. When I was playing around with block_cloning last
summer, I made sure to create a zpool checkpoint on the pool first. When
it went horribly sideways, rewinding to the checkpoint brought my pool
back into a sane state. Unfortunately, whatever had been written to the
pool after the checkpoint was lost. That didn't matter, as it was a play
pool.
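
For anyone who wants to try the same safety net, here is a rough sketch of
the command sequence, wrapped in Python so it can be dry-run first. The
pool name "play" and the helper names are made up for illustration; the
zpool subcommands are the standard checkpoint, export and import ones.
Treat it as an outline under those assumptions, not a tested procedure.

```python
import subprocess


def checkpoint_cmds(pool):
    """Commands to take a checkpoint before experimenting."""
    return [["zpool", "checkpoint", pool]]


def rewind_cmds(pool):
    """Commands to roll the whole pool back to its checkpoint.

    Everything written to the pool after the checkpoint is lost.
    """
    return [
        ["zpool", "export", pool],
        ["zpool", "import", "--rewind-to-checkpoint", pool],
    ]


def discard_cmds(pool):
    """Commands to drop the checkpoint once the experiment went fine."""
    return [["zpool", "checkpoint", "-d", pool]]


def run(cmds, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for argv in cmds:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # "play" is a hypothetical scratch pool name.
    run(checkpoint_cmds("play"))
    run(rewind_cmds("play"))
```

Keeping dry_run=True by default means nothing touches a pool until the
printed commands have been looked over.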

Something to think about.
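
On the question in the quoted thread about spotting block cloning use: as
far as I know the bclone* counters are pool-wide properties, so the sketch
below only answers "has this pool cloned anything", not which dataset. It
reads the sysctl and parses "zpool get -H -p" output (tab-separated name,
property, value, source). The function names are hypothetical, and the
sample values in the usage note are invented for illustration.

```python
import subprocess

BCLONE_PROPS = ("bcloneused", "bclonesaved", "bcloneratio")


def parse_zpool_get(text):
    """Map property name -> value from `zpool get -H` output."""
    props = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 3:
            props[fields[1]] = fields[2]
    return props


def bclone_in_use(props):
    """True if the pool reports a non-zero bcloneused byte count."""
    value = props.get("bcloneused", "0")
    return value.isdigit() and int(value) > 0


def query_pool(pool):
    """Run zpool get for the bclone* properties of one pool."""
    argv = ["zpool", "get", "-H", "-p", ",".join(BCLONE_PROPS), pool]
    out = subprocess.run(argv, check=True, capture_output=True, text=True)
    return parse_zpool_get(out.stdout)


def bclone_sysctl():
    """Return the current vfs.zfs.bclone_enabled value as an int."""
    argv = ["sysctl", "-n", "vfs.zfs.bclone_enabled"]
    out = subprocess.run(argv, check=True, capture_output=True, text=True)
    return int(out.stdout.strip())
```

For example, parse_zpool_get("saturn\tbcloneused\t1237319\t-\n") yields a
dict whose "bcloneused" entry is "1237319", and bclone_in_use() on that
dict is true.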

-- 
Cheers,
Cy Schubert <Cy.Schubert@cschubert.com>
FreeBSD UNIX:  <cy@FreeBSD.org>   Web:  https://FreeBSD.org
NTP:           <cy@nwtime.org>    Web:  https://nwtime.org

			e^(i*pi)+1=0


On Fri, 10 Nov 2023 08:58:57 -0800
Xin LI <delphij@gmail.com> wrote:

> On Fri, Nov 10, 2023 at 7:50 AM Martin Matuska <mm@freebsd.org> wrote:
>
> > Hi Xin,
> >
> > since when have you been using block cloning on the system? Is it
> > possible that there is already corrupted block-cloned data from the
>
>
> That's a good question; I can't 100% rule out this possibility.  I was
> following -CURRENT at ~weekly to ~monthly intervals on that system, and
> the pool was created in March 2014.
>
> Do you think I should try rebuilding the pool from scratch?  I do have a
> remote backup on a different server but was avoiding it because it's time
> consuming.
>
>
> > past? Is everything on one dataset or are you using multiple datasets
> > for /usr/src and /usr/obj?
> >
>
> /usr/src and /usr/obj are separate datasets, and the system runs Poudriere
> so it has multiple copies of slightly different /usr/src and /usr/obj's.
>
> Is there a way to identify datasets with block cloning, by the way?  Maybe
> I should try recreating these datasets first?
>
>
>
> >
> > Best regards,
> > mm
> >
> > On 10. 11. 2023 8:04, Xin Li wrote:
> > > On 2023-11-05 16:34, Martin Matuska wrote:
> > >> OpenZFS 2.2.0 in FreeBSD 14 fully supports block cloning. You can
> > >> work with pools that have feature@block_cloning enabled.
> > >> The sysctl variable vfs.zfs.bclone_enabled affects the behavior of
> > >> zfs_clone_range() which is called by copy_file_range(). When it is
> > >> set to 0, zfs_clone_range() does not do block cloning.
> > >> If it is set to anything else than 0, zfs_clone_range() does block
> > >> cloning (if all conditions are met - same ZFS pool, correct data
> > >> alignment, etc.).
> > >>
> > >> In FreeBSD-main, this tunable is enabled and I plan to enable it in
> > >> stable/14 somewhere around December 11, 2023.
> > >>
> > >> As of today I personally use block cloning on all my systems.
> > >
> > > > I'd like to share a different data point.  It still panics on my
> > > > storage (running -CURRENT from about a week ago) when enabled, and
> > > > the panic can be triggered by "make buildworld buildkernel".  I
> > > > wasn't able to capture a coredump until the most recent panic,
> > > > which reported:
> > >
> > >
> > > > cpuid = 2
> > > > time = 1699593456
> > > KDB: stack backtrace:
> > > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> > > 0xfffffe022f2bd7e0
> > > vpanic() at vpanic+0x132/frame 0xfffffe022f2bd910
> > > spl_panic() at spl_panic+0x3a/frame 0xfffffe022f2bd970
> > > dmu_brt_clone() at dmu_brt_clone+0x555/frame 0xfffffe022f2bd9e0
> > > zfs_clone_range() at zfs_clone_range+0xa4c/frame 0xfffffe022f2bdbb0
> > > zfs_freebsd_copy_file_range() at
> > > zfs_freebsd_copy_file_range+0x18a/frame 0xfffffe022f2bdc30
> > > > vn_copy_file_range() at vn_copy_file_range+0x163/frame 0xfffffe022f2bdce0
> > > kern_copy_file_range() at kern_copy_file_range+0x380/frame
> > > 0xfffffe022f2bddb0
> > > sys_copy_file_range() at sys_copy_file_range+0x78/frame
> > > 0xfffffe022f2bde00
> > > amd64_syscall() at amd64_syscall+0x153/frame 0xfffffe022f2bdf30
> > > fast_syscall_common() at fast_syscall_common+0xf8/frame
> > > 0xfffffe022f2bdf30
> > > > --- syscall (569, FreeBSD ELF64, copy_file_range), rip =
> > > > 0x7fbb2da4ada, rsp = 0x7fbb02c5d48, rbp = 0x7fbb02c61e0 ---
> > > Uptime: 2h32m27s
> > > Dumping 7800 out of 32696
> > > MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
> > >
> > > #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
> > > > #1  doadump (textdump=textdump@entry=1) at
> > > > /usr/src/sys/kern/kern_shutdown.c:405
> > > > #2  0xffffffff80694480 in kern_reboot (howto=260) at
> > > > /usr/src/sys/kern/kern_shutdown.c:526
> > > > #3  0xffffffff8069497f in vpanic (fmt=0xffffffff82603415 "VERIFY3(nbps
> > > > == numbufs) failed (%llu == %llu)\n", ap=ap@entry=0xfffffe022f2bd950)
> > > > at /usr/src/sys/kern/kern_shutdown.c:970
> > > > #4  0xffffffff8232999a in spl_panic (file=<optimized out>,
> > > > func=<optimized out>, line=<unavailable>, fmt=<unavailable>) at
> > > > /usr/src/sys/contrib/openzfs/module/os/freebsd/spl/spl_misc.c:103
> > > > #5  0xffffffff823a6605 in dmu_brt_clone
> > > > (os=os@entry=0xfffff800c5ce4000, object=<optimized out>,
> > > > offset=offset@entry=0, length=length@entry=207477,
> > > > tx=tx@entry=0xfffff8071a108d00, bps=bps@entry=0xfffffe01e218c000,
> > > > nbps=2, replay=0)
> > > >     at /usr/src/sys/contrib/openzfs/module/zfs/dmu.c:2303
> > > > #6  0xffffffff8250f67c in zfs_clone_range (inzp=0xfffff804416ac000,
> > > > inoffp=0xfffff800b81cb048, outzp=0xfffff806f58f03a0,
> > > > outoffp=0xfffff800b8063048, lenp=lenp@entry=0xfffffe022f2bdbf0,
> > > > cr=0xfffff8000a6fe600)
> > > >     at /usr/src/sys/contrib/openzfs/module/zfs/zfs_vnops.c:1326
> > > > #7  0xffffffff8234b3ba in zfs_freebsd_copy_file_range
> > > > (ap=0xfffffe022f2bdc48) at
> > > > /usr/src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:6294
> > > > #8  0xffffffff8079f443 in VOP_COPY_FILE_RANGE
> > > > (invp=0xfffff804416cb1c0, inoffp=0xfffff800b81cb048,
> > > > outvp=0xfffff806f51d3380, outoffp=0xfffff800b8063048,
> > > > lenp=0xfffffe022f2bdd48, incred=0xfffff8000a6fe600, flags=<optimized
> > > > out>,
> > > >     outcred=<optimized out>, fsizetd=<optimized out>) at
> > > > ./vnode_if.h:2385
> > > > #9  vn_copy_file_range (invp=invp@entry=0xfffff804416cb1c0,
> > > > inoffp=inoffp@entry=0xfffff800b81cb048,
> > > > outvp=outvp@entry=0xfffff806f51d3380,
> > > > outoffp=outoffp@entry=0xfffff800b8063048,
> > > > lenp=lenp@entry=0xfffffe022f2bdd48, flags=flags@entry=0,
> > > >     incred=0xfffff8000a6fe600, outcred=0xfffff8000a6fe600,
> > > > fsize_td=0xfffffe022925b3a0) at /usr/src/sys/kern/vfs_vnops.c:3087
> > > > #10 0xffffffff8079a070 in kern_copy_file_range
> > > > (td=td@entry=0xfffffe022925b3a0, infd=<optimized out>,
> > > > inoffp=0xfffff800b81cb048, inoffp@entry=0x0, outfd=<optimized out>,
> > > > outoffp=0xfffff800b8063048, outoffp@entry=0x0, len=9223372036854775807,
> > > >     flags=0) at /usr/src/sys/kern/vfs_syscalls.c:4973
> > > > #11 0xffffffff8079a178 in sys_copy_file_range (td=0xfffffe022925b3a0,
> > > > uap=0xfffffe022925b7a0) at /usr/src/sys/kern/vfs_syscalls.c:5011
> > > > #12 0xffffffff80a97aa3 in syscallenter (td=0xfffffe022925b3a0) at
> > > > /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:188
> > > > #13 amd64_syscall (td=0xfffffe022925b3a0, traced=0) at
> > > /usr/src/sys/amd64/amd64/trap.c:1194
> > > #14 <signal handler called>
> > > #15 0x000007fbb2da4ada in ?? ()
> > >
> > >
> > > and disabling bclone does appear to allow me to finish buildworld /
> > > buildkernel.
> > >
> > > The pool didn't have redaction_list_spill enabled.
> > >
> > > > The ASSERT3U(nbps, ==, numbufs); in dmu_brt_clone was added when block
> > > > clone was first implemented.
> > >
> > > It seems that I am the only person who is seeing this as of today.  It
> > > seems that block clone was indeed being used for some data:
> > >
> > > saturn  bcloneused 1.18M                          -
> > > saturn  bclonesaved 1.21M                          -
> > > saturn  bcloneratio 2.02x                          -
> > >
> > > > The pool has dedup enabled for some datasets.
> > >
> > > Any suggestions?  (In extreme cases I can recreate the storage pool
> > > from backup or copy the data somewhere else, then recreate the pool,
> > > then copy data back, but I'd like to avoid that if possible)
> > >
> > > Cheers,
> > >
> > >
> > >>
> > >> mm
> > >>
> > >> On 04/11/2023 13:35, Mark Millard wrote:
> > >>> On Nov 4, 2023, at 04:38, Mike Karels <mike@karels.net> wrote:
> > >>>
> > >>>> On 4 Nov 2023, at 4:01, Ronald Klop wrote:
> > >>>>
> > >>>>> On 11/4/23 02:39, Mark Millard wrote:
> > >>>>>> It looks to me like releng/14.0 (as of 14.0-RC4) still has:
> > >>>>>>
> > >>>>>> int zfs_bclone_enabled;
> > >>>>>> SYSCTL_INT(_vfs_zfs, OID_AUTO, bclone_enabled, CTLFLAG_RWTUN,
> > >>>>>> &zfs_bclone_enabled, 0, "Enable block cloning");
> > >>>>>>
> > >>>>>> leaving block cloning effectively disabled by default, no
> > >>>>>> matter what the pool has enabled.
> > >>>>>>
> > >>>>>> https://www.freebsd.org/releases/14.0R/relnotes/ also reports:
> > >>>>>>
> > >>>>>> QUOTE
> > >>>>>> OpenZFS has been upgraded to version 2.2. New features include:
> > > >>>>>> • block cloning, which allows shallow copies of blocks in file
> > > >>>>>> copies. This is optional, and disabled by default; it can be
> > > >>>>>> enabled with sysctl vfs.zfs.bclone_enabled=1.
> > >>>>>> END QUOTE
> > >>>>>>
> > >>>>>
> > >>>>> I think this answers your question in the subject.
> > >>>> I think so too (and I wrote that text).
> > >>> Thanks for the confirmation of the final intent.
> > >>>
> > >>> I believe this makes:
> > >>>
> > >>> QUOTE
> > >>> author Brian Behlendorf <behlendorf1@llnl.gov> 2023-05-25 20:53:08
> > >>> +0000
> > >>> committer GitHub <noreply@github.com> 2023-05-25 20:53:08 +0000
> > >>> commit 91a2325c4a0fbe01d0bf212e44fa9d85017837ce (patch)
> > >>> tree dd01dfce6aeef357ade1775acf18aade535c6271
> > >>> . . .
> > >>> Update compatibility.d files
> > >>>
> > >>> Add an openzfs-2.2 compatibility file for the next release. Edon-R
> > >>> support has been enabled for FreeBSD removing the need for different
> > >>> FreeBSD and Linux files. Symlinks for the -linux and -freebsd names
> > >>> are created for any scripts expecting that convention. Additionally,
> > >>> a symlink for ubunutu-22.04 was added. Signed-off-by: Brian
> > >>> Behlendorf <behlendorf1@llnl.gov> Closes #14833
> > >>> END QUOTE
> > >>>
> > >>> technically incorrect in that compatibility.d/openzfs-2.2-freebsd
> > >>> should be distinct in content from compatibility.d/openzfs-2.2 so
> > >>> that block cloning would not be enabled.
> > >>>
> > >>>
> > > >>>>>> Just curiosity on my part about the default completeness of
> > >>>>>> openzfs-2.2 support, not an objection either way.
> > >>>>>>
> > >>>>>
> > >>>>> I haven't seen new issues with block cloning in the last few weeks
> > >>>>> mentioned on the mailing lists. All known issues are fixed AFAIK.
> > >>>>> But I can imagine that the risk+effect ratio of data corruption is
> > >>>>> seen as a bit too high for a 14.0 release for this particular
> > >>>>> feature. That does not diminish the rest of the completeness of
> > >>>>> openzfs-2.2.
> > >>>>>
> > >>>>> NB: I'm not involved in developing openzfs or the decision making
> > >>>>> in the release. Just repeating what I read on the lists.
> > > >>>> There was another block cloning fix in 14.0-RC4; see the commit log.
> > > >>>> Maybe there will be no more issues, but it seems that corner cases
> > > >>>> were still being found recently.
> > >>> Looks like I'll stay at openzfs-2.1 pool features until there is
> > >>> a release that no longer has the default status:
> > >>>
> > >>> 0 for sysctl vfs.zfs.bclone_enabled
> > >>>
> > >>> I use main [so: 15 now] but only enable openzfs-2.* pool features
> > >>> supported by default on some FreeBSD release, that has an accurate
> > >>> compatibility.d/openzfs-2.*-freebsd file.
> > >>>
> > > >>> ===
> > >>> Mark Millard
> > >>> marklmi at yahoo.com
> > >>>
> > >>>
> > >>
> > >
> >
> >



