Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 3 Apr 2023 22:22:57 -0700
From:      Rick Macklem <rick.macklem@gmail.com>
To:        Cy Schubert <Cy.Schubert@cschubert.com>
Cc:        Martin Matuska <mm@freebsd.org>, src-committers@freebsd.org,  dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org
Subject:   Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
Message-ID:  <CAM5tNy6qgn1XGw4qEO=R=FD4kzR_3t=C27LmU4YY59p8jAZqew@mail.gmail.com>
In-Reply-To: <20230404050740.6C25E2AA@slippy.cwsent.com>
References:  <202304031513.333FD6qw014903@gitrepo.freebsd.org> <20230404040010.5D073224@slippy.cwsent.com> <CAM5tNy4p5UbAkujksyTR=J0tGuDf=Y6v2LuctK4P_p-H3PZpPg@mail.gmail.com> <20230404050740.6C25E2AA@slippy.cwsent.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, Apr 3, 2023 at 10:07=E2=80=AFPM Cy Schubert <Cy.Schubert@cschubert.=
com> wrote:
>
> In message <CAM5tNy4p5UbAkujksyTR=3DJ0tGuDf=3DY6v2LuctK4P_p-H3PZpPg@mail.=
gmail.c
> om>
> , Rick Macklem writes:
> > On Mon, Apr 3, 2023 at 9:00=3DE2=3D80=3DAFPM Cy Schubert <Cy.Schubert@c=
schubert.c=3D
> > om> wrote:
> > >
> > > In message <202304031513.333FD6qw014903@gitrepo.freebsd.org>, Martin
> > > Matuska wr
> > > ites:
> > > > The branch main has been updated by mm:
> > > >
> > > > URL: https://cgit.FreeBSD.org/src/commit/?id=3D3D2a58b312b62f908ec9=
2311d1=3D
> > bd8536db
> > > > aeb8e55b
> > > >
> > > > commit 2a58b312b62f908ec92311d1bd8536dbaeb8e55b
> > > > Merge: b98fbf3781df 431083f75bdd
> > > > Author:     Martin Matuska <mm@FreeBSD.org>
> > > > AuthorDate: 2023-04-03 14:49:30 +0000
> > > > Commit:     Martin Matuska <mm@FreeBSD.org>
> > > > CommitDate: 2023-04-03 14:49:30 +0000
> > > >
> > > >     zfs: merge openzfs/zfs@431083f75
> > > >
> > > >     Notable upstream pull request merges:
> > > >       #12194 Fix short-lived txg caused by autotrim
> > > >       #13368 ZFS_IOC_COUNT_FILLED does unnecessary txg_wait_synced(=
)
> > > >       #13392 Implementation of block cloning for ZFS
> > > >       #13741 SHA2 reworking and API for iterating over multiple imp=
leme=3D
> > ntatio
> > > > ns
> > > >       #14282 Sync thread should avoid holding the spa config write =
lock
> > > >              when possible
> > > >       #14283 txg_sync should handle write errors in ZIL
> > > >       #14359 More adaptive ARC eviction
> > > >       #14469 Fix NULL pointer dereference in zio_ready()
> > > >       #14479 zfs redact fails when dnodesize=3D3Dauto
> > > >       #14496 improve error message of zfs redact
> > > >       #14500 Skip memory allocation when compressing holes
> > > >       #14501 FreeBSD: don't verify recycled vnode for zfs control d=
irec=3D
> > tory
> > > >       #14502 partially revert PR 14304 (eee9362a7)
> > > >       #14509 Fix per-jail zfs.mount_snapshot setting
> > > >       #14514 Fix data race between zil_commit() and zil_suspend()
> > > >       #14516 System-wide speculative prefetch limit
> > > >       #14517 Use rw_tryupgrade() in dmu_bonus_hold_by_dnode()
> > > >       #14519 Do not hold spa_config in ZIL while blocked on IO
> > > >       #14523 Move dmu_buf_rele() after dsl_dataset_sync_done()
> > > >       #14524 Ignore too large stack in case of dsl_deadlist_merge
> > > >       #14526 Use .section .rodata instead of .rodata on FreeBSD
> > > >       #14528 ICP: AES-GCM: Refactor gcm_clear_ctx()
> > > >       #14529 ICP: AES-GCM: Unify gcm_init_ctx() and gmac_init_ctx()
> > > >       #14532 Handle unexpected errors in zil_lwb_commit() without A=
SSER=3D
> > T()
> > > >       #14544 icp: Prevent compilers from optimizing away memset()
> > > >              in gcm_clear_ctx()
> > > >       #14546 Revert zfeature_active() to static
> > > >       #14556 Remove bad kmem_free() oversight from previous zfsdev_=
stat=3D
> > e_list
> > > >              patch
> > > >       #14563 Optimize the is_l2cacheable functions
> > > >       #14565 FreeBSD: zfs_znode_alloc: lock the vnode earlier
> > > >       #14566 FreeBSD: fix false assert in cache_vop_rmdir when repl=
ayin=3D
> > g ZIL
> > > >       #14567 spl: Add cmn_err_once() to log a message only on the f=
irst=3D
> >  call
> > > >       #14568 Fix incremental receive silently failing for recursive=
 sen=3D
> > ds
> > > >       #14569 Restore ASMABI and other Unify work
> > > >       #14576 Fix detection of IBM Power8 machines (ISA 2.07)
> > > >       #14577 Better handling for future crypto parameters
> > > >       #14600 zcommon: Refactor FPU state handling in fletcher4
> > > >       #14603 Fix prefetching of indirect blocks while destroying
> > > >       #14633 Fixes in persistent error log
> > > >       #14639 FreeBSD: Remove extra arc_reduce_target_size() call
> > > >       #14641 Additional limits on hole reporting
> > > >       #14649 Drop lying to the compiler in the fletcher4 code
> > > >       #14652 panic loop when removing slog device
> > > >       #14653 Update vdev state for spare vdev
> > > >       #14655 Fix cloning into already dirty dbufs
> > > >       #14678 Revert "Do not hold spa_config in ZIL while blocked on=
 IO"
> > > >
> > > >     Obtained from:  OpenZFS
> > > >     OpenZFS commit: 431083f75bdd3efaee992bdd672625ec7240d252
> > >
> > > Another problem related to copy_file_range() is the following example=
.
> > >
> > > slippy$ df -h build/make/Makefile .
> > > Filesystem        Size    Used   Avail Capacity  Mounted on
> > > t/wrkdir/amd64     52G     53M     52G     0%    /export/wrkdir/amd64
> > > t/wrkdir/amd64     52G     53M     52G     0%    /export/wrkdir/amd64
> > > slippy$ cp build/make/Makefile .
> > > cp: build/make/Makefile: Cross-device link
> > > slippy$
> > >
> > > And,
> > >
> > > slippy$ cp y4menc.h foobar
> > > cp: y4menc.h: Cross-device link
> > > slippy$
> > >
> > > But the following works because /tmp is mfs and /var/tmp is ufs.
> > >
> > > slippy$ cp y4menc.h /tmp
> > > slippy$
> > > slippy$ cp y4menc.h /var/tmp
> > > slippy$
> > >
> > > Copying files from one zpool to a dataset in another zpool also works=
 as
> > > does copying files from different datasets on the same zpool. Only co=
pyin=3D
> > g
> > > files from/to the same dataset results in cross device link.
> > >
> > zfs_copy_file_range() will only be called if the source and destination
> > both exist on the same mount point. (I'm guessing that's what you mean
> > by "same dataset".)
>
> Correct.
>
> The long term solution to the problem (for users) would be to enable
> block_cloning. One can test this by testing, checkpoint the pool, enable
> block_cloning, problem goes away, export and import the pool with
> --rewind-to-checkpoint to look at the regression again.
>
> As to why we see a Cross-device link,
>
> dtrace: script './dtrace.d' matched 1 probe
> CPU     ID                    FUNCTION:NAME
>   3  26104        vn_copy_file_range:return int64_t 0x12
>
> 0x12 =3D=3D EXDEV.
>
> As to why, zfs_clone_range() is called unconditionally at
> zfs/zfs_vnops_os.c:6292. If the feature is not enabled zfs_clone_range()
> sets the return code to EXDEV at line 1083 of zfs_vnops.c. Therefore
> block_cloning MUST be enabled.
>
> To reiterate, the long term solution would be to enable block_cloning. Fo=
r
> those who either wish or need to defer enabling of block_cloning, because
> the zpool is occasionally imported by older versions of freebsd requiring
> write acess, or other reason.
>
If there is a way to test to see if block cloning is enabled at the beginni=
ng
of zfs_copy_file_range(), it should just call vn_generic_copy_file_range()
if it is not enabled (or something like that).

rick

>
> --
> Cheers,
> Cy Schubert <Cy.Schubert@cschubert.com>
> FreeBSD UNIX:  <cy@FreeBSD.org>   Web:  https://FreeBSD.org
> NTP:           <cy@nwtime.org>    Web:  https://nwtime.org
>
>                         e^(i*pi)+1=3D0
>
>
>



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAM5tNy6qgn1XGw4qEO=R=FD4kzR_3t=C27LmU4YY59p8jAZqew>