Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 4 Apr 2023 07:18:42 -0700
From:      Rick Macklem <rick.macklem@gmail.com>
To:        Martin Matuska <mm@freebsd.org>
Cc:        Cy Schubert <Cy.Schubert@cschubert.com>, src-committers@freebsd.org,  dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org,  Pawel Dawidek <pjd@freebsd.org>
Subject:   Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
Message-ID:  <CAM5tNy7J0OYjkza02BBBigO9Cw6r%2BYVGtcgJ9StdAMwiuToC2g@mail.gmail.com>
In-Reply-To: <e2c6952f-7e7c-c1f3-c8c5-46b5ba1efeb1@FreeBSD.org>
References:  <202304031513.333FD6qw014903@gitrepo.freebsd.org> <20230404040010.5D073224@slippy.cwsent.com> <CAM5tNy4p5UbAkujksyTR=J0tGuDf=Y6v2LuctK4P_p-H3PZpPg@mail.gmail.com> <20230404050740.6C25E2AA@slippy.cwsent.com> <CAM5tNy6qgn1XGw4qEO=R=FD4kzR_3t=C27LmU4YY59p8jAZqew@mail.gmail.com> <e2c6952f-7e7c-c1f3-c8c5-46b5ba1efeb1@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
--000000000000e3720005f8835abf
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Tue, Apr 4, 2023 at 3:55=E2=80=AFAM Martin Matuska <mm@freebsd.org> wrot=
e:
>
> I have opened a PR for this (includes patch):
> https://github.com/openzfs/zfs/pull/14713
>
> We still need to address the a_fsizetd problem.
>
> Maybe
I'd say this is a definite maybe.
I've attached what I think is the correct fix.
a_outcred is the credentials used to open the output file for writing.
Since the only use of the zfs_clone_range() "cr" argument is to decide
if a SUID/SGID bit(s) should be cleared, I think this is the correct cred
to use.

rick

>
> On 4. 4. 2023 7:22, Rick Macklem wrote:
> > On Mon, Apr 3, 2023 at 10:07=E2=80=AFPM Cy Schubert <Cy.Schubert@cschub=
ert.com> wrote:
> >> In message <CAM5tNy4p5UbAkujksyTR=3DJ0tGuDf=3DY6v2LuctK4P_p-H3PZpPg@ma=
il.gmail.c
> >> om>
> >> , Rick Macklem writes:
> >>> On Mon, Apr 3, 2023 at 9:00=3DE2=3D80=3DAFPM Cy Schubert <Cy.Schubert=
@cschubert.c=3D
> >>> om> wrote:
> >>>> In message <202304031513.333FD6qw014903@gitrepo.freebsd.org>, Martin
> >>>> Matuska wr
> >>>> ites:
> >>>>> The branch main has been updated by mm:
> >>>>>
> >>>>> URL: https://cgit.FreeBSD.org/src/commit/?id=3D3D2a58b312b62f908ec9=
2311d1=3D
> >>> bd8536db
> >>>>> aeb8e55b
> >>>>>
> >>>>> commit 2a58b312b62f908ec92311d1bd8536dbaeb8e55b
> >>>>> Merge: b98fbf3781df 431083f75bdd
> >>>>> Author:     Martin Matuska <mm@FreeBSD.org>
> >>>>> AuthorDate: 2023-04-03 14:49:30 +0000
> >>>>> Commit:     Martin Matuska <mm@FreeBSD.org>
> >>>>> CommitDate: 2023-04-03 14:49:30 +0000
> >>>>>
> >>>>>      zfs: merge openzfs/zfs@431083f75
> >>>>>
> >>>>>      Notable upstream pull request merges:
> >>>>>        #12194 Fix short-lived txg caused by autotrim
> >>>>>        #13368 ZFS_IOC_COUNT_FILLED does unnecessary txg_wait_synced=
()
> >>>>>        #13392 Implementation of block cloning for ZFS
> >>>>>        #13741 SHA2 reworking and API for iterating over multiple im=
pleme=3D
> >>> ntatio
> >>>>> ns
> >>>>>        #14282 Sync thread should avoid holding the spa config write=
 lock
> >>>>>               when possible
> >>>>>        #14283 txg_sync should handle write errors in ZIL
> >>>>>        #14359 More adaptive ARC eviction
> >>>>>        #14469 Fix NULL pointer dereference in zio_ready()
> >>>>>        #14479 zfs redact fails when dnodesize=3D3Dauto
> >>>>>        #14496 improve error message of zfs redact
> >>>>>        #14500 Skip memory allocation when compressing holes
> >>>>>        #14501 FreeBSD: don't verify recycled vnode for zfs control =
direc=3D
> >>> tory
> >>>>>        #14502 partially revert PR 14304 (eee9362a7)
> >>>>>        #14509 Fix per-jail zfs.mount_snapshot setting
> >>>>>        #14514 Fix data race between zil_commit() and zil_suspend()
> >>>>>        #14516 System-wide speculative prefetch limit
> >>>>>        #14517 Use rw_tryupgrade() in dmu_bonus_hold_by_dnode()
> >>>>>        #14519 Do not hold spa_config in ZIL while blocked on IO
> >>>>>        #14523 Move dmu_buf_rele() after dsl_dataset_sync_done()
> >>>>>        #14524 Ignore too large stack in case of dsl_deadlist_merge
> >>>>>        #14526 Use .section .rodata instead of .rodata on FreeBSD
> >>>>>        #14528 ICP: AES-GCM: Refactor gcm_clear_ctx()
> >>>>>        #14529 ICP: AES-GCM: Unify gcm_init_ctx() and gmac_init_ctx(=
)
> >>>>>        #14532 Handle unexpected errors in zil_lwb_commit() without =
ASSER=3D
> >>> T()
> >>>>>        #14544 icp: Prevent compilers from optimizing away memset()
> >>>>>               in gcm_clear_ctx()
> >>>>>        #14546 Revert zfeature_active() to static
> >>>>>        #14556 Remove bad kmem_free() oversight from previous zfsdev=
_stat=3D
> >>> e_list
> >>>>>               patch
> >>>>>        #14563 Optimize the is_l2cacheable functions
> >>>>>        #14565 FreeBSD: zfs_znode_alloc: lock the vnode earlier
> >>>>>        #14566 FreeBSD: fix false assert in cache_vop_rmdir when rep=
layin=3D
> >>> g ZIL
> >>>>>        #14567 spl: Add cmn_err_once() to log a message only on the =
first=3D
> >>>   call
> >>>>>        #14568 Fix incremental receive silently failing for recursiv=
e sen=3D
> >>> ds
> >>>>>        #14569 Restore ASMABI and other Unify work
> >>>>>        #14576 Fix detection of IBM Power8 machines (ISA 2.07)
> >>>>>        #14577 Better handling for future crypto parameters
> >>>>>        #14600 zcommon: Refactor FPU state handling in fletcher4
> >>>>>        #14603 Fix prefetching of indirect blocks while destroying
> >>>>>        #14633 Fixes in persistent error log
> >>>>>        #14639 FreeBSD: Remove extra arc_reduce_target_size() call
> >>>>>        #14641 Additional limits on hole reporting
> >>>>>        #14649 Drop lying to the compiler in the fletcher4 code
> >>>>>        #14652 panic loop when removing slog device
> >>>>>        #14653 Update vdev state for spare vdev
> >>>>>        #14655 Fix cloning into already dirty dbufs
> >>>>>        #14678 Revert "Do not hold spa_config in ZIL while blocked o=
n IO"
> >>>>>
> >>>>>      Obtained from:  OpenZFS
> >>>>>      OpenZFS commit: 431083f75bdd3efaee992bdd672625ec7240d252
> >>>> Another problem related to copy_file_range() is the following exampl=
e.
> >>>>
> >>>> slippy$ df -h build/make/Makefile .
> >>>> Filesystem        Size    Used   Avail Capacity  Mounted on
> >>>> t/wrkdir/amd64     52G     53M     52G     0%    /export/wrkdir/amd6=
4
> >>>> t/wrkdir/amd64     52G     53M     52G     0%    /export/wrkdir/amd6=
4
> >>>> slippy$ cp build/make/Makefile .
> >>>> cp: build/make/Makefile: Cross-device link
> >>>> slippy$
> >>>>
> >>>> And,
> >>>>
> >>>> slippy$ cp y4menc.h foobar
> >>>> cp: y4menc.h: Cross-device link
> >>>> slippy$
> >>>>
> >>>> But the following works because /tmp is mfs and /var/tmp is ufs.
> >>>>
> >>>> slippy$ cp y4menc.h /tmp
> >>>> slippy$
> >>>> slippy$ cp y4menc.h /var/tmp
> >>>> slippy$
> >>>>
> >>>> Copying files from one zpool to a dataset in another zpool also work=
s as
> >>>> does copying files from different datasets on the same zpool. Only c=
opyin=3D
> >>> g
> >>>> files from/to the same dataset results in cross device link.
> >>>>
> >>> zfs_copy_file_range() will only be called if the source and destinati=
on
> >>> both exist on the same mount point. (I'm guessing that's what you mea=
n
> >>> by "same dataset".)
> >> Correct.
> >>
> >> The long term solution to the problem (for users) would be to enable
> >> block_cloning. One can test this by testing, checkpoint the pool, enab=
le
> >> block_cloning, problem goes away, export and import the pool with
> >> --rewind-to-checkpoint to look at the regression again.
> >>
> >> As to why we see a Cross-device link,
> >>
> >> dtrace: script './dtrace.d' matched 1 probe
> >> CPU     ID                    FUNCTION:NAME
> >>    3  26104        vn_copy_file_range:return int64_t 0x12
> >>
> >> 0x12 =3D=3D EXDEV.
> >>
> >> As to why, zfs_clone_range() is called unconditionally at
> >> zfs/zfs_vnops_os.c:6292. If the feature is not enabled zfs_clone_range=
()
> >> sets the return code to EXDEV at line 1083 of zfs_vnops.c. Therefore
> >> block_cloning MUST be enabled.
> >>
> >> To reiterate, the long term solution would be to enable block_cloning.=
 For
> >> those who either wish or need to defer enabling of block_cloning, beca=
use
> >> the zpool is occasionally imported by older versions of freebsd requir=
ing
> >> write acess, or other reason.
> >>
> > If there is a way to test to see if block cloning is enabled at the beg=
inning
> > of zfs_copy_file_range(), it should just call vn_generic_copy_file_rang=
e()
> > if it is not enabled (or something like that).
> >
> > rick
> >
> >> --
> >> Cheers,
> >> Cy Schubert <Cy.Schubert@cschubert.com>
> >> FreeBSD UNIX:  <cy@FreeBSD.org>   Web:  https://FreeBSD.org
> >> NTP:           <cy@nwtime.org>    Web:  https://nwtime.org
> >>
> >>                          e^(i*pi)+1=3D0
> >>
> >>
> >>

--000000000000e3720005f8835abf
Content-Type: application/octet-stream; name="zfscopy.patch"
Content-Disposition: attachment; filename="zfscopy.patch"
Content-Transfer-Encoding: base64
Content-ID: <f_lg2cidzb0>
X-Attachment-Id: f_lg2cidzb0

LS0tIHN5cy9jb250cmliL29wZW56ZnMvbW9kdWxlL29zL2ZyZWVic2QvemZzL3pmc192bm9wc19v
cy5jLnNhdgkyMDIzLTA0LTAzIDE4OjQ5OjEzLjgzNDQzODAwMCAtMDcwMAorKysgc3lzL2NvbnRy
aWIvb3Blbnpmcy9tb2R1bGUvb3MvZnJlZWJzZC96ZnMvemZzX3Zub3BzX29zLmMJMjAyMy0wNC0w
NCAwNzowOToxNi40MDQ0OTkwMDAgLTA3MDAKQEAgLTYyOTAsNyArNjI5MCw3IEBAIHpmc19mcmVl
YnNkX2NvcHlfZmlsZV9yYW5nZShzdHJ1Y3Qgdm9wX2NvcHlfZmlsZV9yYW5nZQogCQlnb3RvIHVu
bG9jazsKIAogCWVycm9yID0gemZzX2Nsb25lX3JhbmdlKFZUT1ooaW52cCksIGFwLT5hX2lub2Zm
cCwgVlRPWihvdXR2cCksCi0JICAgIGFwLT5hX291dG9mZnAsICZsZW4sIGFwLT5hX2ZzaXpldGQt
PnRkX3VjcmVkKTsKKwkgICAgYXAtPmFfb3V0b2ZmcCwgJmxlbiwgYXAtPmFfb3V0Y3JlZCk7CiAJ
KmFwLT5hX2xlbnAgPSAoc2l6ZV90KWxlbjsKIAogdW5sb2NrOgo=
--000000000000e3720005f8835abf--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAM5tNy7J0OYjkza02BBBigO9Cw6r%2BYVGtcgJ9StdAMwiuToC2g>