Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 4 Apr 2023 22:18:25 +0200
From:      Martin Matuska <mm@FreeBSD.org>
To:        Cy Schubert <Cy.Schubert@cschubert.com>, Mateusz Guzik <mjguzik@gmail.com>
Cc:        Rick Macklem <rick.macklem@gmail.com>, src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org, Pawel Dawidek <pjd@freebsd.org>
Subject:   Re: git: 8ee579abe09e - main - zfs: fall back if block_cloning feature is disabled
Message-ID:  <88b2e6c2-94d7-81f8-f85d-9d7a6b7b8d11@FreeBSD.org>
In-Reply-To: <20230404181823.0EA79C4@slippy.cwsent.com>
References:  <202304041145.334Bjx6l035872@gitrepo.freebsd.org> <20230404141717.B976D31C@slippy.cwsent.com> <CAGudoHEvGDUQkYe8LwUXgTZZa%2B6DAFXVtspCX-Mn2egDO2oc_w@mail.gmail.com> <CAM5tNy6sPx4xE%2BcAeeC_RQG_tba_K6Yh-Cni0%2B-WxJ5SXCuO9A@mail.gmail.com> <98c71e6f-5b48-79f3-e7b0-95d674949624@FreeBSD.org> <20230404091844.639cb1c1@slippy> <20230404093418.3041ff23@slippy> <CAGudoHHe3FJmW_cEddozQScJcwfbdbbfEn=y%2Bm6wwmzmvEMb-w@mail.gmail.com> <20230404181823.0EA79C4@slippy.cwsent.com>

next in thread | previous in thread | raw e-mail | index | archive | help
I agree that these are three individual fixes.

1.) pass ap->a_outcred instead of ap->a_fsizetd->td_ucred to 
zfs_clone_range()
I am ok with this, the way the argument is subsequently used it should 
be ap->a_outcred which is intended for the write.

2.) do a vn_generic_copy_file_range() in case of EXDEV

The comment vn_generic_copy_file_range() says:
/*
  * Copy a byte range of one file to another.  This function can handle the
  * case where invp and outvp are on different file systems.
  * It can also be called by a VOP_COPY_FILE_RANGE() to do the work, if 
there
  * is no better file system specific way to do it.
  */

That is actually our case. zfs_clone_range() exits with EXDEV if:
- source and destination are not on the same pool
- the block_cloning feature is not enabled
- input and output files have a different block size
- offset and len are not at block boundaries
- length is not a multiple of block size, with except for the end of file
- we are trying to clone a block created in the current transaction group
- we are cloning encrypted data not in the same dataset

IMO we can fallback to vn_generic_copy_file_range() in all of these cases.

As of the locks, we need to run vn_generic_copy_file_range() on unlocked 
vnodes,
just look into the function.
In both fuse_vnops.c and nfs_clvnops.c it does not run on locked vnodes.
Even the comment from pjd in zfs_vnops_os.c says:
         /*
          * TODO: If offset/length is not aligned to recordsize, use
          * vn_generic_copy_file_range() on this fragment.
          * It would be better to do this after we lock the vnodes, but 
then we
          * need something else than vn_generic_copy_file_range().
          */

So IMO it should be at the end after unlock.

3.) By doing the feature check early, we save locking the input vnode 
and calling mac_vnode_check_write() and vn_rlimit_fsize() at the cost of 
checking for the disabled feature twice. Maybe documented skipping of 
the check in zfs_clone_range()? The code of the early check looks ok to me.

On 4. 4. 2023 20:18, Cy Schubert wrote:
> In message <CAGudoHHe3FJmW_cEddozQScJcwfbdbbfEn=y+m6wwmzmvEMb-w@mail.gmail.c
> om>
> , Mateusz Guzik writes:
>> can you please post a review
> I could but I didn't write any of it. Rick Macklem and Martin Matuska wrote
> it. My patch was for discussion only.
>
> Martin and Rick, do you mind if I post this as a review. It should probably
> be two, maybe three separate commits, fixing two different problems.
>
>



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?88b2e6c2-94d7-81f8-f85d-9d7a6b7b8d11>