Date:      Sun, 20 Sep 2020 15:21:31 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Alan Somers <asomers@freebsd.org>, Konstantin Belousov <kostikbel@gmail.com>
Cc:        "src-committers@freebsd.org" <src-committers@freebsd.org>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, "svn-src-head@freebsd.org" <svn-src-head@freebsd.org>
Subject:   Re: svn commit: r365643 - head/bin/cp
Message-ID:  <YTBPR01MB3966BD35AAA1D4D7E183FE2BDD3D0@YTBPR01MB3966.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <CAOtMX2gtYznWO9UP7f7iryAgh7ngQ%2B5U4L-V5kNp=M2AtD-sLA@mail.gmail.com>
References:  <202009112049.08BKnavL032212@repo.freebsd.org> <20200911214327.GY94807@kib.kiev.ua> <YTBPR01MB3966E158BCA0C56C5EE64CA8DD240@YTBPR01MB3966.CANPRD01.PROD.OUTLOOK.COM> <CAOtMX2hLLE1c=qbhSFnoRk2SfAg9sOk0PZtiyvNZ_u39YwGKmA@mail.gmail.com> <YTBPR01MB3966A515F676EB5B938AC007DD3C0@YTBPR01MB3966.CANPRD01.PROD.OUTLOOK.COM> <20200919233232.GC94807@kib.kiev.ua>, <CAOtMX2gtYznWO9UP7f7iryAgh7ngQ%2B5U4L-V5kNp=M2AtD-sLA@mail.gmail.com>


Alan Somers wrote:
>On Sat, Sep 19, 2020 at 5:32 PM Konstantin Belousov <kostikbel@gmail.com> wrote:
>On Sat, Sep 19, 2020 at 11:18:56PM +0000, Rick Macklem wrote:
>> Alan Somers wrote:
>> >On Fri, Sep 11, 2020 at 3:52 PM Rick Macklem <rmacklem@uoguelph.ca> wrote:
>> >Konstantin Belousov wrote:
>> >>On Fri, Sep 11, 2020 at 08:49:36PM +0000, Alan Somers wrote:
>> >>> Author: asomers
>> >>> Date: Fri Sep 11 20:49:36 2020
>> >>> New Revision: 365643
>> >>> URL: https://svnweb.freebsd.org/changeset/base/365643
>> >>>
>> >>> Log:
>> >>>   cp: fall back to read/write if copy_file_range fails
>> >>>
>> >>>   Even though copy_file_range has a file-system agnostic version, it still
>> >>>   fails on devfs (perhaps because the file descriptor is non-seekable?) In
>> >>>   that case, fallback to old-fashioned read/write. Fixes
>> >>>   "cp /dev/null /tmp/null"
>> >>
>> >>Devices are seekable.
>> >>
>> >>The reason for EINVAL is that vn_copy_file_range() checks that both in and out
>> >>vnodes are VREG.  For devfs, they are VCHR.
>> >
>> >I coded the syscall to the Linux man page, which states that EINVAL is returned
>> >if either fd does not refer to a regular file.
>> >Having said that, I do not recall testing the VCHR case under Linux (i.e., it might
>> >actually work, and the man page may turn out to be incorrect).
>> >
>> >I will test this case under Linux when I get home next week, rick
>> I'll admit I haven't tested this in Linux to see if they do return EINVAL.
>>
>> >Since there's no standard, I think it's fine for us to support devfs if possible.
>> 1 - I think this is a good question for a mailing list like freebsd-current@.
>> 2 - I see Linux as the de-facto standard these days and consider POSIX no
>>       longer relevant, but that's just mho.
>> 3 - For NFSv4.2, the Copy operation will fail for non-regular files, so if you
>>       do this, you will need to handle the fall-back to using the generic code.
>>       (Should be doable, but you need to be aware of this case.)
>>
>> Having said the above, it is up to the "collective" and not me and, as such,
>> I suggest #1, to see whether others think doing a non-Linux compatible
>> version makes sense for FreeBSD?
>
>I believe that allowing devfs nodes for vn_copy_file_range() is not a very
>good idea.  For /dev/null the driver returns EOF, but think about real devices
>or, even better, /dev/zero, whose output never reaches EOF.
>
>Is vn_copy_file_range() interruptible?  I think not.  So if an insane range is
>specified, we have an unstoppable copier that fills the disk (at best).
I think this is a serious problem, but the code could clip the "len" argument
at K Mbytes for non-VREG files to avoid it (and document that FreeBSD-specific
behaviour in the man page).
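
A minimal sketch of what such clipping could look like. The helper name
clip_copy_len() and the 1 Mbyte cap are assumptions made for illustration
only; the value of K is not settled in this thread, and this is not code from
the FreeBSD tree.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed cap for non-regular files; "K" is left unspecified in the thread. */
#define NONREG_COPY_MAX	((size_t)1024 * 1024)

/*
 * Return the length one copy call would be allowed to process.
 * Regular (VREG) files keep the caller's length; anything else is
 * clipped so that a single call cannot run unbounded.
 */
size_t
clip_copy_len(bool is_regular, size_t len)
{
	if (is_regular)
		return (len);
	return (len < NONREG_COPY_MAX ? len : NONREG_COPY_MAX);
}
```

With a clip like this, a caller copying from something like /dev/zero would
get back a short count and would have to call again, which gives signals a
chance to be delivered between calls.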

>I can think of good use cases for copy_file_range on a device:
>
>1) Network block devices.  I don't know if the iSCSI, NBD, or Ceph RBD protocols
>currently support server-side copies, but it's reasonable that they might.  If they
>ever do, FreeBSD would need copy_file_range to take advantage.
>2) CUSE.  I think Linux's CUSE already supports copy_file_range, since a CUSE
>device on Linux is basically just a single-file FUSE file system.  We might add
>support to our CUSE driver someday.
>3) zvols.  This is the use case that matters the most to me.  I have a large amount
>of data stored in plain files that I would like to convert to zvols.  dd should be able
>to do that using copy_file_range.
>
>In my opinion, the utility of those cases outweighs the risk of a long-running
>interruptible syscall.  And in any case, it is documented that copy_file_range may
>return EINTR.
I believe that the only case where EINTR would be returned is for NFS mounts
with the "intr" option.
The generic code uses vn_rdwr()->VOP_READ()/VOP_WRITE() and I think the
behaviour w.r.t. signal handling is the same as read(2)/write(2).
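
For comparison, a userland read(2)/write(2) copy loop of the kind a utility
falls back to might look roughly like the sketch below. The function name
copy_rw_fallback() is hypothetical and this is not the code in bin/cp;
retrying on EINTR is one possible policy, chosen here only to show where a
signal becomes visible to the caller.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Copy everything from in_fd to out_fd with plain read(2)/write(2).
 * Returns the number of bytes copied, or -1 with errno set.
 * A call interrupted by a signal before any data moved (EINTR) is retried.
 */
ssize_t
copy_rw_fallback(int in_fd, int out_fd)
{
	char buf[64 * 1024];
	ssize_t total = 0;

	for (;;) {
		ssize_t nread = read(in_fd, buf, sizeof(buf));
		if (nread == 0)
			return (total);		/* EOF */
		if (nread < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return (-1);
		}
		/* write(2) may be short; loop until the chunk is flushed. */
		for (ssize_t off = 0; off < nread; ) {
			ssize_t nwritten = write(out_fd, buf + off,
			    (size_t)(nread - off));
			if (nwritten < 0) {
				if (errno == EINTR)
					continue;
				return (-1);
			}
			off += nwritten;
		}
		total += nread;
	}
}
```

Each trip through the loop returns to userland, so a pending signal is
handled between chunks rather than only after the whole range is copied.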

Is reducing the number of syscalls really going to speed up the above cases?
(I did copy_file_range(2) because the copy could be done locally on the NFSv4.2
 server. I didn't intend the generic code to be used over read(2)/write(2) to
 improve performance.)
--> I'd suggest you try benchmarking a pre-patch vs current "cp" copying
      regular files (not on an NFSv4.2 mount) to see if there really is a
      significant benefit.

I'll admit I would prefer a Linux-compatible syscall and think this should
be asked on an open mailing list instead of here.

rick

-Alan



