Date: Sun, 20 Sep 2020 15:21:31 +0000
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Alan Somers <asomers@freebsd.org>, Konstantin Belousov <kostikbel@gmail.com>
Cc: "src-committers@freebsd.org" <src-committers@freebsd.org>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, "svn-src-head@freebsd.org" <svn-src-head@freebsd.org>
Subject: Re: svn commit: r365643 - head/bin/cp
Message-ID: <YTBPR01MB3966BD35AAA1D4D7E183FE2BDD3D0@YTBPR01MB3966.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <CAOtMX2gtYznWO9UP7f7iryAgh7ngQ%2B5U4L-V5kNp=M2AtD-sLA@mail.gmail.com>
References: <202009112049.08BKnavL032212@repo.freebsd.org> <20200911214327.GY94807@kib.kiev.ua> <YTBPR01MB3966E158BCA0C56C5EE64CA8DD240@YTBPR01MB3966.CANPRD01.PROD.OUTLOOK.COM> <CAOtMX2hLLE1c=qbhSFnoRk2SfAg9sOk0PZtiyvNZ_u39YwGKmA@mail.gmail.com> <YTBPR01MB3966A515F676EB5B938AC007DD3C0@YTBPR01MB3966.CANPRD01.PROD.OUTLOOK.COM> <20200919233232.GC94807@kib.kiev.ua>, <CAOtMX2gtYznWO9UP7f7iryAgh7ngQ%2B5U4L-V5kNp=M2AtD-sLA@mail.gmail.com>

Alan Somers wrote:
>On Sat, Sep 19, 2020 at 5:32 PM Konstantin Belousov <kostikbel@gmail.com> wrote:
>On Sat, Sep 19, 2020 at 11:18:56PM +0000, Rick Macklem wrote:
>> Alan Somers wrote:
>> >On Fri, Sep 11, 2020 at 3:52 PM Rick Macklem <rmacklem@uoguelph.ca> wrote:
>> >Konstantin Belousov wrote:
>> >>On Fri, Sep 11, 2020 at 08:49:36PM +0000, Alan Somers wrote:
>> >>> Author: asomers
>> >>> Date: Fri Sep 11 20:49:36 2020
>> >>> New Revision: 365643
>> >>> URL: https://svnweb.freebsd.org/changeset/base/365643
>> >>>
>> >>> Log:
>> >>> cp: fall back to read/write if copy_file_range fails
>> >>>
>> >>> Even though copy_file_range has a file-system agnostic version, it still
>> >>> fails on devfs (perhaps because the file descriptor is non-seekable?) In
>> >>> that case, fallback to old-fashioned read/write. Fixes
>> >>> "cp /dev/null /tmp/null"
>> >>
>> >>Devices are seekable.
>> >>
>> >>The reason for EINVAL is that vn_copy_file_range() checks that both in and out
>> >>vnodes are VREG. For devfs, they are VCHR.
>> >
>> >I coded the syscall to the Linux man page, which states that EINVAL is returned
>> >if either fd does not refer to a regular file.
>> >Having said that, I do not recall testing the VCHR case under Linux. (ie. It might
>> >actually work and the man page turns out to be incorrect?)
>> >
>> >I will test this case under Linux when I get home next week, rick
>> I'll admit I haven't tested this in Linux to see if they do return EINVAL.
>>
>> >Since there's no standard, I think it's fine for us to support devfs if possible.
>> 1 - I think this is a good question for a mailing list like freebsd-current@.
>> 2 - I see Linux as the de-facto standard these days and consider POSIX no
>> longer relevant, but that's just mho.
>> 3 - For NFSv4.2, the Copy operation will fail for non-regular files, so if you
>> do this, you will need to handle the fall-back to using the generic code.
>> (Should be doable, but you need to be aware of this case.)
>>
>> Having said the above, it is up to the "collective" and not me and, as such,
>> I suggest #1, to see whether others think doing a non-Linux compatible
>> version makes sense for FreeBSD?
>
>I believe that allowing devfs nodes for vn_copy_file() is not very good
>idea. For /dev/null driver returns EOF, but think about real devices or
>even better, /dev/zero that never EOF its output.
>
>Is vn_copy_file() interruptible ? I think not. So if insane range is
>specified, we have unstoppable copier that fills the disk (at best).
I think this is a serious problem, but the code could clip the "len" argument
at K Mbytes for non-VREG files to avoid it (and document that FreeBSD
specific behaviour in the man page).

>I can think of good use cases for copy_file_range on a device:
>
>1) Network block devices. I don't know if the iSCSI, NBD, or Ceph RBD protocols
>currently support server-side copies, but it's reasonable that they might. If they
>ever do, FreeBSD would need copy_file_range to take advantage.
>2) CUSE. I think Linux's CUSE already supports copy_file_range, since a CUSE
>device on Linux is basically just a single-file FUSE file system. We might add
>support to our CUSE driver someday.
>3) zvols. This is the use case that matters the most to me. I have a large amount
>of data stored in plain files that I would like to convert to zvols. dd should be able
>to do that using copy_file_range.
>
>In my opinion, the utility of those cases outweighs the risk of a long-running
>interruptible syscall. And in any case, it is documented that copy_file_range may
>return EINTR.
I believe that the only case where EINTR would be returned is for NFS mounts
with the "intr" option.
The generic code uses vn_rdwr()->VOP_READ()/VOP_WRITE() and I think the
behaviour w.r.t. signal handling is the same as read(2)/write(2).

Is reducing the number of syscalls really going to speed up the above cases?
(I did copy_file_range(2) because the copy could be done locally on the NFSv4.2
server. I didn't intend the generic code to be used over read(2)/write(2) to
improve performance.)
--> I'd suggest you try benchmarking a pre-patched vs current "cp" to copy
    regular files (not a NFSv4.2 mount) and see if there really is a significant
    benefit.

I'll admit I would prefer a Linux-compatible syscall and think this should
be asked on an open mailing list instead of here.

rick

-Alan
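
For reference, the fallback described in the r365643 commit log could be
sketched in userland C roughly as follows. This is only a minimal sketch and
not the code committed to bin/cp: the names copy_fd, copy_rw and COPY_CHUNK
are hypothetical, the set of errno values that trigger the fallback is an
assumption (the thread itself only discusses EINVAL), and the per-call length
cap is merely a userland analog of the kernel-side clipping of "len" suggested
above, with an arbitrary value.

```c
#define _GNU_SOURCE	/* glibc needs this to declare copy_file_range(2) */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-call cap; the thread only says "K Mbytes". */
#define COPY_CHUNK ((size_t)1 << 20)

/*
 * Old-fashioned read(2)/write(2) loop.
 * Returns 0 at EOF, -1 on error (errno is set by the failing call).
 */
static int
copy_rw(int from_fd, int to_fd)
{
	char buf[64 * 1024];
	ssize_t nr, nw;
	size_t off;

	while ((nr = read(from_fd, buf, sizeof(buf))) > 0) {
		/* write(2) may be short, so loop until the buffer is drained. */
		for (off = 0; off < (size_t)nr; off += (size_t)nw) {
			nw = write(to_fd, buf + off, (size_t)nr - off);
			if (nw <= 0)
				return (-1);
		}
	}
	return (nr < 0 ? -1 : 0);
}

/*
 * Try copy_file_range(2) in bounded chunks and fall back to read/write
 * when the call is refused. Returns 0 on success, -1 on error.
 */
static int
copy_fd(int from_fd, int to_fd)
{
	ssize_t n;

	assert(from_fd >= 0 && to_fd >= 0);
	do {
		/* NULL offsets: use and advance the descriptors' own offsets. */
		n = copy_file_range(from_fd, NULL, to_fd, NULL, COPY_CHUNK, 0);
	} while (n > 0);
	if (n == 0)
		return (0);	/* reached EOF on the input */
	/* Assumed set of "not supported for these descriptors" errors. */
	if (errno == EINVAL || errno == ENOSYS || errno == EXDEV ||
	    errno == EOPNOTSUPP)
		return (copy_rw(from_fd, to_fd));
	return (-1);
}
```

Because both offset pointers are NULL, copy_file_range() advances the
descriptors' own file offsets, so the read/write loop simply continues from
wherever the last successful call stopped. Calling copy_fd() on a pair of
pipes should take the fallback path, since neither descriptor refers to a
regular file. The sketch does not retry on EINTR.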