Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 20 Sep 2023 16:52:20 -0700
From:      Alan Somers <asomers@freebsd.org>
To:        Rick Macklem <rick.macklem@gmail.com>
Cc:        Alan Somers <asomers@freebsd.org>, Freebsd fs <freebsd-fs@freebsd.org>
Subject:   Re: RFC: Should copy_file_range(2) work for shared memory objects?
Message-ID:  <CAOtMX2jDBShJFnuW%2BgW2w%2Bt3wr%2BX=Z64zU%2BdP09bGHJSA0_wCw@mail.gmail.com>
In-Reply-To: <CAM5tNy5aFR=YGd4FYYAzNuZx_-rqaZyx5YjHmeZy1LGiM3BThw@mail.gmail.com>
References:  <CAM5tNy4HxY8LK0f6baGhu=opoC3-4ODhqNyxoyPY8vdwxGs5Xg@mail.gmail.com> <CAOtMX2jojm01Xx9rfOdPmevWb8TasJ27U5u6GT3n3NiWwYwYoQ@mail.gmail.com> <CAM5tNy4vYNg=fbmNujcrusjOynPFBaoqBKckb1wa=V_-HJ_r0Q@mail.gmail.com> <CAM5tNy65PGAPV3MSERdnrO%2BN_P9FcGo-AqZQ8ew5DDJP7nunuw@mail.gmail.com> <CAM5tNy5aFR=YGd4FYYAzNuZx_-rqaZyx5YjHmeZy1LGiM3BThw@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, Sep 20, 2023 at 4:47=E2=80=AFPM Rick Macklem <rick.macklem@gmail.co=
m> wrote:
>
> On Wed, Sep 20, 2023 at 4:21=E2=80=AFPM Rick Macklem <rick.macklem@gmail.=
com> wrote:
> >
> > On Wed, Sep 20, 2023 at 4:09=E2=80=AFPM Rick Macklem <rick.macklem@gmai=
l.com> wrote:
> > >
> > > On Wed, Sep 20, 2023 at 3:07=E2=80=AFPM Alan Somers <asomers@freebsd.=
org> wrote:
> > > >
> > > > On Wed, Sep 20, 2023 at 3:05=E2=80=AFPM Rick Macklem <rick.macklem@=
gmail.com> wrote:
> > > > >
> > > > > Right now (as noted by PR#273962) copy_file_range(2)
> > > > > fails for shared memory objects because there is no
> > > > > vnode (f_vnode =3D=3D NULL) for them and the code uses
> > > > > vnodes (including a file system specific VOP_COPY_FILE_RANGE(9)).
> > > > >
> > > > > Do you think copy_file_range(2) should work for shared memory obj=
ects?
> > > > >
> > > > > This would require specific handling in kern_copy_file_range()
> > > > > to work.  I do not think the patch would be a lot of work, but
> > > > > I am not familiar with the f_ops and shared memory code.
> > > > >
> > > > > rick
> > > >
> > > > This sounds annoying to fix.  But I think we ought to.  Right now
> > > > programmers can assume that copy_file_range will work for every typ=
e
> > > > of file.  We don't document an EOPNOTSUP error code or anything lik=
e
> > > > that.  Does it work on sockets, too?
> > > No. I guess I have a different definition of "file" (unless you meant
> > > "filedesc"?).  I cannot see how a "range is defined for sockets
> > > or named pipes or...". It currently checks for a f_vnode, which
> > > probably is not enough. (I haven't figured out what path_fileops
> > > are, so I do not know if they work?)
> > >
> > > I can see how it can be implemented for shared memory objects.
> > > However, this is going to take a fair amount of work, since they
> > > do not use vnodes.
> > > I think it goes something like this:
> > > - Create a new fileops (f_copy_file_range), since it needs to use
> > >   the correct range lock variables (in shmfd instead of vnode ones).
> > > - Move most of kern_copy_file_range() into vnodeop_copy_file_range()
> > >   and call f_copy_file_range() from kern_copy_file_range().
> > > - Create a shm_copy_file_range() that does the correct range locking
> > >   and then copies via uiomove().
> > > This would be a KABI change, so I do not think it could be MFC'd.
> > >
> > > I think there is a need for copy_file_range(2) to return EOPNOTSUP
> > > for cases it will never handle. (I need to test AF_LOCAL sockets,
> > > since I think they have vnodes?)
> > copy_file_range(2) does currently return EOPNOTSUPP for unix
> > domain (AF_LOCAL) sockets. The man page needs to be fixed,
> > whether or not support for shared memory objects is added.
> >
> Oops, my mistake.  It was the open(2) that failed with EOPNOTSUPP,
> not copy_file_range(2). (I have a simple test program that open(2)s
> file names and then uses copy_file_range(2) on the descriptors.
> Btw, an open(2) with O_PATH works, but no data is copied.
> Not sure if that should be considered correct behaviour?

Do you mean that copy_file_range returns 0 for AF_LOCAL sockets?  That
sounds suspicious.  0 could be interpreted as EoF.  Could you please
share your test program?



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAOtMX2jDBShJFnuW%2BgW2w%2Bt3wr%2BX=Z64zU%2BdP09bGHJSA0_wCw>