Date: Sun, 11 May 2025 16:00:05 -0700
From: Rick Macklem <rick.macklem@gmail.com>
To: David Chen <david.chen@peakaio.com>
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: support for pNFS with Linux as Data Servers
Message-ID: <CAM5tNy6su50fk65rW=fz5pmmp9EU%2Bn6aXFg-epPHunzNuJ=bcQ@mail.gmail.com>
In-Reply-To: <CWLP123MB42436EF96425844C55A8020A9D94A@CWLP123MB4243.GBRP123.PROD.OUTLOOK.COM>
On Sat, May 10, 2025 at 8:26 PM David Chen <david.chen@peakaio.com> wrote:
>
> Hello!
>
> Currently, FreeBSD supports pNFS with File and Flexible File
> layouts. For the Flexible File layout, FreeBSD supports a tightly
> coupled locking model that requires FreeBSD servers as Data Servers
> (DSs). I'm interested in adding support for a loosely coupled locking
> model that would allow Linux machines to be used as Data Servers. To
> that end, I'd like to gauge interest and get feedback about that.
>
> AFAICT there are two changes needed to add this support. The first
> change is to allow for file handles of variable length and up to the
> maximum size allowed by the RFC. This seems relatively
> straightforward, and could be done by changing a bunch of uses of
> fhandle_t to instead use a data structure that can store larger file
> handles. Also, nfsrv_layoutget() and the use of NFSX_V4FLEXLAYOUT
> would need to be updated to allow for variable lengths.
>
> The second change is to support a loosely coupled locking
> model. Besides trivially setting the ffdv_tightly_coupled flag of
> ff_device_versions4 to false, I see two immediate issues. One issue is
> that FreeBSD's special 0x555555555555555555555555 state ID (along with
> its special seq num 0xffffffff) is, naturally, not understood by a
> Linux Data Server.
Although it does not explicitly say so in the RFC, you want to use NFSv3
RPCs to talk to the DS(s) from the MDS for the loosely coupled variant.
(That avoids any stateid hassles. For NFSv4 DSs, the MDS would have to
do Opens and keep open_stateids for the DS files.)
--> The functions that do RPCs against the DS(s) from the MDS will need
to be patched to do NFSv3 RPCs as well as NFSv4 ones.
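To illustrate the NFSv3-versus-NFSv4 point above, here is a minimal sketch in Python (not kernel code) of a per-DS-file record. The names DsFile and needs_open_stateid are hypothetical and do not exist in the FreeBSD sources. The size limits are the protocol maxima for NFSv3 and NFSv4 file handles.

```python
from dataclasses import dataclass

NFS3_FHSIZE = 64    # maximum NFSv3 file handle size in bytes
NFS4_FHSIZE = 128   # maximum NFSv4 file handle size in bytes


@dataclass(frozen=True)
class DsFile:
    """Hypothetical record the MDS might keep for one file on one DS."""
    nfs_version: int   # 3 or 4: protocol the MDS uses to talk to this DS
    fh: bytes          # variable-length file handle returned by the DS

    def __post_init__(self):
        if self.nfs_version not in (3, 4):
            raise ValueError("unsupported NFS version")
        limit = NFS3_FHSIZE if self.nfs_version == 3 else NFS4_FHSIZE
        if not 0 < len(self.fh) <= limit:
            raise ValueError("file handle length out of range")


def needs_open_stateid(ds: DsFile) -> bool:
    # NFSv3 is stateless, so MDS-to-DS RPCs carry no stateid.
    # For an NFSv4 DS the MDS would have to Open the DS file and
    # keep the resulting open_stateid.
    return ds.nfs_version == 4
```

With a record like this, each MDS-to-DS helper could branch on nfs_version to build either an NFSv3 or an NFSv4 request.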
> The other issue is clients will use the synthetic
> uid/gid given by the MDS (currently 999/999), and this results in
> access errors when the clients talk to the DSs.
The NFSv3 Create RPC that creates the DS file would set it to be owned
by that synthetic uid/gid, with mode 0600, I think?
>
> I've made some of these changes in a rough manner, not in a
> production-ready way, and I/O seems to work and some basic tests pass.
>
> Is the FreeBSD community interested in development in this direction,
> i.e. pNFS using FreeBSD for MDS and Linux for DSs?
As I noted, I think "fencing" is where most of the work is.
If I recall it correctly, it goes something like this:
- Client does a Setattr of owner/group/mode/ACL on the MDS.
--> Server must recall all layouts for the file via CB_LAYOUTRECALL
callbacks and reply NFS4ERR_DELAY to the Setattr.
--> Sometime later, the client retries the Setattr. If all layouts have been
returned, it is done. If not, the server must either return NFS4ERR_DELAY
again or change the mode on the file on the DS(s) so that clients cannot
access it. I think the MDS must wait at least one lease duration (2min)
after issuing the CB_LAYOUTRECALLs before doing this.
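The fencing sequence above can be sketched as a small decision function. This is a sketch in Python of the logic only, not FreeBSD code. Action, FileLayoutState and setattr_action are hypothetical names, and the 120 second lease is an assumption taken from the "2min" figure above. Once the lease has expired the sketch always fences, although replying NFS4ERR_DELAY again would also be allowed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

LEASE_SECONDS = 120.0  # assumed lease duration (the "2min" above)


class Action(Enum):
    RECALL_AND_DELAY = 1   # issue layout recalls, reply NFS4ERR_DELAY
    DELAY = 2              # recalls still pending, reply NFS4ERR_DELAY
    FENCE_AND_APPLY = 3    # change mode on the DS file(s), then apply
    APPLY = 4              # no layouts outstanding, apply the Setattr


@dataclass
class FileLayoutState:
    outstanding_layouts: int                  # layouts not yet returned
    recall_issued_at: Optional[float] = None  # time the recalls were sent


def setattr_action(state: FileLayoutState, now: float) -> Action:
    """Decide how the MDS handles a Setattr of owner/group/mode/ACL."""
    if state.outstanding_layouts == 0:
        state.recall_issued_at = None
        return Action.APPLY
    if state.recall_issued_at is None:
        # First attempt: start the recall and make the client retry.
        state.recall_issued_at = now
        return Action.RECALL_AND_DELAY
    if now - state.recall_issued_at < LEASE_SECONDS:
        # Retried too early to fence; make the client retry again.
        return Action.DELAY
    # A lease duration has passed without all layouts being returned.
    return Action.FENCE_AND_APPLY
```

A client that retries at 0, 60 and 121 seconds while two layouts stay outstanding would see RECALL_AND_DELAY, DELAY and then FENCE_AND_APPLY.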
rick
>
> Thanks!
>
