Date:      Wed, 30 Dec 2020 16:48:27 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>, Alan Somers <asomers@freebsd.org>, Kirk McKusick <mckusick@mckusick.com>, Mark Johnston <markj@freebsd.org>
Subject:   Re: r367672 broke the NFS server
Message-ID:  <YQXPR0101MB09687E622BDBBE59C964FAC4DDD70@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <X+x4NSeWI+z5QkP3@kib.kiev.ua>
References:  <YQXPR0101MB0968DC349AFD081480768028DDD70@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>, <X+x4NSeWI+z5QkP3@kib.kiev.ua>

Kostik wrote:
>On Wed, Dec 30, 2020 at 02:02:48AM +0000, Rick Macklem wrote:
>> Hi,
>>
>> Post r367671...
>> When multiple files are being created by an NFS client in the same
>> directory, the VOP_CREATE()/ufs_create() can fail with ERELOOKUP.
>> This results in an EIO return to the NFS client.
>> --> This causes "nfsv4 client/server protocol prob err=10026"
>>       on the client for NFSv4.0 mounts.
>>       --> This explains why this error has been reported by
>>             several people lately, although it should "never happen".
>>
>> Unfortunately, for the NFS server, the Lookup call is done separately
>> and it will not be easy to redo it, given the current NFS code structure.
>>
>> Is there another way to deal with the problem r367672 was fixing that
>> avoids ufs_create() returning ERELOOKUP?
>
>Idea of the change is to restart the syscall at top level.  So for NFS
>server the right approach is to not send a response and also to not
>free the request mbuf chain, but to restart processing.
Yes. I took a look and I think restarting the operation by rolling the
working position in the mbuf lists back and redoing the operation
is feasible and easier than fixing the individual operations.
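
A rough user-space sketch of the roll back and redo idea, to make it
concrete. Everything here is hypothetical: struct sk_req is a flat buffer
with an offset standing in for the working position in the mbuf lists,
SK_ERELOOKUP and SK_EBADRPC are placeholder values rather than the kernel's
definitions, and sk_demo_create() only imitates an operation that parses
its arguments and then fails a few times. It is not the NFS server code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the kernel has its own definitions of these. */
#define SK_ERELOOKUP	(-5)
#define SK_EBADRPC	72

/* Hypothetical request cursor: a flat buffer plus a parse offset. */
struct sk_req {
	const unsigned char *buf;
	size_t len;
	size_t pos;	/* current working position in the request */
};

typedef int (*sk_op_fn)(struct sk_req *req, void *arg);

/*
 * Run one operation. If it fails with SK_ERELOOKUP, restore the working
 * position saved on entry and run it again, at most max_tries times.
 */
int
sk_run_with_restart(struct sk_req *req, sk_op_fn op, void *arg, int max_tries)
{
	size_t saved = req->pos;
	int error, tries = 0;

	assert(max_tries > 0);
	do {
		req->pos = saved;	/* roll back to the start of the op */
		error = op(req, arg);
	} while (error == SK_ERELOOKUP && ++tries < max_tries);
	return (error);
}

/* Demo operation: consumes 4 bytes of arguments, then fails *arg times. */
int
sk_demo_create(struct sk_req *req, void *arg)
{
	int *fails_left = arg;

	if (req->len - req->pos < 4)
		return (SK_EBADRPC);	/* ran off the end of the request */
	req->pos += 4;
	if (*fails_left > 0) {
		(*fails_left)--;
		return (SK_ERELOOKUP);
	}
	return (0);
}
```

Without the restore of req->pos, the second attempt in this sketch would
parse from the wrong offset and the third would run off the end of an
8 byte request. The retry bound is a choice made for the sketch only.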

For NFSv4, you cannot redo the entire compound, since non-idempotent
operations like exclusive open may have already been completed.
However, rolling back to the beginning of the operation should be
doable.
--> It will serve as a good test, in that it may expose bugs in the
      RPC/operation code where failure (ERELOOKUP) doesn't clean
      things up correctly.
      --> In NFSv4, there is the open/lock state that cannot be updated
            for this error case. (The seqid stuff in NFSv4.0 Open can be fun.
            It's used to serialize the operations and the number must be
            incremented for some errors, but not for others. The 10026
            error occurs when you don't get this right.)
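
To illustrate the "some errors, but not others" rule, here is a small
sketch of the seqid decision as I read RFC 7530, section 9.1.7; check the
RFC for the authoritative list. The SK_ names and the two helper functions
are made up for this example and are not the FreeBSD NFS code. The numeric
values are the NFSv4 status codes, and 10026 is NFS4ERR_BAD_SEQID.

```c
#include <stdbool.h>
#include <stdint.h>

/* NFSv4 status codes used below (hypothetical SK_ prefix). */
enum {
	SK_NFS4_OK = 0,
	SK_NFS4ERR_IO = 5,
	SK_NFS4ERR_RESOURCE = 10018,
	SK_NFS4ERR_MOVED = 10019,
	SK_NFS4ERR_NOFILEHANDLE = 10020,
	SK_NFS4ERR_STALE_CLIENTID = 10022,
	SK_NFS4ERR_STALE_STATEID = 10023,
	SK_NFS4ERR_BAD_STATEID = 10025,
	SK_NFS4ERR_BAD_SEQID = 10026,
	SK_NFS4ERR_BADXDR = 10036
};

/*
 * Return true if a seqid-carrying operation (Open, Close, Lock, ...)
 * that finished with this status advances the owner's seqid.
 */
bool
sk_seqid_advances(int status)
{
	switch (status) {
	case SK_NFS4ERR_STALE_CLIENTID:
	case SK_NFS4ERR_STALE_STATEID:
	case SK_NFS4ERR_BAD_STATEID:
	case SK_NFS4ERR_BAD_SEQID:
	case SK_NFS4ERR_BADXDR:
	case SK_NFS4ERR_RESOURCE:
	case SK_NFS4ERR_NOFILEHANDLE:
	case SK_NFS4ERR_MOVED:
		return (false);	/* the listed exceptions: seqid unchanged */
	default:
		return (true);	/* success and all other errors */
	}
}

/* Compute the seqid expected on the next operation for this owner. */
uint32_t
sk_next_seqid(uint32_t seqid, int status)
{
	return (sk_seqid_advances(status) ? seqid + 1 : seqid);
}
```

Client and server each apply this rule on their own side, so they only
stay in step if they agree on which status ended the operation.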

I'll start working on this today, but I have no idea how long it might
take.

>I am sorry I forgot about NFS server when designing this fix, the only
>mild excuse I can provide is that the change was quite complicated as is.
>I will start looking at the fix.
No problem. Sometimes I'd like to forget about NFS too;-).

For the rollback/redo of the RPC/operation, it's probably easier for me
to do it. As above, I'll start on it, but...

My main concern is how long it will take, given the FreeBSD 13 release
starts soon.

rick


