Date:      Fri, 19 Mar 2021 19:09:29 +0100
From:      tuexen@freebsd.org
To:        "Scheffenegger, Richard" <Richard.Scheffenegger@netapp.com>
Cc:        Rick Macklem <rmacklem@uoguelph.ca>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, Alexander Motin <mav@FreeBSD.org>
Subject:   Re: NFS Mount Hangs
Message-ID:  <03DE00F1-B60D-49AE-AC53-C83BA9F0F5C7@freebsd.org>
In-Reply-To: <SN4PR0601MB372895EE1F6DDFA830D4B7AC86689@SN4PR0601MB3728.namprd06.prod.outlook.com>
References:  <C643BB9C-6B61-4DAC-8CF9-CE04EA7292D0@tildenparkcapital.com> <3750001D-3F1C-4D9A-A9D9-98BCA6CA65A4@tildenparkcapital.com> <33693DE3-7FF8-4FAB-9A75-75576B88A566@tildenparkcapital.com> <YQXPR0101MB0968DC18E00833DE2969C636DD6A9@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM> <SN4PR0601MB3728780CE9ADAB144B3B681486699@SN4PR0601MB3728.namprd06.prod.outlook.com> <2890D243-AF46-43A4-A1AD-CB0C3481511D@lurchi.franken.de> <YQXPR0101MB0968D2362456D43DF528A7E9DD699@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM> <9EE3DFAC-72B0-4256-B57C-DE6AA811413C@freebsd.org> <YQXPR0101MB0968E1537E26CDBDC31C58E5DD689@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM> <SN4PR0601MB372895EE1F6DDFA830D4B7AC86689@SN4PR0601MB3728.namprd06.prod.outlook.com>

> On 19. Mar 2021, at 17:14, Scheffenegger, Richard <Richard.Scheffenegger@netapp.com> wrote:
>
> Hi Rick,
>
> I did some reshuffling of socket-upcalls recently in the TCP stack, to prevent some race conditions with our $work in-kernel NFS server implementation.
Are these changes in 12.1p5? This is the OS version used by the reporter of the bug.

Best regards
Michael
>
> Just mentioning this, as this may slightly change the timing (mostly delaying the upcall until TCP processing is all done, whereas before an in-kernel consumer could register for a socket upcall and do some fancy stuff with the data sitting in the socket buffers before returning to the TCP processing).
>
> But I think there is no socket data handling being done in the upstream in-kernel NFS server (and I have not even checked whether it actually registers a socket-upcall handler).
>
> https://reviews.freebsd.org/R10:4d0770f1725f84e8bcd059e6094b6bd29bed6cc3
>
> If you can reproduce this easily, perhaps back out this change and see if that has an impact...
>
> NFS server is to my knowledge the only upstream in-kernel TCP consumer which may be impacted by this.
>
> Richard Scheffenegger
>
>
> -----Original Message-----
> From: owner-freebsd-net@freebsd.org <owner-freebsd-net@freebsd.org> On Behalf Of Rick Macklem
> Sent: Friday, 19 March 2021 16:58
> To: tuexen@freebsd.org
> Cc: Scheffenegger, Richard <Richard.Scheffenegger@netapp.com>; freebsd-net@freebsd.org; Alexander Motin <mav@FreeBSD.org>
> Subject: Re: NFS Mount Hangs
>
> Michael Tuexen wrote:
>>> On 18. Mar 2021, at 21:55, Rick Macklem <rmacklem@uoguelph.ca> wrote:
>>>
>>> Michael Tuexen wrote:
>>>>> On 18. Mar 2021, at 13:42, Scheffenegger, Richard <Richard.Scheffenegger@netapp.com> wrote:
>>>>>
>>>>>>> Output from the NFS Client when the issue occurs
>>>>>>> # netstat -an | grep NFS.Server.IP.X
>>>>>>> tcp        0      0 NFS.Client.IP.X:46896      NFS.Server.IP.X:2049       FIN_WAIT2
>>>>>> I'm no TCP guy. Hopefully others might know why the client would be stuck in FIN_WAIT2 (I vaguely recall this means it is waiting for a fin/ack, but could be wrong?)
>>>>>
>>>>> FIN_WAIT2 is the state you end up in when the client side actively calls close() on the TCP session and the server has then ACKed the FIN.
>>> Jason noted:
>>>
>>>> When the issue occurs, this is what I see on the NFS Server.
>>>> tcp4       0      0 NFS.Server.IP.X.2049      NFS.Client.IP.X.51550     CLOSE_WAIT
>>>>
>>>> which corresponds to the state on the client side. The server received the FIN from the client and acked it.
>>>> The server is waiting for a close call to happen.
>>>> So the question is: Is the server also closing the connection?
>>> Did you mean to say "client closing the connection" here?
>> Yes.
>>>
>>> The server should call soclose() { it never calls soshutdown() } when soreceive(with MSG_WAIT) returns 0 bytes or an error that indicates the socket is broken.
> Btw, I looked and the soreceive() is done with MSG_DONTWAIT, but the EWOULDBLOCK is handled appropriately.
>
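As a userland illustration of the receive handling described above, here is a minimal Python sketch. It is only an analogue and not the kernel krpc code: `poll_receive` is a hypothetical helper name, a non-blocking `recv()` stands in for soreceive() with MSG_DONTWAIT, and `BlockingIOError` stands in for EWOULDBLOCK.

```python
import socket


def poll_receive(sock):
    """One non-blocking receive attempt (hypothetical helper).

    Returns "wait" if no data is available yet (the EWOULDBLOCK case),
    "data" if bytes were read, and "closed" if the peer closed the
    connection (0 bytes) or the socket is broken; in the last case the
    local socket is closed as well.
    """
    try:
        data = sock.recv(4096)
    except BlockingIOError:
        # Nothing to read right now; keep the socket and wait for the
        # next readiness notification.
        return "wait"
    except OSError:
        # Any other socket error: treat the connection as broken.
        sock.close()
        return "closed"
    if data == b"":
        # 0 bytes means the peer sent a FIN; closing here is what moves
        # the local end out of CLOSE_WAIT.
        sock.close()
        return "closed"
    return "data"
```

With a `socket.socketpair()` whose first end is set to non-blocking, the helper returns "wait" while the pair is idle, "data" after the peer sends, and "closed" once the peer has closed its end.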
>>> --> The soreceive() call is triggered by an upcall for the rcv side of the socket.
>>> So, are you saying the FreeBSD NFS server did not call soclose() for this case?
>> Yes. If the state at the server side is CLOSE_WAIT, no close call has happened yet.
>> The FIN from the client was received, it was ACKED, but no close() call (or shutdown(..., SHUT_WR) or shutdown(..., SHUT_RDWR)) was issued. Therefore, no FIN was sent and the client should be in the FIN_WAIT2 state. This was also reported. So the reported states are consistent.
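The consistency argument above can be checked with a small model of the TCP teardown states. This is a simplified sketch covering only the close-related transitions of the standard TCP state machine; the table, `step` and `client_closes` are illustrative names, not anything taken from the FreeBSD sources.

```python
# Simplified subset of the TCP state machine: close-related transitions only.
TRANSITIONS = {
    ("ESTABLISHED", "close"): "FIN_WAIT1",      # active close: send FIN
    ("FIN_WAIT1", "recv_ack"): "FIN_WAIT2",     # our FIN was ACKed
    ("FIN_WAIT2", "recv_fin"): "TIME_WAIT",     # peer's FIN arrives, ACK it
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",  # passive close: ACK peer's FIN
    ("CLOSE_WAIT", "close"): "LAST_ACK",        # application closes: send FIN
    ("LAST_ACK", "recv_ack"): "CLOSED",         # our FIN was ACKed
}


def step(state, event):
    """Return the next state for (state, event) in the simplified table."""
    return TRANSITIONS[(state, event)]


def client_closes(server_calls_close):
    """Client actively closes; return the final (client, server) states."""
    client = step("ESTABLISHED", "close")     # client sends FIN
    server = step("ESTABLISHED", "recv_fin")  # server receives and ACKs it
    client = step(client, "recv_ack")         # client sees the ACK
    if server_calls_close:
        server = step(server, "close")        # server sends its own FIN
        client = step(client, "recv_fin")     # client ACKs it
        server = step(server, "recv_ack")     # server sees the ACK
    return client, server
```

In this model, `client_closes(False)` yields `("FIN_WAIT2", "CLOSE_WAIT")`, the pair of states reported in this thread, while `client_closes(True)` yields `("TIME_WAIT", "CLOSED")`.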
> For a test, I commented out the soclose() call in the server side krpc and, when I dismounted, it did leave the server socket in CLOSE_WAIT.
> For the FreeBSD client, it did the dismount and the socket was in FIN_WAIT2 for a little while and then disappeared (someone mentioned a short timeout and that seems to be the case).
> I might argue that the Linux client should not get hung when this occurs, but there does appear to be an issue on the FreeBSD end.
>
> So it does appear you have a case where the soclose() call is not happening on the FreeBSD NFS server. I am a little surprised since I don't think I've heard of this before and the code is at least 10 years old (at least the parts related to this).
>
> For the soclose() to not happen, the reference count on the socket structure cannot have gone to zero (i.e., a SVC_RELEASE() was missed). Upon code inspection, I was not able to spot a reference counting bug.
> (Not too surprising, since a reference counting bug should have shown up long ago.)
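To make the reference-counting argument concrete, here is a toy sketch. The class and its methods are hypothetical stand-ins and do not mirror the actual krpc structures; the only point is that the close happens on the last release, so one missed release leaves the socket open.

```python
class RefCountedTransport:
    """Hypothetical stand-in for a transport that owns a socket."""

    def __init__(self):
        self.refs = 1               # the creator holds the first reference
        self.socket_closed = False

    def acquire(self):
        """Take an additional reference."""
        self.refs += 1

    def release(self):
        """Drop one reference; the last one closes the socket."""
        self.refs -= 1
        if self.refs == 0:
            # Stands in for the soclose() call in this sketch.
            self.socket_closed = True


def run(releases):
    """Create a transport, take one extra reference, release N times."""
    xprt = RefCountedTransport()
    xprt.acquire()
    for _ in range(releases):
        xprt.release()
    return xprt.socket_closed
```

Here `run(2)` balances every reference and the socket is closed, whereas `run(1)` models a single missed release: the count stays at one and the close never happens, which would leave the server end in CLOSE_WAIT.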
>
> The only thing I spotted that could conceivably explain this is that the function svc_vc_stat(), which returns the indication that the socket has been closed at the other end, did not bother to do any locking when it checked the status. (I am not yet sure if this could result in the status of XPRT_DIED being missed by the call, but if so, that would result in the soclose() call not happening.)
>
> I have attached a small patch, which I think is safe, that adds locking to svc_vc_stat(), which I am hoping you can try at some point.
> (I realize this is difficult for a production server, but...) I have tested it a little and will test it some more, to try and ensure it does not break anything.
>
> I have also cc'd mav@, since he's the guy who last worked on this code, in case he has any insight w.r.t. how the soclose() might get missed (or any other way the server socket gets stuck in CLOSE_WAIT).
>
> rick
> ps: I'll create a PR for this, so that it doesn't get forgotten.
>
> Best regards
> Michael
>
>>
>> rick
>>
>> Best regards
>> Michael
>>> This will last for ~2 min or so, but is asynchronous. However, the same 4-tuple can not be reused during this time.
>>>
>>> In other words, from the socket / TCP point of view, a properly executed active close() will end up in this state. (If the other side initiated the close, a passive close, it will not end up in this state.)
>>>
>>>
>>> _______________________________________________
>>> freebsd-net@freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"



