Date:      Sun, 21 Mar 2021 09:26:33 -0400
From:      Jason Breitman <jbreitman@tildenparkcapital.com>
To:        Rick Macklem <rmacklem@uoguelph.ca>
Cc:        "tuexen@freebsd.org" <tuexen@freebsd.org>, "Scheffenegger, Richard" <Richard.Scheffenegger@netapp.com>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, Alexander Motin <mav@FreeBSD.org>
Subject:   Re: NFS Mount Hangs
Message-ID:  <B228BB92-089C-4C98-9EAD-23F65EB5AEC1@tildenparkcapital.com>
In-Reply-To: <F9F3122F-D5B5-4D4E-8C6B-89EDFF0ED6C4@tildenparkcapital.com>
References:  <C643BB9C-6B61-4DAC-8CF9-CE04EA7292D0@tildenparkcapital.com> <3750001D-3F1C-4D9A-A9D9-98BCA6CA65A4@tildenparkcapital.com> <33693DE3-7FF8-4FAB-9A75-75576B88A566@tildenparkcapital.com> <YQXPR0101MB0968DC18E00833DE2969C636DD6A9@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM> <SN4PR0601MB3728780CE9ADAB144B3B681486699@SN4PR0601MB3728.namprd06.prod.outlook.com> <2890D243-AF46-43A4-A1AD-CB0C3481511D@lurchi.franken.de> <YQXPR0101MB0968D2362456D43DF528A7E9DD699@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM> <9EE3DFAC-72B0-4256-B57C-DE6AA811413C@freebsd.org> <YQXPR0101MB0968E1537E26CDBDC31C58E5DD689@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM> <BC4B2749-F92D-4F47-BA3F-6C77AF015AF8@tildenparkcapital.com> <YQXPR0101MB096861283CE48450784B7766DD689@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM> <F9F3122F-D5B5-4D4E-8C6B-89EDFF0ED6C4@tildenparkcapital.com>


The issue did trigger again.
I ran the script below for ~15 minutes and hope this gets you what you need.
Let me know if you require the full output without grepping nfsd.


#!/bin/sh
# Log the nfsd thread states once a minute so the hang can be caught
# in the act.  2947 and 2944 are the nfsd pids on this server.

while true
do
   /bin/date >> /tmp/nfs-hang.log
   /bin/ps axHl | grep nfsd | grep -v grep >> /tmp/nfs-hang.log
   /usr/bin/procstat -kk 2947 >> /tmp/nfs-hang.log
   /usr/bin/procstat -kk 2944 >> /tmp/nfs-hang.log
   /bin/sleep 60
done


On the NFS Server
Active Internet connections (including servers)
Proto Recv-Q Send-Q Local Address          Foreign Address        (state)
tcp4       0      0 NFS.Server.IP.X.2049      NFS.Client.IP.X.48286     CLOSE_WAIT

On the NFS Client
tcp        0      0 NFS.Client.IP.X:48286      NFS.Server.IP.X:2049       FIN_WAIT2






Jason Breitman


On Mar 19, 2021, at 5:17 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:

Jason Breitman wrote:
> Thank you for your focus on the issue I am having and I look forward to seeing your patch ported to FreeBSD 12.X.
I'll only be committing the patch if I am convinced it actually fixes something.
I'll be looking more closely at it and seeing what mav@ thinks about it.

> I also appreciate that you understand the difficulties in testing changes on a core piece of infrastructure.
>
> I will let the group know if the issue occurs following the change that disabled TSO and LRO on my FreeBSD NFS Server NICs.
> Those types of settings are less disruptive and pose less risk to apply in production, so I am happy to do so. That change was made on 3/17/2021.
> My hope is that this issue is a race condition specific to my NIC handling certain TCP operations.
>
> I am happy that you are approaching the issue from multiple angles.
Well, another angle would be an nfsd thread that is "stuck" while doing
an RPC. If that happens, it won't call svc_freereq(), which derefs the
structure.
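
(For context, each nfsd service thread loops roughly like the sketch below.
This is a loose paraphrase of svc_run_internal() in sys/rpc/svc.c; the
helper names are made up for illustration, only svc_freereq() is real.)

/*
 * Loose paraphrase of the krpc service loop (svc_run_internal() in
 * sys/rpc/svc.c); helper names are hypothetical.
 */
for (;;) {
        /* Wait for, receive and decode the next RPC request.
         * This takes a reference on the transport. */
        rqstp = wait_for_request(xprt);         /* hypothetical helper */

        /* Execute the NFS operation.  If the thread gets stuck in
         * here (e.g. on a vnode lock), it never reaches the line
         * below... */
        execute_request(rqstp);                 /* hypothetical helper */

        /* ...so this deref of the request/transport never happens
         * and the socket can never be soclose()d. */
        svc_freereq(rqstp);
}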
If it happens again and you can, please go onto the FreeBSD server
and perform the following commands a few times, a few minutes apart,
capturing the output:
# ps axHl
# procstat -kk <pid of nfsd (server) as listed by "ps ax">
If the output doesn't have proprietary or security stuff, email me the results.
(Or if you can sanitize it.)
--> I'll be looking for an nfsd thread that is sleeping on the same thing
without its CPU time increasing (except for rpcsvc, which is just a thread
waiting for an RPC to work on).
--> The procstat -kk gives more information w.r.t. what the thread is up to.

Good luck with it, rick

Thanks.

Jason Breitman


On Mar 19, 2021, at 11:58 AM, Rick Macklem <rmacklem@uoguelph.ca> wrote:

Michael Tuexen wrote:
>> On 18. Mar 2021, at 21:55, Rick Macklem <rmacklem@uoguelph.ca> wrote:
>>
>> Michael Tuexen wrote:
>>>> On 18. Mar 2021, at 13:42, Scheffenegger, Richard <Richard.Scheffenegger@netapp.com> wrote:
>>>>
>>>>>> Output from the NFS Client when the issue occurs # netstat -an | grep
>>>>>> NFS.Server.IP.X
>>>>>> tcp 0 0 NFS.Client.IP.X:46896 NFS.Server.IP.X:2049 FIN_WAIT2
>>>>> I'm no TCP guy. Hopefully others might know why the client would be stuck in FIN_WAIT2 (I vaguely recall this means it is waiting for a fin/ack, but could be wrong?)
>>>>
>>>> When the client is in FIN_WAIT_2, this is the state you end up in when the client side actively close()s the TCP session and the server has ACKed the FIN.
>> Jason noted:
>>
>>> When the issue occurs, this is what I see on the NFS Server.
>>> tcp4 0 0 NFS.Server.IP.X.2049 NFS.Client.IP.X.51550 CLOSE_WAIT
>>>
>>> which corresponds to the state on the client side. The server received the FIN
>>> from the client and acked it.
>>> The server is waiting for a close call to happen.
>>> So the question is: Is the server also closing the connection?
>> Did you mean to say "client closing the connection" here?
> Yes.
>>
>> The server should call soclose() { it never calls soshutdown() } when
>> soreceive(with MSG_WAIT) returns 0 bytes or an error that indicates
>> the socket is broken.
Btw, I looked and the soreceive() is done with MSG_DONTWAIT, but the
EWOULDBLOCK is handled appropriately.
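
(Roughly, that receive path looks like the following sketch; a simplified
paraphrase of svc_vc_recv() in sys/rpc/svc_vc.c, not the literal code.
"so" is the server socket and "cd" the per-connection data.)

/*
 * Paraphrase of the socket-upcall receive path (svc_vc_recv() in
 * sys/rpc/svc_vc.c); simplified fragment, names approximate.
 */
struct mbuf *m = NULL;
int error, rcvflag = MSG_DONTWAIT;

error = soreceive(so, NULL, &uio, &m, NULL, &rcvflag);

if (error == EWOULDBLOCK)
        /* No data queued yet; just wait for the next upcall. */
        return (FALSE);
if (error != 0 || m == NULL) {
        /* Fatal socket error, or EOF (0 bytes) because the client
         * sent a FIN.  Mark the transport dead; the soclose() should
         * then happen once the last reference is released. */
        cd->strm_stat = XPRT_DIED;
        return (FALSE);
}
/* Otherwise append the data and look for a complete RPC record. */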

>> --> The soreceive() call is triggered by an upcall for the rcv side of the socket.
>> So, are you saying the FreeBSD NFS server did not call soclose() for this case?
> Yes. If the state at the server side is CLOSE_WAIT, no close call has happened yet.
> The FIN from the client was received, it was ACKED, but no close() call
> (or shutdown(..., SHUT_WR) or shutdown(..., SHUT_RDWR)) was issued. Therefore,
> no FIN was sent and the client should be in the FIN_WAIT_2 state. This was also
> reported. So the reported states are consistent.
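
(As an aside, this state pair is easy to reproduce outside of NFS. Below is
a hypothetical stand-alone demo, not from the thread; the port number is
arbitrary. Run it and check "netstat -an | grep 35049" from another shell:
the accepted side sits in CLOSE_WAIT, while the actively closed side shows
FIN_WAIT_2 until the FIN_WAIT_2 timeout reaps it.)

/*
 * Hypothetical demo: reproduce the CLOSE_WAIT / FIN_WAIT_2 pair by
 * closing one end of a loopback TCP connection and never closing
 * the other.  Build with "cc demo.c" and watch netstat.
 */
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <err.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
        struct sockaddr_in sin;
        int lsock, csock, ssock;

        memset(&sin, 0, sizeof(sin));
        sin.sin_family = AF_INET;
        sin.sin_port = htons(35049);            /* arbitrary test port */
        sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

        if ((lsock = socket(AF_INET, SOCK_STREAM, 0)) < 0)
                err(1, "socket");
        if (bind(lsock, (struct sockaddr *)&sin, sizeof(sin)) < 0)
                err(1, "bind");
        if (listen(lsock, 1) < 0)
                err(1, "listen");

        if ((csock = socket(AF_INET, SOCK_STREAM, 0)) < 0)
                err(1, "socket");
        if (connect(csock, (struct sockaddr *)&sin, sizeof(sin)) < 0)
                err(1, "connect");
        if ((ssock = accept(lsock, NULL, NULL)) < 0)
                err(1, "accept");

        /* Active close on the "client": sends FIN -> FIN_WAIT_1/2. */
        close(csock);

        /* The "server" never calls close(ssock), so its end sits in
         * CLOSE_WAIT indefinitely, the same signature as the hung
         * NFS mounts. */
        printf("check netstat now; sleeping...\n");
        sleep(300);
        return (0);
}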
For a test, I commented out the soclose() call in the server side krpc and, when I
dismounted, it did leave the server socket in CLOSE_WAIT.
For the FreeBSD client, it did the dismount and the socket was in FIN_WAIT2
for a little while and then disappeared (someone mentioned a short timeout
and that seems to be the case).
I might argue that the Linux client should not get hung when this occurs,
but there does appear to be an issue on the FreeBSD end.

So it does appear you have a case where the soclose() call is not happening
on the FreeBSD NFS server. I am a little surprised since I don't think I've
heard of this before and the code is at least 10 years old (at least the parts
related to this).

For the soclose() to not happen, the reference count on the socket
structure cannot have gone to zero (i.e., a SVC_RELEASE() was missed).
Upon code inspection, I was not able to spot a reference counting bug.
(Not too surprising, since a reference counting bug should have shown
up long ago.)
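
(The pattern being inspected is roughly the following; a paraphrase of the
macros in sys/rpc/svc.h, so take the details as approximate.)

/*
 * Paraphrase of the transport reference counting (sys/rpc/svc.h);
 * approximate.  The poll loop and every in-progress request hold
 * a reference.
 */
#define SVC_ACQUIRE(xprt)                               \
        refcount_acquire(&(xprt)->xp_refs)

#define SVC_RELEASE(xprt)                               \
        do {                                            \
                if (refcount_release(&(xprt)->xp_refs)) \
                        SVC_DESTROY(xprt);              \
        } while (0)

/* SVC_DESTROY() on a TCP transport ends up in svc_vc_destroy(),
 * which is where the soclose() happens.  One missed SVC_RELEASE()
 * keeps xp_refs above zero forever, so the socket stays open and
 * the server side sits in CLOSE_WAIT. */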

The only thing I spotted that could conceivably explain this is that the
function svc_vc_stat() which returns the indication that the socket has
been closed at the other end did not bother to do any locking when
it checked the status. (I am not yet sure if this could result in the
status of XPRT_DIED being missed by the call, but if so, that would
result in the soclose() call not happening.)

I have attached a small patch, which I think is safe, that adds locking
to svc_vc_stat(), which I am hoping you can try at some point.
(I realize this is difficult for a production server, but...)
I have tested it a little and will test it some more, to try and ensure
it does not break anything.
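
(The attachment is not reproduced here. Conceptually the change is along
the lines of the sketch below, assuming the transport's xp_lock is the
lock used and that the cf_conn fields are as I remember them; this is an
approximation of the idea, not the literal patch.)

/*
 * Sketch of svc_vc_stat() (sys/rpc/svc_vc.c) with locking added
 * around the status check; an approximation, not the attached patch.
 */
static enum xprt_stat
svc_vc_stat(SVCXPRT *xprt)
{
        struct cf_conn *cd = (struct cf_conn *)xprt->xp_p1;
        enum xprt_stat stat = XPRT_IDLE;

        sx_xlock(&xprt->xp_lock);       /* previously checked unlocked */
        if (cd->strm_stat == XPRT_DIED)
                stat = XPRT_DIED;       /* leads to the soclose() */
        else if (cd->mreq != NULL && cd->resid == 0 && cd->eor)
                stat = XPRT_MOREREQS;   /* complete request buffered */
        sx_xunlock(&xprt->xp_lock);

        if (stat == XPRT_IDLE && soreadable(xprt->xp_socket))
                stat = XPRT_MOREREQS;

        return (stat);
}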

I have also cc'd mav@, since he's the guy who last worked on this
code, in case he has any insight w.r.t. how the soclose() might get
missed (or any other way the server socket gets stuck in CLOSE_WAIT).

rick
ps: I'll create a PR for this, so that it doesn't get forgotten.

Best regards
Michael

>
> rick
>
> Best regards
> Michael
>> This will last for ~2 min or so, but is asynchronous. However, the same 4-tuple cannot be reused during this time.
>>
>> In other words, from the socket / TCP point of view, a properly executed active close() will end up in this state. (If the other side initiated the close, i.e. a passive close, it will not end in this state.)

<xprtdied.patch>








