Date:      Fri, 19 Mar 2021 16:14:01 +0000
From:      "Scheffenegger, Richard" <Richard.Scheffenegger@netapp.com>
To:        Rick Macklem <rmacklem@uoguelph.ca>, "tuexen@freebsd.org" <tuexen@freebsd.org>
Cc:        "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, Alexander Motin <mav@FreeBSD.org>
Subject:   AW: NFS Mount Hangs
Message-ID:  <SN4PR0601MB372895EE1F6DDFA830D4B7AC86689@SN4PR0601MB3728.namprd06.prod.outlook.com>
In-Reply-To: <YQXPR0101MB0968E1537E26CDBDC31C58E5DD689@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
References:  <C643BB9C-6B61-4DAC-8CF9-CE04EA7292D0@tildenparkcapital.com> <3750001D-3F1C-4D9A-A9D9-98BCA6CA65A4@tildenparkcapital.com> <33693DE3-7FF8-4FAB-9A75-75576B88A566@tildenparkcapital.com> <YQXPR0101MB0968DC18E00833DE2969C636DD6A9@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM> <SN4PR0601MB3728780CE9ADAB144B3B681486699@SN4PR0601MB3728.namprd06.prod.outlook.com> <2890D243-AF46-43A4-A1AD-CB0C3481511D@lurchi.franken.de> <YQXPR0101MB0968D2362456D43DF528A7E9DD699@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM> <9EE3DFAC-72B0-4256-B57C-DE6AA811413C@freebsd.org> <YQXPR0101MB0968E1537E26CDBDC31C58E5DD689@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
Hi Rick,

I did some reshuffling of socket upcalls recently in the TCP stack, to prevent some race conditions with our $work in-kernel NFS server implementation.

Just mentioning this, as it may slightly change the timing (it mostly delays the upcall until TCP processing is all done, whereas before an in-kernel consumer could register for a socket upcall, do some fancy stuff with the data sitting in the socket buffers, and then return to the TCP processing).

But I think there is no socket data handling being done in the upstream in-kernel NFS server (and I have not even checked whether it actually registers a socket-upcall handler).

https://reviews.freebsd.org/R10:4d0770f1725f84e8bcd059e6094b6bd29bed6cc3

If you can reproduce this easily, perhaps back out this change and see if that has an impact...

The NFS server is, to my knowledge, the only upstream in-kernel TCP consumer that may be impacted by this.

Richard Scheffenegger

-----Original Message-----
From: owner-freebsd-net@freebsd.org <owner-freebsd-net@freebsd.org> On Behalf Of Rick Macklem
Sent: Friday, 19 March 2021 16:58
To: tuexen@freebsd.org
Cc: Scheffenegger, Richard <Richard.Scheffenegger@netapp.com>; freebsd-net@freebsd.org; Alexander Motin <mav@FreeBSD.org>
Subject: Re: NFS Mount Hangs

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.

Michael Tuexen wrote:
>> On 18. Mar 2021, at 21:55, Rick Macklem <rmacklem@uoguelph.ca> wrote:
>>
>> Michael Tuexen wrote:
>>>> On 18. Mar 2021, at 13:42, Scheffenegger, Richard <Richard.Scheffenegger@netapp.com> wrote:
>>>>
>>>>>> Output from the NFS client when the issue occurs:
>>>>>> # netstat -an | grep NFS.Server.IP.X
>>>>>> tcp        0      0  NFS.Client.IP.X:46896     NFS.Server.IP.X:2049      FIN_WAIT2
>>>>> I'm no TCP guy. Hopefully others might know why the client would
>>>>> be stuck in FIN_WAIT2 (I vaguely recall this means it is waiting
>>>>> for a fin/ack, but could be wrong?)
>>>>
>>>> FIN_WAIT_2 is the state the client ends up in when it actively close()s the TCP session and the server has then ACKed the FIN.
>> Jason noted:
>>
>>> When the issue occurs, this is what I see on the NFS server.
>>> tcp4       0      0  NFS.Server.IP.X.2049      NFS.Client.IP.X.51550     CLOSE_WAIT
>>>
>>> which corresponds to the state on the client side. The server
>>> received the FIN from the client and acked it.
>>> The server is waiting for a close call to happen.
>>> So the question is: Is the server also closing the connection?
>> Did you mean to say "client closing the connection" here?
> Yes.
>>
>> The server should call soclose() { it never calls soshutdown() } when
>> soreceive() (with MSG_WAIT) returns 0 bytes or an error that indicates
>> the socket is broken.
Btw, I looked and the soreceive() is done with MSG_DONTWAIT, but EWOULDBLOCK is handled appropriately.
>> --> The soreceive() call is triggered by an upcall for the rcv side of the socket.
>> So, are you saying the FreeBSD NFS server did not call soclose() for this case?
> Yes. If the state at the server side is CLOSE_WAIT, no close call has happened yet.
> The FIN from the client was received and ACKed, but no close() call
> (or shutdown(..., SHUT_WR) or shutdown(..., SHUT_RDWR)) was issued.
> Therefore, no FIN was sent and the client should be in the FIN-WAIT-2
> state. This was also reported. So the reported states are consistent.
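[Editorial note: the upcall/soreceive()/soclose() interplay discussed above is easier to follow in code. Below is a minimal, hedged sketch of that pattern using the stock FreeBSD socket KPI (soupcall_set(), soupcall_clear(), soreceive() with MSG_DONTWAIT, soclose()); the my_* structure and helpers are illustrative stand-ins, not the actual sys/rpc/svc_vc.c code.]

/*
 * A minimal sketch, assuming the stock FreeBSD socket KPI: how an in-kernel
 * TCP consumer (the krpc used by the NFS server is one example) can register
 * a receive upcall and later drain the socket with a non-blocking
 * soreceive().  The my_* names are illustrative stand-ins, not svc_vc.c.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/errno.h>
#include <sys/kernel.h>
#include <sys/mbuf.h>
#include <sys/proc.h>
#include <sys/socket.h>
#include <sys/socketvar.h>
#include <sys/uio.h>

struct my_xprt {
	struct socket	*mx_so;		/* connection socket */
	volatile int	 mx_active;	/* set by the upcall, consumed by a thread */
	volatile int	 mx_dead;	/* peer sent FIN or a socket error was seen */
};

/*
 * Receive upcall: runs from TCP input with the receive socket-buffer lock
 * held, so it only flags the transport and wakes a service thread.  The
 * change Richard mentions moves the point in TCP processing where this gets
 * called, which is why it could affect timing for such consumers.
 */
static int
my_rcv_upcall(struct socket *so, void *arg, int waitflag)
{
	struct my_xprt *xp = arg;

	xp->mx_active = 1;
	wakeup(xp);			/* a real consumer queues work instead */
	return (SU_OK);
}

static void
my_register(struct my_xprt *xp)
{
	struct socket *so = xp->mx_so;

	SOCKBUF_LOCK(&so->so_rcv);
	soupcall_set(so, SO_RCV, my_rcv_upcall, xp);
	SOCKBUF_UNLOCK(&so->so_rcv);
}

/*
 * Service-thread side: non-blocking receive.  EWOULDBLOCK means "no data
 * yet"; success with no mbufs returned means the peer closed its end, and
 * this is where the server must eventually call soclose(), or the
 * connection sits in CLOSE_WAIT.
 */
static void
my_drain(struct my_xprt *xp)
{
	struct socket *so = xp->mx_so;
	struct uio uio;
	struct mbuf *m = NULL;
	int error, rcvflag = MSG_DONTWAIT;

	memset(&uio, 0, sizeof(uio));
	uio.uio_resid = 1000000000;	/* "whatever is available" */
	uio.uio_td = curthread;

	error = soreceive(so, NULL, &uio, &m, NULL, &rcvflag);
	if (error == EWOULDBLOCK)
		return;			/* nothing to read right now */
	if (error != 0 || m == NULL) {
		/* Peer closed or the socket broke: tear the connection down. */
		SOCKBUF_LOCK(&so->so_rcv);
		soupcall_clear(so, SO_RCV);
		SOCKBUF_UNLOCK(&so->so_rcv);
		xp->mx_dead = 1;
		soclose(so);		/* without this, CLOSE_WAIT persists */
		return;
	}
	/* ... hand the mbuf chain to the RPC/NFS request parser ... */
	m_freem(m);
}

The point relevant to the hang: if the branch that detects EOF is never reached, or the "dead" state it records is never acted on, soclose() is never called and the server-side socket stays in CLOSE_WAIT, which is exactly the state reported above.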
For a test, I commented out the soclose() call in the server-side krpc and, when I dismounted, it did leave the server socket in CLOSE_WAIT.
For the FreeBSD client, it did the dismount and the socket was in FIN_WAIT2 for a little while and then disappeared (someone mentioned a short timeout and that seems to be the case).
I might argue that the Linux client should not get hung when this occurs, but there does appear to be an issue on the FreeBSD end.

So it does appear you have a case where the soclose() call is not happening on the FreeBSD NFS server. I am a little surprised, since I don't think I've heard of this before and the code is at least 10 years old (at least the parts related to this).

For the soclose() to not happen, the reference count on the socket structure cannot have gone to zero (i.e. an SVC_RELEASE() was missed). Upon code inspection, I was not able to spot a reference counting bug.
(Not too surprising, since a reference counting bug should have shown up long ago.)

The only thing I spotted that could conceivably explain this is that the function svc_vc_stat(), which returns the indication that the socket has been closed at the other end, did not bother to do any locking when it checked the status. (I am not yet sure if this could result in the status of XPRT_DIED being missed by the call, but if so, that would result in the soclose() call not happening.)

I have attached a small patch, which I think is safe, that adds locking to svc_vc_stat(), which I am hoping you can try at some point.
(I realize this is difficult for a production server, but...) I have tested it a little and will test it some more, to try and ensure it does not break anything.

I have also cc'd mav@, since he's the guy who last worked on this code, in case he has any insight w.r.t. how the soclose() might get missed (or any other way the server socket gets stuck in CLOSE_WAIT).

rick
ps: I'll create a PR for this, so that it doesn't get forgotten.

Best regards
Michael
>
> rick
>
> Best regards
> Michael
>> This will last for ~2 min or so, but is asynchronous. However, the same 4-tuple cannot be reused during this time.
>>
>> In other words, from the socket / TCP side, a properly executed active
>> close() will end up in this state. (If the other side initiated the
>> close, i.e. a passive close, it will not end up in this state.)

_______________________________________________
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
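[Editorial note: the small patch Rick mentions is an attachment and is not part of the archived text, so it is not reproduced here. Purely as a hedged illustration of the kind of change he describes for svc_vc_stat(), i.e. taking a lock around the status check so a concurrent transition to XPRT_DIED cannot be missed, a self-contained sketch might look like the following; the example_conn structure and the choice of xp_lock are assumptions for illustration, not the actual patch.]

/*
 * Illustrative sketch only: NOT the patch attached to the original mail.
 * It shows the general idea of checking the connection status under the
 * transport lock so that a concurrent transition to XPRT_DIED on the
 * receive side is not missed.  struct example_conn stands in for the
 * private per-connection data kept in sys/rpc/svc_vc.c; see that file and
 * the PR for the real code.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/lock.h>
#include <sys/sx.h>
#include <sys/socket.h>
#include <sys/socketvar.h>

#include <rpc/rpc.h>
#include <rpc/rpc_com.h>

/* Stand-in for the krpc's private per-connection data. */
struct example_conn {
	enum xprt_stat	strm_stat;	/* set to XPRT_DIED when EOF/error is seen */
};

static enum xprt_stat
example_vc_stat(SVCXPRT *xprt)
{
	struct example_conn *cd = (struct example_conn *)xprt->xp_p1;
	enum xprt_stat stat = XPRT_IDLE;

	/*
	 * Serialize with the receive path that may be marking the
	 * connection dead; an unlocked read could return XPRT_IDLE
	 * instead of XPRT_DIED and the transport would never be released.
	 */
	sx_xlock(&xprt->xp_lock);
	if (cd->strm_stat == XPRT_DIED)
		stat = XPRT_DIED;
	else if (soreadable(xprt->xp_socket))
		stat = XPRT_MOREREQS;
	sx_xunlock(&xprt->xp_lock);

	return (stat);
}

Whether an unlocked read in svc_vc_stat() can actually observe a stale status is exactly the open question in the mail above; the sketch only shows where such locking would sit, not what the committed fix looks like.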