From: "Scheffenegger, Richard"
To: Rick Macklem, "tuexen@freebsd.org"
CC: "freebsd-net@freebsd.org", Alexander Motin
Subject: AW: NFS Mount Hangs
Date: Fri, 19 Mar 2021 16:14:01 +0000
List-Id: Networking and TCP/IP with FreeBSD

Hi Rick,

I did some reshuffling of socket-upcalls recently in the TCP stack, to
prevent some race conditions with our $work in-kernel NFS server
implementation.

Just mentioning this, as this may slightly change the timing (mostly
delaying the upcall until TCP processing is all done, whereas before an
in-kernel consumer could register for a socket upcall and do some fancy
stuff with the data sitting in the socket buffers before returning to
the TCP processing).

But I think there is no socket data handling being done in the upstream
in-kernel NFS server (and I have not even checked whether it actually
registers a socket-upcall handler).

https://reviews.freebsd.org/R10:4d0770f1725f84e8bcd059e6094b6bd29bed6cc3

If you can reproduce this easily, perhaps back out this change and see
if that has an impact...

The NFS server is, to my knowledge, the only upstream in-kernel TCP
consumer which may be impacted by this.

Richard Scheffenegger

-----Original Message-----
From: owner-freebsd-net@freebsd.org On behalf of Rick Macklem
Sent: Friday, 19 March 2021 16:58
To: tuexen@freebsd.org
Cc: Scheffenegger, Richard; freebsd-net@freebsd.org; Alexander Motin
Subject: Re: NFS Mount Hangs

Michael Tuexen wrote:
>> On 18. Mar 2021, at 21:55, Rick Macklem wrote:
>>
>> Michael Tuexen wrote:
>>>> On 18. Mar 2021, at 13:42, Scheffenegger, Richard wrote:
>>>>
>>>>>> Output from the NFS Client when the issue occurs
>>>>>> # netstat -an | grep NFS.Server.IP.X
>>>>>> tcp 0 0 NFS.Client.IP.X:46896 NFS.Server.IP.X:2049 FIN_WAIT2
>>>>> I'm no TCP guy. Hopefully others might know why the client would
>>>>> be stuck in FIN_WAIT2 (I vaguely recall this means it is waiting
>>>>> for a fin/ack, but could be wrong?)
>>>>
>>>> When the client is in Fin-Wait2 this is the state you end up when
>>>> the Client side actively close() the tcp session, and then the
>>>> server also ACKed the FIN.
>> Jason noted:
>>
>>> When the issue occurs, this is what I see on the NFS Server.
>>> tcp4 0 0 NFS.Server.IP.X.2049 NFS.Client.IP.X.51550 CLOSE_WAIT
>>>
>>> which corresponds to the state on the client side. The server
>>> received the FIN from the client and acked it.
>>> The server is waiting for a close call to happen.
>>> So the question is: Is the server also closing the connection?
>> Did you mean to say "client closing the connection here?"
>Yes.
>>
>> The server should call soclose() { it never calls soshutdown() } when
>> soreceive(with MSG_WAIT) returns 0 bytes or an error that indicates
>> the socket is broken.
Btw, I looked and the soreceive() is done with MSG_DONTWAIT, but the
EWOULDBLOCK is handled appropriately.

>> --> The soreceive() call is triggered by an upcall for the rcv side
>> of the socket.
>> So, are you saying the FreeBSD NFS server did not call soclose() for
>> this case?
>Yes. If the state at the server side is CLOSE_WAIT, no close call has
>happened yet.
>The FIN from the client was received, it was ACKED, but no close() call
>(or shutdown(..., SHUT_WR) or shutdown(..., SHUT_RDWR)) was issued.
>Therefore, no FIN was sent and the client should be in the FINWAIT-2
>state. This was also reported. So the reported states are consistent.
For a test, I commented out the soclose() call in the server side krpc
and, when I dismounted, it did leave the server socket in CLOSE_WAIT.
For the FreeBSD client, it did the dismount and the socket was in
FIN_WAIT2 for a little while and then disappeared (someone mentioned a
short timeout and that seems to be the case).

I might argue that the Linux client should not get hung when this
occurs, but there does appear to be an issue on the FreeBSD end.

So it does appear you have a case where the soclose() call is not
happening on the FreeBSD NFS server. I am a little surprised since I
don't think I've heard of this before and the code is at least 10 years
old (at least the parts related to this).

For the soclose() to not happen, the reference count on the socket
structure cannot have gone to zero. (ie a SVC_RELEASE() was missed)
Upon code inspection, I was not able to spot a reference counting bug.
(Not too surprising, since a reference counting bug should have shown
up long ago.)

The only thing I spotted that could conceivably explain this is that
the function svc_vc_stat() which returns the indication that the socket
has been closed at the other end did not bother to do any locking when
it checked the status. (I am not yet sure if this could result in the
status of XPRT_DIED being missed by the call, but if so, that would
result in the soclose() call not happening.)

I have attached a small patch, which I think is safe, that adds locking
to svc_vc_stat(), which I am hoping you can try at some point.
(I realize this is difficult for a production server, but...) I have
tested it a little and will test it some more, to try and ensure it
does not break anything.

I have also cc'd mav@, since he's the guy who last worked on this code,
in case he has any insight w.r.t. how the soclose() might get missed
(or any other way the server socket gets stuck in CLOSE_WAIT).

rick
ps: I'll create a PR for this, so that it doesn't get forgotten.

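
The attached patch is not reproduced in this message. As a rough
illustration of the idea only (checking a "peer has closed" status under
the same lock that the receive path takes when it sets it), here is a
small user-space sketch. Everything in it is hypothetical: the Transport
class, its fields and on_receive() are invented for the example, and the
XPRT_IDLE / XPRT_MOREREQS names are assumed from the classic Sun RPC
status enum; only XPRT_DIED is mentioned in the thread. It does not
reflect the actual krpc data structures or locking.

```python
import threading

# Status names: XPRT_DIED appears in the thread; the other two are
# assumed from the classic Sun RPC enum and used here for illustration.
XPRT_IDLE = "XPRT_IDLE"
XPRT_MOREREQS = "XPRT_MOREREQS"
XPRT_DIED = "XPRT_DIED"


class Transport:
    """Hypothetical stand-in for a server transport, not the krpc struct."""

    def __init__(self):
        self._lock = threading.Lock()
        self._peer_closed = False
        self._pending = 0

    def on_receive(self, nbytes):
        """Receive path: a 0-byte read means the peer closed its side."""
        with self._lock:
            if nbytes == 0:
                self._peer_closed = True
            else:
                self._pending += 1

    def stat(self):
        """Status check that takes the same lock as the receive path."""
        with self._lock:
            if self._peer_closed:
                return XPRT_DIED
            return XPRT_MOREREQS if self._pending else XPRT_IDLE


if __name__ == "__main__":
    xprt = Transport()
    receiver = threading.Thread(target=xprt.on_receive, args=(0,))
    receiver.start()
    receiver.join()
    print(xprt.stat())
```

The point of the sketch is only that the reader and the writer of the
status agree on one lock, so a status check cannot interleave with a
half-finished update. Whether the missing lock in svc_vc_stat() can
really cause XPRT_DIED to be missed is, as noted above, not yet
established.
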
Best regards
Michael
>
> rick
>
> Best regards
> Michael
>> This will last for ~2 min or so, but is asynchronous. However, the
>> same 4-tuple can not be reused during this time.
>>
>> With other words, from the socket / TCP, a properly executed active
>> close() will end up in this state. (If the other side initiated the
>> close, a passive close, will not end in this state)
>>
>>
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"