From: Rick Macklem
To: "tuexen@freebsd.org"
CC: "Scheffenegger, Richard", Youssef GHORBAL, "freebsd-net@freebsd.org"
Subject: Re: NFS Mount Hangs
Date: Sat, 10 Apr 2021 21:59:51 +0000
List-Id: Networking and TCP/IP with FreeBSD

tuexen@freebsd.org wrote:
>Rick wrote:
[stuff snipped]
>>> With r367492 you don't get the upcall with the same error state? Or you don't get an error on a write() call, when there should be one?
> If Send-Q is 0 when the network is partitioned, after healing, the krpc sees no activity on
> the socket (until it acquires/processes an RPC it will not do a sosend()).
> Without the 6 minute timeout, the RST battle goes on "forever" (I've never actually
> waited more than 30 minutes, which is close enough to "forever" for me).
> --> With the 6 minute timeout, the "battle" stops after 6 minutes, when the timeout
>     causes a soshutdown(..SHUT_WR) on the socket.
>     (The soshutdown() patch is not yet in "main": I got comments, but no "reviewed"
>      on it. So the 6 minute timer won't help if enabled in main. The soclose() won't happen
>      for TCP connections with the back channel enabled, such as Linux 4.1/4.2 ones.)
>I'm confused. So you are saying that if the Send-Q is empty when you partition the
>network, and the peer starts to send SYNs after the healing, FreeBSD responds
>with a challenge ACK which triggers the sending of a RST by Linux. This RST is
>ignored multiple times.
>Is that true? Even with my patch for the bug I introduced?
Yes and yes.
Go take another look at linuxtofreenfs.pcap
("fetch https://people.freebsd.org/~rmacklem/linuxtofreenfs.pcap" if you don't
already have it.)
Look at packets #1949->2069. I use wireshark, but you'll have your favourite.
You'll see the "RST battle" that ends after
6 minutes at packet #2069. If there is no 6 minute timeout enabled in the
server side krpc, then the battle just continues (I once let it run for about
30 minutes before giving up). The 6 minute timeout is not currently enabled
in main, etc.

>What version of the kernel are you using?
"main" dated Dec.
23, 2020 + your bugfix + assorted NFS patches that
are not relevant + 2 small krpc related patches.
--> The two small krpc related patches enable the 6 minute timeout and
    add a soshutdown(..SHUT_WR) call when the 6 minute timeout is
    triggered. These have no effect until the 6 minutes is up and, without
    them, the "RST battle" goes on forever.

Add to the above a revert of r367492 and the RST battle goes away and things
behave as expected. The recovery happens quickly after the network is
unpartitioned, with either 0 or 1 RSTs.

rick
ps: Once the irrelevant NFS patches make it into "main", I will upgrade to
    main bits du jour for testing.

Best regards
Michael
>
> If Send-Q is non-empty when the network is partitioned, the battle will not happen.
>
>>
>> My understanding is that he needs this error indication when calling shutdown().
> There are several ways the krpc notices that a TCP connection is no longer functional.
> - An error return like EPIPE from either sosend() or soreceive().
> - A return of 0 from soreceive() with no data (normal EOF from other end).
> - A 6 minute timeout on the server end, when no activity has occurred on the
>   connection. This timer is currently disabled for NFSv4.1/4.2 mounts in "main",
>   but I enabled it for this testing, to stop the "RST battle goes on forever"
>   during testing. I am thinking of enabling it on "main", but this crude bandaid
>   shouldn't be thought of as a "fix for the RST battle".
>
>>>
>>> From what you describe, this is on writes, isn't it?
(I'm asking, as the original problem that was fixed with r367492 occurs in the read path (draining of the so_rcv buffer in the upcall right away, which subsequently influences the ACK sent by the stack).)
>>>
>>> I only added the so_snd buffer after some discussion on whether the WAKESOR shouldn't have a symmetric equivalent in WAKESOW....
>>>
>>> Thus a partial backout (leaving the WAKESOR part inside, but reverting the WAKESOW part) would still fix my initial problem about erroneous DSACKs (which can also lead to extremely poor performance with Linux clients), but possibly address this issue...
>>>
>>> Can you perhaps take MAIN and apply https://reviews.freebsd.org/D29690 for the revert only on the so_snd upcall?
> Since the krpc only uses receive upcalls, I don't see how reverting the send side would have
> any effect?
>
>> Since the release of 13.0 is almost done, can we try to fix the issue instead of reverting the commit?
> I think it has already shipped broken.
> I don't know if an erratum is possible, or if it will be broken until 13.1.
>
> --> I am much more concerned with the otis@ stuck client problem than this RST battle that only
>     occurs after a network partitioning, especially if it is 13.0 specific.
>     I did this testing to try to reproduce Jason's stuck client (with connection in CLOSE_WAIT)
>     problem, which I failed to reproduce.
>
> rick
>
> Rs: agree, a good understanding of where the interaction between stack, socket and in-kernel TCP user breaks is needed;
>
>>
>> If this doesn't help, some major surgery will be necessary to prevent NFS sessions with SACK enabled from transmitting DSACKs...
>
> My understanding is that the problem is related to getting a local error indication after
> receiving a RST segment too late or not at all.
>
> Rs: but the move of the upcall should not materially change that; I don't have
a pc here to see if any upcall actually happens on RST...
>
> Best regards
> Michael
>>
>>
>>> I know from a printf that this happened, but whether it caused the RST battle to not happen, I don't know.
>>>
>>> I can put r367492 back in and do more testing if you'd like, but I think it probably needs to be reverted?
>>
>> Please, I don't quite understand why the exact timing of the upcall would be that critical here...
>>
>> A comparison of the soxxx calls and errors between the "good" and the "bad" would be perfect. I don't know if this is easy to do though, as these calls appear to be scattered all around the RPC / NFS source paths.
>>
>>> This does not explain the original hung Linux client problem, but does shed light on the RST war I could create by doing a network partitioning.
>>>
>>> rick
>>
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
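
The three ways the krpc is described above as noticing a dead TCP connection
(an error return such as EPIPE, a 0 return from a receive, and a 6 minute idle
timeout) can be sketched in userland C. This is only an illustration under
stated assumptions and is not the krpc code: classify_conn(), is_transient(),
enum conn_verdict and IDLE_LIMIT_SECS are hypothetical names, and treating
EAGAIN/EWOULDBLOCK/EINTR as "no data yet" is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical: idle limit mirroring the 6 minute timeout in the thread. */
#define IDLE_LIMIT_SECS (6 * 60)

/* Hypothetical verdicts; these names do not exist in the FreeBSD krpc. */
enum conn_verdict { CONN_OK, CONN_ERROR, CONN_EOF, CONN_IDLE_TIMEOUT };

/* Assumption: these errno values mean "nothing to read yet", not failure. */
static int
is_transient(int err)
{
	return (err == EAGAIN || err == EWOULDBLOCK || err == EINTR);
}

/*
 * Classify a connection from the result of one receive attempt.
 * io_result follows the recv(2) convention: >0 bytes read, 0 for EOF,
 * <0 for failure with the reason in io_errno.
 */
enum conn_verdict
classify_conn(ssize_t io_result, int io_errno, time_t last_activity,
    time_t now)
{
	assert(now >= last_activity);

	if (io_result > 0)
		return (CONN_OK);		/* data arrived: still alive */
	if (io_result == 0)
		return (CONN_EOF);		/* normal EOF from other end */
	if (!is_transient(io_errno))
		return (CONN_ERROR);		/* e.g. EPIPE, ECONNRESET */
	if (now - last_activity >= IDLE_LIMIT_SECS)
		return (CONN_IDLE_TIMEOUT);	/* idle for 6 minutes or more */
	return (CONN_OK);			/* quiet, but not for long enough */
}
```

In this sketch only the last two branches depend on the clock. That matches the
point made in the thread: when Send-Q is empty and no RPC is being processed,
no send or receive error surfaces, so the idle timer is the only path left.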