From: "Scheffenegger, Richard" <Richard.Scheffenegger@netapp.com>
To: Rick Macklem, "tuexen@freebsd.org"
CC: Youssef GHORBAL, "freebsd-net@freebsd.org"
Subject: AW: NFS Mount Hangs
Date: Mon, 12 Apr 2021 07:49:44 +0000
List-Id: Networking and TCP/IP with FreeBSD

I was trying to do some simple tests yesterday, but I don't know if these are representative:

Using an old Debian 3.16.3 Linux box as the NFS client, and simulating the disconnect with an ipfw rule, while introducing some packet drops using dummynet (I really should be adding a simple Markov-chain state machine for burst losses), to exercise some of the socket upcalls in the tcp_input code flow.

But it got too late before I arrived at any relevant results...

Richard Scheffenegger
Consulting Solution Architect
NAS & Networking

NetApp
+43 1 3676 811 3157 Direct Phone
+43 664 8866 1857 Mobile Phone
Richard.Scheffenegger@netapp.com

https://ts.la/richard49892

-----Original Message-----
From: Rick Macklem
Sent: Monday, 12 April 2021 00:50
To: Scheffenegger, Richard; tuexen@freebsd.org
Cc: Youssef GHORBAL; freebsd-net@freebsd.org
Subject: Re: NFS Mount Hangs

I should be able to test D29690 in about a week.
Note that I will not be able to tell if it fixes otis@'s hung Linux client problem.

rick

________________________________________
From: Scheffenegger, Richard
Sent: Sunday, April 11, 2021 12:54 PM
To: tuexen@freebsd.org; Rick Macklem
Cc: Youssef GHORBAL; freebsd-net@freebsd.org
Subject: Re: NFS Mount Hangs

From what I understand of Rick's statements about the socket state changing before the upcall, I can only speculate that the RST fight is over the new sessions the client attempts with the same 5-tuple, while on the server side the old, original session persists, because the NFS server never closes or shuts down the session.
But a debug-logged version of the socket upcall used by the NFS server should reveal any differences in socket state at the time of the upcall.

I would very much like to know whether D29690 addresses that problem (if it was due to releasing the lock before the upcall), or whether there are still differences between the behavior prior to my central upcall change, after that change, and with D29690 ...

________________________________
From: tuexen@freebsd.org
Sent: Sunday, April 11, 2021 2:30:09 PM
To: Rick Macklem
Cc: Scheffenegger, Richard; Youssef GHORBAL; freebsd-net@freebsd.org
Subject: Re: NFS Mount Hangs

> On 10. Apr 2021, at 23:59, Rick Macklem wrote:
>
> tuexen@freebsd.org wrote:
>> Rick wrote:
> [stuff snipped]
>>>> With r367492 you don't get the upcall with the same error state? Or you don't get an error on a write() call, when there should be one?
>> If Send-Q is 0 when the network is partitioned, after healing, the krpc sees no activity on the socket (until it acquires/processes an RPC it will not do a sosend()).
>> Without the 6-minute timeout, the RST battle goes on "forever" (I've never actually waited more than 30 minutes, which is close enough to "forever" for me).
>> --> With the 6-minute timeout, the "battle" stops after 6 minutes, when the timeout causes a soshutdown(..SHUT_WR) on the socket.
>> (Since the soshutdown() patch is not yet in "main" (I got comments, but no "reviewed" on it), the 6-minute timer won't help if enabled in main. The soclose() won't happen for TCP connections with the back channel enabled, such as Linux 4.1/4.2 ones.)
>> I'm confused.
>> So you are saying that if the Send-Q is empty when you partition the network, and the peer starts to send SYNs after the healing, FreeBSD responds with a challenge ACK which triggers the sending of a RST by Linux. This RST is ignored multiple times.
>> Is that true? Even with my patch for the bug I introduced?
> Yes and yes.
> Go take another look at linuxtofreenfs.pcap ("fetch https://people.freebsd.org/~rmacklem/linuxtofreenfs.pcap" if you don't already have it.) Look at packets #1949->2069. I use wireshark, but you'll have your favourite.
> You'll see the "RST battle" that ends after 6 minutes at packet #2069. If there is no 6-minute timeout enabled in the server-side krpc, then the battle just continues (I once let it run for about 30 minutes before giving up). The 6-minute timeout is not currently enabled in main, etc.

Hmm. I don't understand why r367492 can impact the processing of the RST, which basically destroys the TCP connection. Richard: Can you explain that?

Best regards
Michael
>
>> What version of the kernel are you using?
> "main" dated Dec. 23, 2020 + your bugfix + assorted NFS patches that are not relevant + 2 small krpc related patches.
> --> The two small krpc related patches enable the 6-minute timeout and add a soshutdown(..SHUT_WR) call when the 6-minute timeout is triggered. These have no effect until the 6 minutes are up and, without them, the "RST battle" goes on forever.
>
> Add to the above a revert of r367492 and the RST battle goes away and things behave as expected. The recovery happens quickly after the network is unpartitioned, with either 0 or 1 RSTs.
>
> rick
> ps: Once the irrelevant NFS patches make it into "main", I will upgrade to main bits-du-jour for testing.
>
> Best regards
> Michael
>>
>> If Send-Q is non-empty when the network is partitioned, the battle will not happen.
>>
>>>
>>> My understanding is that he needs this error indication when calling shutdown().
>> There are several ways the krpc notices that a TCP connection is no longer functional.
>> - An error return like EPIPE from either sosend() or soreceive().
>> - A return of 0 from soreceive() with no data (normal EOF from other end).
>> - A 6-minute timeout on the server end, when no activity has occurred on the connection. This timer is currently disabled for NFSv4.1/4.2 mounts in "main", but I enabled it for this testing, to stop the "RST battle goes on forever" during testing. I am thinking of enabling it on "main", but this crude bandaid shouldn't be thought of as a "fix for the RST battle".
>>
>>>>
>>>> From what you describe, this is on writes, isn't it? (I'm asking, as the original problem that was fixed with r367492 occurs in the read path (draining of the so_rcv buffer in the upcall right away, which subsequently influences the ACK sent by the stack).)
>>>>
>>>> I only added the so_snd buffer after some discussion about whether WAKESOR shouldn't have a symmetric equivalent in WAKESOW....
>>>>
>>>> Thus a partial backout (leaving the WAKESOR part in, but reverting the WAKESOW part) would still fix my initial problem with erroneous DSACKs (which can also lead to extremely poor performance with Linux clients), but possibly address this issue...
>>>>
>>>> Can you perhaps take MAIN and apply https://reviews.freebsd.org/D29690 for the revert only on the so_snd upcall?
>> Since the krpc only uses receive upcalls, I don't see how reverting the send side would have any effect?
>>
>>> Since the release of 13.0 is almost done, can we try to fix the issue instead of reverting the commit?
>> I think it has already shipped broken.
>> I don't know if an errata is possible, or if it will be broken until 13.1.
>>
>> --> I am much more concerned with the otis@ stuck client problem than this RST battle that only occurs after a network partitioning, especially if it is 13.0 specific. I did this testing to try to reproduce Jason's stuck client (with connection in CLOSE_WAIT) problem, which I failed to reproduce.
>>
>> rick
>>
>> Rs: agree, a good understanding of where the interaction between stack, socket and in-kernel TCP user breaks is needed;
>>
>>>
>>> If this doesn't help, some major surgery will be necessary to prevent NFS sessions with SACK enabled from transmitting DSACKs...
>>
>> My understanding is that the problem is related to getting a local error indication after receiving a RST segment too late or not at all.
>>
>> Rs: but the move of the upcall should not materially change that; I don't have a PC here to see if any upcall actually happens on RST...
>>
>> Best regards
>> Michael
>>>
>>>
>>>> I know from a printf that this happened, but whether it caused the RST battle to not happen, I don't know.
>>>>
>>>> I can put r367492 back in and do more testing if you'd like, but I think it probably needs to be reverted?
>>>
>>> Please, I don't quite understand why the exact timing of the upcall would be that critical here...
>>>
>>> A comparison of the soxxx calls and errors between the "good" and the "bad" would be perfect. I don't know if this is easy to do though, as these calls appear to be scattered all around the RPC / NFS source paths.
>>>
>>>> This does not explain the original hung Linux client problem, but does shed light on the RST war I could create by doing a network partitioning.
>>>>
>>>> rick
>>>
>>> _______________________________________________
>>> freebsd-net@freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
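
The "simple Markov-chain state machine for burst losses" mentioned at the top of this message could look roughly like the two-state (Gilbert-Elliott style) sketch below. It is only an illustration under assumed parameters: the function names, the transition probabilities and the per-state loss rates are all hypothetical, and this is plain user-space Python for generating a drop pattern, not dummynet or kernel code.

```python
import random


def gilbert_elliott_losses(n_packets, p_good_to_bad, p_bad_to_good,
                           loss_good=0.0, loss_bad=1.0, seed=None):
    """Return a list of booleans, True where a packet would be dropped.

    Hypothetical helper: the chain starts in the "good" state, and each
    state has its own drop probability (loss_good / loss_bad).
    """
    rng = random.Random(seed)
    in_bad_state = False
    dropped = []
    for _ in range(n_packets):
        # Decide the fate of this packet from the current state.
        loss_prob = loss_bad if in_bad_state else loss_good
        dropped.append(rng.random() < loss_prob)
        # Then pick the state that applies to the next packet.
        if in_bad_state:
            if rng.random() < p_bad_to_good:
                in_bad_state = False
        elif rng.random() < p_good_to_bad:
            in_bad_state = True
    return dropped


def burst_lengths(dropped):
    """Lengths of the runs of consecutive dropped packets."""
    runs, current = [], 0
    for is_dropped in dropped:
        if is_dropped:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


if __name__ == "__main__":
    # Assumed example values, chosen only to make bursts visible.
    pattern = gilbert_elliott_losses(10000, 0.01, 0.3, seed=1)
    runs = burst_lengths(pattern)
    print("packets dropped:", sum(pattern))
    print("number of bursts:", len(runs))
```

With loss_good=0.0 and loss_bad=1.0, every visit to the bad state drops a run of packets whose mean length is 1/p_bad_to_good, so losses arrive in bursts rather than being spread evenly, as they would be with an independent per-packet drop probability.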