From owner-freebsd-net@freebsd.org Sun Apr 4 01:49:21 2021
From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 238411] [igb] wake on lan not working with Intel I210
Date: Sun, 04 Apr 2021 01:49:20 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=238411

Dutchman01 changed:

           What    |Removed        |Added
----------------------------------------------------------------------------
                 CC|               |dutchman01@quicknet.nl

--- Comment #6 from Dutchman01 ---
Intel has released new/updated drivers for FreeBSD: Version 2.5.16
(latest), dated 8/11/2020. I don't know why they have not yet been
updated in freebsd/src; they fix some issues for the I210, among others.

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-net@freebsd.org Sun Apr 4 01:50:23 2021
From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 238411] [igb] wake on lan not working with Intel I210
Date: Sun, 04 Apr 2021 01:50:23 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=238411

--- Comment #7 from Dutchman01 ---
(In reply to Dutchman01 from comment #6)
I forgot to add the link:
https://downloadcenter.intel.com/download/15815/Intel-Network-Adapter-Driver-for-82575-6-and-82580-Based-Gigabit-Network-Connections-under-FreeBSD-

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-net@freebsd.org Sun Apr 4 08:55:43 2021
From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 254303] Fatal trap 12: page fault while in kernel mode ((frr 7.5_1 + Freebsd 13 Beta3) zebra crashes server when routes are populated)
Date: Sun, 04 Apr 2021 08:55:43 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254303

--- Comment #22 from Alexander V. Chernikov ---
All relevant patches are in 13-R. Does it fix the issue for you?

-- 
You are receiving this mail because:
You are on the CC list for the bug.

From owner-freebsd-net@freebsd.org Sun Apr 4 08:59:48 2021
From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 189088] Assigning same IP to multiple interfaces in different FIBs only creates a host route for the first
Date: Sun, 04 Apr 2021 08:59:48 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=189088

Alexander V. Chernikov changed:

           What    |Removed        |Added
----------------------------------------------------------------------------
             Status|Open           |In Progress

--- Comment #16 from Alexander V. Chernikov ---
The desired changes landed in 13-R. Due to the differences in the routing
stack, I'm not going to merge them to 12/11.
I'm going to close this one on April 11, unless there are any objections.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

From owner-freebsd-net@freebsd.org Sun Apr 4 09:02:36 2021
From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 243703] fe80 route need to add manually and NDP doesn't populate itself
Date: Sun, 04 Apr 2021 09:02:36 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=243703

--- Comment #3 from Alexander V. Chernikov ---
I'm going to proceed with case closure on April 11 unless any feedback is
received.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

From owner-freebsd-net@freebsd.org Sun Apr 4 11:50:54 2021
From: "Scheffenegger, Richard" <Richard.Scheffenegger@netapp.com>
To: Rick Macklem, tuexen@freebsd.org
Cc: Youssef GHORBAL, freebsd-net@freebsd.org
Subject: Re: NFS Mount Hangs
Date: Sun, 4 Apr 2021 11:50:47 +0000

For what it's worth, SUSE found two bugs, in the Linux nf_conntrack
(stateful firewall) and in the pfifo-fast scheduler, which could conspire
to make TCP sessions hang forever.

One is a missed update when the client is not using the noresvport mount
option, which makes the firewall think RSTs are illegal (and drop them).

The pfifo-fast scheduler can run into an issue if only a single packet
should be forwarded (note that this is not the default scheduler, but it
is often recommended for performance, as it runs lockless and at lower CPU
cost than pfq, the default). If no other/additional packet pushes out that
last packet of a flow, the flow can become stuck forever...

I can try getting the relevant bug info next week...
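
For readers who want to check whether the nf_conntrack behaviour described
above is in play, here is a minimal sketch (not from Richard's report; it
assumes the conntrack-tools package on the Linux client, and uses NFS's
port 2049 as the filter):

# list tracked NFS sessions and the state conntrack has for them
conntrack -L -p tcp --dport 2049
# follow conntrack state-change events live while reproducing the hang
conntrack -E -p tcp --dport 2049
# if out-of-window RSTs are being flagged invalid and dropped, this
# sysctl relaxes conntrack's TCP window checking (experiment only)
sysctl -w net.netfilter.nf_conntrack_tcp_be_liberal=1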

________________________________
From: owner-freebsd-net@freebsd.org on behalf of Rick Macklem
Sent: Friday, April 2, 2021 11:31:01 PM
To: tuexen@freebsd.org
Cc: Youssef GHORBAL; freebsd-net@freebsd.org
Subject: Re: NFS Mount Hangs

tuexen@freebsd.org wrote:
>> On 2. Apr 2021, at 02:07, Rick Macklem wrote:
>>
>> I hope you don't mind a top post...
>> I've been testing network partitioning between the only Linux client
>> I have (5.2 kernel) and a FreeBSD server with the xprtdied.patch
>> (does soshutdown(..SHUT_WR) when it knows the socket is broken)
>> applied to it.
>>
>> I'm not enough of a TCP guy to know if this is useful, but here's what
>> I see...
>>
>> While partitioned:
>> On the FreeBSD server end, the socket either goes to CLOSED during
>> the network partition or stays ESTABLISHED.
> If it goes to CLOSED you called shutdown(, SHUT_WR) and the peer also
> sent a FIN, but you never called close() on the socket.
> If the socket stays in ESTABLISHED, there is no communication ongoing,
> I guess, and therefore the server does not even detect that the peer
> is not reachable.
>> On the Linux end, the socket seems to remain ESTABLISHED for a
>> little while, and then disappears.
> So how does Linux detect the peer is not reachable?
Well, here's what I see in a packet capture in the Linux client once I
partition it (just unplug the net cable):
- lots of retransmits of the same segment (with ACK) for 54sec
- then only ARP queries

Once I plug the net cable back in:
- ARP works
- one more retransmit of the same segment
- receives RST from FreeBSD

** So, is this now a "new" TCP connection, despite using the same port#?
--> It matters for NFS, since "new connection" implies "must retry all
    outstanding RPCs".
- sends SYN
- receives SYN, ACK from FreeBSD
--> connection starts working again
Always uses same port#.

On the FreeBSD server end:
- receives the last retransmit of the segment (with ACK)
- sends RST
- receives SYN
- sends SYN, ACK

I thought that there was no RST in the capture I looked at yesterday, so
I'm not sure if FreeBSD always sends an RST, but the Linux client
behaviour was the same. (Sent a SYN, etc.)
The socket disappears from the Linux "netstat -a" and I suspect that
happens after about 54sec, but I am not sure about the timing.

>> After unpartitioning:
>> On the FreeBSD server end, you get another socket showing up at
>> the same port#
>> Active Internet connections (including servers)
>> Proto Recv-Q Send-Q Local Address    Foreign Address    (state)
>> tcp4       0      0 nfsv4-new3.nfsd  nfsv4-linux.678    ESTABLISHED
>> tcp4       0      0 nfsv4-new3.nfsd  nfsv4-linux.678    CLOSED
>>
>> The Linux client shows the same connection ESTABLISHED.
But disappears from "netstat -a" for a while during the partitioning.
>> (The mount sometimes reports an error. I haven't looked at packet
>> traces to see if it retries RPCs or why the errors occur.)
I have now done so, as above.
>> --> However I never get hangs.
>> Sometimes it goes to SYN_SENT for a while and the FreeBSD server
>> shows FIN_WAIT_1, but then both ends go to ESTABLISHED and the
>> mount starts working again.
>>
>> The most obvious thing is that the Linux client always keeps using
>> the same port#. (The FreeBSD client will use a different port# when
>> it does a TCP reconnect after no response from the NFS server for
>> a little while.)
>>
>> What do those TCP conversant think?
> I guess you are never calling close() on the socket for which
> the connection state is CLOSED.
Ok, that makes sense. For this case the Linux client has not done a
BindConnectionToSession to re-assign the back channel.
I'll have to bug them about this. However, I'll bet they'll answer
that I have to tell them the back channel needs re-assignment
or something like that. I am pretty certain they are broken, in
that the client needs to retry all outstanding RPCs.

For others, here's the long winded version of this that I just put
on the phabricator review:
In the server side kernel RPC, the socket (struct socket *) is in a
structure called SVCXPRT (normally pointed to by "xprt").
These structures are ref counted and the soclose() is done
when the ref. cnt goes to zero. My understanding is that
"struct socket *" is free'd by soclose() so this cannot be done
before the xprt ref. cnt goes to zero.

For NFSv4.1/4.2 there is something called a back channel
which means that a "xprt" is used for server->client RPCs,
although the TCP connection is established by the client
to the server.
--> This back channel holds a ref cnt on "xprt" until the
    client re-assigns it to a different TCP connection
    via an operation called BindConnectionToSession
    and the Linux client is not doing this soon enough,
    it appears.

So, the soclose() is delayed, which is why I think the
TCP connection gets stuck in CLOSE_WAIT and that is
why I've added the soshutdown(..SHUT_WR) calls,
which can happen before the client gets around
to re-assigning the back channel.

Thanks for your help with this Michael, rick

Best regards
Michael

> rick
> ps: I can capture packets while doing this, if anyone has a use
> for them.
>
> ________________________________________
> From: owner-freebsd-net@freebsd.org on behalf of Youssef GHORBAL
> Sent: Saturday, March 27, 2021 6:57 PM
> To: Jason Breitman
> Cc: Rick Macklem; freebsd-net@freebsd.org
> Subject: Re: NFS Mount Hangs
>
> On 27 Mar 2021, at 13:20, Jason Breitman wrote:
>
> The issue happened again so we can say that disabling TSO and LRO on the
> NIC did not resolve this issue.
> # ifconfig lagg0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso
> # ifconfig lagg0
> lagg0: flags=8943 metric 0 mtu 1500
> options=8100b8
>
> We can also say that the sysctl settings did not resolve this issue.
>
> # sysctl net.inet.tcp.fast_finwait2_recycle=1
> net.inet.tcp.fast_finwait2_recycle: 0 -> 1
>
> # sysctl net.inet.tcp.finwait2_timeout=1000
> net.inet.tcp.finwait2_timeout: 60000 -> 1000
>
> I don't think those will do anything in your case since the FIN_WAIT2
> sockets are on the client side and those sysctls are for BSD.
> By the way, it seems that Linux recycles TCP sessions in FIN_WAIT2
> automatically after 60 seconds (sysctl net.ipv4.tcp_fin_timeout):
>
> tcp_fin_timeout (integer; default: 60; since Linux 2.2)
>     This specifies how many seconds to wait for a final FIN
>     packet before the socket is forcibly closed. This is
>     strictly a violation of the TCP specification, but
>     required to prevent denial-of-service attacks. In Linux
>     2.2, the default value was 180.
>
> So I don't get why it gets stuck in the FIN_WAIT2 state anyway.
>
> You really need to have a packet capture during the outage (client and
> server side) so you'll get the over-the-wire chat and can start
> speculating from there.
> No need to capture the beginning of the outage for now. All you have to
> do is run a tcpdump for 10 minutes or so when you notice a client stuck.
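
As a sketch of such a capture (the interface name and file path are
placeholders, not from the thread; the NFS.Server.IP.X notation follows
the netstat output quoted below):

# run on the NFS client (and similarly on the server) for ~10 minutes
# once a client is stuck; full packets, NFS TCP session only
tcpdump -i eth0 -s 0 -w /tmp/nfs-hang.pcap host NFS.Server.IP.X and port 2049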
> * I have not rebooted the NFS Server nor have I restarted nfsd, but do
> not believe that is required as these settings are at the TCP level and
> I would expect new sessions to use the updated settings.
>
> The issue occurred after 5 days following a reboot of the client
> machines. I ran the capture information again to make use of the
> situation.
>
> #!/bin/sh
>
> while true
> do
>     /bin/date >> /tmp/nfs-hang.log
>     /bin/ps axHl | grep nfsd | grep -v grep >> /tmp/nfs-hang.log
>     /usr/bin/procstat -kk 2947 >> /tmp/nfs-hang.log
>     /usr/bin/procstat -kk 2944 >> /tmp/nfs-hang.log
>     /bin/sleep 60
> done
>
> On the NFS Server
> Active Internet connections (including servers)
> Proto Recv-Q Send-Q Local Address         Foreign Address        (state)
> tcp4       0      0 NFS.Server.IP.X.2049  NFS.Client.IP.X.48286  CLOSE_WAIT
>
> On the NFS Client
> tcp        0      0 NFS.Client.IP.X:48286 NFS.Server.IP.X:2049   FIN_WAIT2
>
> You had also asked for the output below.
>
> # nfsstat -E -s
> BackChannelCt  BindConnToSes
>             0              0
>
> # sysctl vfs.nfsd.request_space_throttle_count
> vfs.nfsd.request_space_throttle_count: 0
>
> I see that you are testing a patch and I look forward to seeing the
> results.
>
> Jason Breitman
>
> On Mar 21, 2021, at 6:21 PM, Rick Macklem wrote:
>
> Youssef GHORBAL wrote:
>> Hi Jason,
>>
>>> On 17 Mar 2021, at 18:17, Jason Breitman wrote:
>>>
>>> Please review the details below and let me know if there is a setting
>>> that I should apply to my FreeBSD NFS Server or if there is a bug fix
>>> that I can apply to resolve my issue.
>>> I shared this information with the linux-nfs mailing list and they
>>> believe the issue is on the server side.
>>>
>>> Issue
>>> NFSv4 mounts periodically hang on the NFS Client.
>>>
>>> During this time, it is possible to manually mount from another NFS
>>> Server on the NFS Client having issues.
>>> Also, other NFS Clients are successfully mounting from the NFS Server
>>> in question.
>>> Rebooting the NFS Client appears to be the only solution.
>>
>> I had experienced a similar weird situation with periodically stuck
>> Linux NFS clients mounting Isilon NFS servers (Isilon is FreeBSD based,
>> but they seem to have their own nfsd).
> Yes, my understanding is that Isilon uses a proprietary user space nfsd
> and not the kernel based RPC and nfsd in FreeBSD.
>
>> We've had better luck and we did manage to have packet captures on both
>> sides during the issue. The gist of it goes as follows:
>>
>> - Data flows correctly between SERVER and the CLIENT.
>> - At some point SERVER starts decreasing its TCP Receive Window until
>>   it reaches 0.
>> - The client (eager to send data) can only ack data sent by SERVER.
>> - When SERVER was done sending data, the client starts sending TCP
>>   Window Probes hoping that the TCP Window opens again so it can flush
>>   its buffers.
>> - SERVER responds with a TCP Zero Window to those probes.
> Having the window size drop to zero is not necessarily incorrect.
> If the server is overloaded (has a backlog of NFS requests), it can stop
> doing soreceive() on the socket (so the socket rcv buffer can fill up
> and the TCP window closes). This results in "backpressure" to stop the
> NFS client from flooding the NFS server with requests.
> --> However, once the backlog is handled, the nfsd should start to
>     soreceive() again and this should cause the window to open back up.
> --> Maybe this is broken in the socket/TCP code. I quickly got lost in
>     tcp_output() when it decides what to do about the rcvwin.
>
>> - After 6 minutes (the NFS server default Idle timeout) SERVER
>>   gracefully closes the TCP connection, sending a FIN packet (and
>>   still a TCP Window of 0).
> This probably does not happen for Jason's case, since the 6 minute
> timeout is disabled when the TCP connection is assigned as a backchannel
> (most likely the case for NFSv4.1).
>
>> - CLIENT ACKs that FIN.
>> - SERVER goes into the FIN_WAIT_2 state.
>> - CLIENT closes its half of the socket and goes into the LAST_ACK
>>   state.
>> - FIN is never sent by the client since there is still data in its
>>   SendQ and the receiver's TCP Window is still 0. At this stage the
>>   client starts sending TCP Window Probes again and again, hoping that
>>   the server opens its TCP Window so it can flush its buffers and
>>   terminate its side of the socket.
>> - SERVER keeps responding with a TCP Zero Window to those probes.
>> => The last two steps go on and on for hours/days, freezing the NFS
>>    mount bound to that TCP session.
>>
>> If we had a situation where CLIENT was responsible for closing the TCP
>> Window (and initiating the TCP FIN first) and the server wanted to
>> send data, we'd end up in the same state as you, I think.
>>
>> We never found the root cause of why the SERVER decided to close the
>> TCP Window and no longer accept data; the fix on the Isilon side was
>> to recycle the FIN_WAIT_2 sockets more aggressively
>> (net.inet.tcp.fast_finwait2_recycle=1 &
>> net.inet.tcp.finwait2_timeout=5000). Once the socket is recycled, at
>> the next occurrence of a CLIENT TCP Window probe, SERVER sends an RST,
>> triggering the teardown of the session on the client side, a new TCP
>> handshake, etc., and traffic flows again (NFS starts responding).
>>
>> To avoid rebooting the client (and before the aggressive FIN_WAIT_2
>> recycling was implemented on the Isilon side) we added a check script
>> on the client that detects LAST_ACK sockets and, through an iptables
>> rule, enforces a TCP RST. Something like:
>> -A OUTPUT -p tcp -d $nfs_server_addr --sport $local_port -j REJECT
>> --reject-with tcp-reset
>> (the script removes this iptables rule as soon as the LAST_ACK
>> disappears)
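
A sketch of what such a check script could look like (hypothetical; the
ss parsing assumes IPv4 and a single stuck socket, and only the iptables
rule itself is from Youssef's description):

#!/bin/sh
# detect an NFS socket stuck in LAST-ACK and force an RST via iptables
nfs_server_addr=NFS.Server.IP.X   # placeholder, as elsewhere in the thread
while true; do
    # local port of the first LAST-ACK socket to the server, if any
    local_port=$(ss -tn state last-ack "dst $nfs_server_addr" |
        awk 'NR>1 { split($3, a, ":"); print a[2]; exit }')
    if [ -n "$local_port" ]; then
        iptables -A OUTPUT -p tcp -d "$nfs_server_addr" \
            --sport "$local_port" -j REJECT --reject-with tcp-reset
        sleep 5   # give the stack a chance to tear the session down
        iptables -D OUTPUT -p tcp -d "$nfs_server_addr" \
            --sport "$local_port" -j REJECT --reject-with tcp-reset
    fi
    sleep 60
done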
>>
>> The bottom line would be to have a packet capture during the outage
>> (client and/or server side); it will show you at least the shape of
>> the TCP exchange when NFS is stuck.
> Interesting story and good work w.r.t. sleuthing, Youssef, thanks.
>
> I looked at Jason's log and it shows everything is ok w.r.t the nfsd
> threads. (They're just waiting for RPC requests.)
> However, I do now think I know why the soclose() does not happen.
> When the TCP connection is assigned as a backchannel, that takes a
> reference cnt on the structure. This refcnt won't be released until the
> connection is replaced by a BindConnectionToSession operation from the
> client. But that won't happen until the client creates a new TCP
> connection.
> --> No refcnt release --> no refcnt of 0 --> no soclose().
>
> I've created the attached patch (completely different from the previous
> one) that adds soshutdown(SHUT_WR) calls in the three places where the
> TCP connection is going away. This seems to get it past CLOSE_WAIT
> without a soclose().
> --> I know you are not comfortable with patching your server, but I do
>     think this change will get the socket shutdown to complete.
>
> There are a couple more things you can check on the server...
> # nfsstat -E -s
> --> Look for the count under "BindConnToSes".
> --> If non-zero, backchannels have been assigned.
> # sysctl -a | fgrep request_space_throttle_count
> --> If non-zero, the server has been overloaded at some point.
>
> I think the attached patch might work around the problem.
> The code that should open up the receive window needs to be checked.
> I am also looking at enabling the 6 minute timeout when a backchannel
> is assigned.
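
In the same spirit as Jason's client-side logging script earlier in this
thread, those server-side checks could be wrapped in a small watch loop
(a sketch only; the log path and interval are arbitrary):

#!/bin/sh
# periodically record the socket states and counters this thread cares
# about on the FreeBSD server
while true
do
    /bin/date >> /tmp/nfs-server.log
    netstat -an -p tcp | egrep 'CLOSE_WAIT|FIN_WAIT_2' >> /tmp/nfs-server.log
    nfsstat -E -s | grep -A 1 BindConnToSes >> /tmp/nfs-server.log
    sysctl vfs.nfsd.request_space_throttle_count >> /tmp/nfs-server.log
    /bin/sleep 60
done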
>
> rick
>
> Youssef
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

_______________________________________________
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@freebsd.org Sun Apr 4 12:57:36 2021
From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 254303] Fatal trap 12: page fault while in kernel mode ((frr 7.5_1 + Freebsd 13 Beta3) zebra crashes server when routes are populated)
Date: Sun, 04 Apr 2021 12:57:35 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254303

--- Comment #23 from Aleks ---
(In reply to Alexander V. Chernikov from comment #22)
For me - yes. Thank you very much!

-- 
You are receiving this mail because:
You are on the CC list for the bug.

From owner-freebsd-net@freebsd.org Sun Apr 4 13:00:11 2021
From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 254303] Fatal trap 12: page fault while in kernel mode ((frr 7.5_1 + Freebsd 13 Beta3) zebra crashes server when routes are populated)
Date: Sun, 04 Apr 2021 13:00:11 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254303

Alexander V. Chernikov changed:

           What    |Removed        |Added
----------------------------------------------------------------------------
             Status|In Progress    |Closed
         Resolution|---            |FIXED

-- 
You are receiving this mail because:
You are on the CC list for the bug.

From owner-freebsd-net@freebsd.org Sun Apr 4 15:27:24 2021
From: Rick Macklem <rmacklem@uoguelph.ca>
To: "Scheffenegger, Richard", tuexen@freebsd.org
Cc: Youssef GHORBAL, freebsd-net@freebsd.org
Subject: Re: NFS Mount Hangs
Date: Sun, 4 Apr 2021 15:27:15 +0000
=?Windows-1252?Q?Uo2PHrPUQPgj3+iN4BYY6xRUghbFprjhbDEef4UAw4LxbGmERxVIFdBU?= =?Windows-1252?Q?HHuABBZsTe3rYtIYhfkqjR9vthTttWrc7eXtS/fZ3hWoH+e3N0py88ah?= =?Windows-1252?Q?ji0ZQ9J71BsOsEu5TmHrBgs/FFKqCz9G1potFvvToguEsN/w4OMeSLfk?= =?Windows-1252?Q?QDJBa1ZofF38bOnxhtn1w1aXHECZjO2kPUkd+nuOMiQxrZSb28G0C6C/?= =?Windows-1252?Q?AVyB74lyKDfBfeQ9TGMTKWMdsfFzlRFhnhwVPyD16mQnxmYOnTTV7Td8?= =?Windows-1252?Q?vVfSEYVYVZQTX7ievLczFZFMN81HGC+Q8w6MbZGodO8dAL5CvJ8ugka0?= =?Windows-1252?Q?uiwjNvPhV4s7ilhfFVq5DcKyYvHdupJYH3tOkHX43/4ytyoiJZ0QH0je?= =?Windows-1252?Q?t5eldNlnKy6EDiuwAlZIn5+XSv63+N3sHxKlCnWXb2DWZs0ykrfs/Aa/?= =?Windows-1252?Q?U8+0FaMd3zvS8begspsR0gNHp1RQEI9r+espUL0donUgWBMKhBsv3JUq?= =?Windows-1252?Q?Q1zH5kBbP+FNnlQ13XW36z1PxdYtcKeJ4ltqTi6lUeJcTp89xHqtd//W?= =?Windows-1252?Q?vpiwLrHBEq4J2Hn5+jNsCum7JeweOpHXTn2RowdXygFiuScJJDBrazR6?= =?Windows-1252?Q?znXtEf2Xe56uZ4jgjWiEJ4/YorNqs++be+1/n/qCRYS6nqrhegu2M2sA?= =?Windows-1252?Q?52Ot9fQ0N6qnRw/A+bI5Qz84VJeiXkjR1yk8GdDWJrd3lj80ytUbapaJ?= =?Windows-1252?Q?aNHoLPFtbdg9DOE3oRNzXwXkveONXCfe78VLBGmsaPF/uxxIzwnLF5TG?= =?Windows-1252?Q?c3QQizfFki6fBblx4aYronuprbF85HVno8vNdWSGEl21aBxJTFMiPCFT?= =?Windows-1252?Q?PtgSn8CeM0ZZQ8e+kSyTH66bgqIIW9xXNoGbNQFHlQ83I0xA2QDTE7vL?= =?Windows-1252?Q?aEpn4pu+NhDyY6izAQZpKnpjOYfuksNMQE9QvjSB1fAboq0t6tOxeRPu?= =?Windows-1252?Q?Ama7+2YVcgZTy4SIoc5kkbHVHsdv8luz934yrkxSN1BGkUpo+HpTLviL?= =?Windows-1252?Q?A5NHBW4Z9YU/rJf6KegEJIgiRI7YV0B9XgKfa27dMZiPkQyK2+4CcrXh?= =?Windows-1252?Q?p3RmhB98AhYrL4ViN/eOTZfSM9tovadb6KJHOqgJGEHOE/wJ9Hjx2tZw?= =?Windows-1252?Q?RsmbwTDwkdX6FkchNLgwGPFP9H8D78z0fuAuzSmnSLpC+jfwuuAZnMfm?= =?Windows-1252?Q?7TrrtPUQqlQNHHcJEenx5RPDKcB7ete0MBn1ZBbaVGiS+uG+9mFmmFs0?= =?Windows-1252?Q?AekpKs4aKV0WdHOTcNAJJ/OruUObJDH14I/KRzDsJiA=3D?= x-ms-exchange-transport-forked: True Content-Type: text/plain; charset="Windows-1252" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: uoguelph.ca X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM X-MS-Exchange-CrossTenant-Network-Message-Id: 99b9dc4f-60ab-4ab7-6382-08d8f77e20b5 X-MS-Exchange-CrossTenant-originalarrivaltime: 04 Apr 2021 15:27:15.8364 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: be62a12b-2cad-49a1-a5fa-85f4f3156a7d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: VpPXlLl531Dry+BTh6o4s81YJXWGxCjro9QEE/6C3jKyPnzMgloE9Aa5lPEfOVfoG1cKgxM3jcnhoxO0HamjCw== X-MS-Exchange-Transport-CrossTenantHeadersStamped: QB1PR01MB3122 X-Rspamd-Queue-Id: 4FCyMc0zqnz3LLq X-Spamd-Bar: ---- Authentication-Results: mx1.freebsd.org; none X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[] X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Apr 2021 15:27:24 -0000 Well, I'm going to cheat and top post, since this is elated info. and not really part of the discussion... I've been testing network partitioning between a Linux client (5.2 kernel) and a FreeBSD-current NFS server. I have not gotten a solid hang, but I have had the Linux client doing "battle" with the FreeBSD server for several minutes after un-partitioning the connection. The battle basically consists of the Linux client sending an RST, followed by a SYN. The FreeBSD server ignores the RST and just replies with the same old ack. 
--> This varies from "just a SYN" that succeeds to 100+ cycles of the above
over several minutes.

I had thought that an RST was a "pretty heavy hammer", but FreeBSD seems
pretty good at ignoring it.

A full packet capture of one of these is in /home/rmacklem/linuxtofreenfs.pcap
in case anyone wants to look at it.

Here's a tcpdump snippet of the interesting part (see the *** comments):
19:10:09.305775 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 202585:202749, ack 212293, win 29128, options [nop,nop,TS val 2073636037 ecr 2671204825], length 164: NFS reply xid 613153685 reply ok 160 getattr NON 4 ids 0/33554432 sz 0
19:10:09.305850 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 202749, win 501, options [nop,nop,TS val 2671204825 ecr 2073636037], length 0
*** Network is now partitioned...
19:10:09.407840 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671204927 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
19:10:09.615779 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205135 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
19:10:09.823780 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205343 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
*** Lots of lines snipped.
19:13:41.295783 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
19:13:42.319767 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
19:13:46.351966 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
19:13:47.375790 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
19:13:48.399786 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
*** Network is now unpartitioned...
19:13:48.399990 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
19:13:48.400002 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671421871 ecr 0,nop,wscale 7], length 0
19:13:48.400185 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073855137 ecr 2671204825], length 0
19:13:48.400273 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
19:13:49.423833 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671424943 ecr 0,nop,wscale 7], length 0
19:13:49.424056 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073856161 ecr 2671204825], length 0
*** This "battle" goes on for 223sec...
I snipped out 13 cycles of this "Linux sends an RST, followed by SYN"
"FreeBSD replies with same old ACK". In another test run I saw this
cycle continue non-stop for several minutes. This time, the Linux
client paused for a while (see ARPs below).
19:13:49.424101 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
19:13:53.455867 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671428975 ecr 0,nop,wscale 7], length 0
19:13:53.455991 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073860193 ecr 2671204825], length 0
*** Snipped a bunch of stuff out, mostly ARPs, plus one more RST.
19:16:57.775780 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
19:16:57.775937 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
19:16:57.980240 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
19:16:58.555663 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
19:17:00.104701 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074046846 ecr 2671204825], length 0
19:17:15.664354 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074062406 ecr 2671204825], length 0
19:17:31.239246 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [R.], seq 202750, ack 212293, win 0, options [nop,nop,TS val 2074077981 ecr 2671204825], length 0
*** FreeBSD finally acknowledges the RST 38sec after Linux sent the last
of 13 (100+ for another test run).
19:17:51.535979 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 4247692373, win 64240, options [mss 1460,sackOK,TS val 2671667055 ecr 0,nop,wscale 7], length 0
19:17:51.536130 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [S.], seq 661237469, ack 4247692374, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2074098278 ecr 2671667055], length 0
*** Now back in business...
19:17:51.536218 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 1, win 502, options [nop,nop,TS val 2671667055 ecr 2074098278], length 0
19:17:51.536295 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 1:233, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
19:17:51.536346 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 233:505, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 272: NFS request xid 697039765 132 getattr fh 0,1/53
19:17:51.536515 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 0
19:17:51.536553 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 505:641, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098279], length 136: NFS request xid 730594197 132 getattr fh 0,1/53
19:17:51.536562 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 1:49, ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 48: NFS reply xid 697039765 reply ok 44 getattr ERROR: unk 10063
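To pull just that exchange back out of a capture like the one above, a
one-line sketch (the pcap name is the one mentioned earlier; the flag
filter is standard tcpdump syntax, added here for illustration):

# show only the RST and SYN segments, without name resolution
tcpdump -n -r linuxtofreenfs.pcap 'tcp[tcpflags] & (tcp-rst|tcp-syn) != 0'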
This error 10063 after the partition heals is also "bad news". It indicates
the Session (which is supposed to maintain "exactly once" RPC semantics) is
broken. I'll admit I suspect a Linux client bug, but will be investigating
further.

So, hopefully TCP conversant folk can confirm if the above is correct
behaviour or if the RST should be ack'd sooner?

I could also see this becoming a "forever" TCP battle for other versions
of Linux client.

rick

________________________________________
From: Scheffenegger, Richard
Sent: Sunday, April 4, 2021 7:50 AM
To: Rick Macklem; tuexen@freebsd.org
Cc: Youssef GHORBAL; freebsd-net@freebsd.org
Subject: Re: NFS Mount Hangs

CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp@uoguelph.ca

For what it's worth, SUSE found two bugs in the Linux nf_conntrack (stateful firewall) and the pfifo-fast scheduler, which could conspire to make TCP sessions hang forever.

One is a missed update when the client is not using the noresvport mount option, which makes the firewall think RSTs are illegal (and drop them);

The fast scheduler can run into an issue if only a single packet should be forwarded (note that this is not the default scheduler, but often recommended for perf, as it runs lockless and at lower CPU cost than pfq (the default)). If no other/additional packet pushes out that last packet of a flow, it can become stuck forever...

I can try getting the relevant bug info next week...

________________________________
From: owner-freebsd-net@freebsd.org on behalf of Rick Macklem
Sent: Friday, April 2, 2021 11:31:01 PM
To: tuexen@freebsd.org
Cc: Youssef GHORBAL; freebsd-net@freebsd.org
Subject: Re: NFS Mount Hangs

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.

tuexen@freebsd.org wrote:
>> On 2. Apr 2021, at 02:07, Rick Macklem wrote:
>>
>> I hope you don't mind a top post...
>> I've been testing network partitioning between the only Linux client
>> I have (5.2 kernel) and a FreeBSD server with the xprtdied.patch
>> (does soshutdown(..SHUT_WR) when it knows the socket is broken)
>> applied to it.
>>
>> I'm not enough of a TCP guy to know if this is useful, but here's what
>> I see...
>>
>> While partitioned:
>> On the FreeBSD server end, the socket either goes to CLOSED during
>> the network partition or stays ESTABLISHED.
> If it goes to CLOSED you called shutdown(, SHUT_WR) and the peer also
> sent a FIN, but you never called close() on the socket.
> If the socket stays in ESTABLISHED, there is no communication ongoing,
> I guess, and therefore the server does not even detect that the peer
> is not reachable.
>> On the Linux end, the socket seems to remain ESTABLISHED for a
>> little while, and then disappears.
> So how does Linux detect the peer is not reachable?
Well, here's what I see in a packet capture in the Linux client once
I partition it (just unplug the net cable):
- lots of retransmits of the same segment (with ACK) for 54sec
- then only ARP queries

Once I plug the net cable back in:
- ARP works
- one more retransmit of the same segment
- receives RST from FreeBSD
** So, is this now a "new" TCP connection, despite
   using the same port#.
   --> It matters for NFS, since "new connection"
   implies "must retry all outstanding RPCs".
- sends SYN
- receives SYN, ACK from FreeBSD
--> connection starts working again
Always uses same port#.
On the FreeBSD server end:
- receives the last retransmit of the segment (with ACK)
- sends RST
- receives SYN
- sends SYN, ACK

I thought that there was no RST in the capture I looked at
yesterday, so I'm not sure if FreeBSD always sends an RST,
but the Linux client behaviour was the same. (Sent a SYN, etc).
The socket disappears from the Linux "netstat -a" and I
suspect that happens after about 54sec, but I am not sure
about the timing.

>>
>> After unpartitioning:
>> On the FreeBSD server end, you get another socket showing up at
>> the same port#
>> Active Internet connections (including servers)
>> Proto Recv-Q Send-Q Local Address          Foreign Address        (state)
>> tcp4       0      0 nfsv4-new3.nfsd        nfsv4-linux.678        ESTABLISHED
>> tcp4       0      0 nfsv4-new3.nfsd        nfsv4-linux.678        CLOSED
>>
>> The Linux client shows the same connection ESTABLISHED.
But disappears from "netstat -a" for a while during the partitioning.
>> (The mount sometimes reports an error. I haven't looked at packet
>> traces to see if it retries RPCs or why the errors occur.)
I have now done so, as above.
>> --> However I never get hangs.
>> Sometimes it goes to SYN_SENT for a while and the FreeBSD server
>> shows FIN_WAIT_1, but then both ends go to ESTABLISHED and the
>> mount starts working again.
>>
>> The most obvious thing is that the Linux client always keeps using
>> the same port#. (The FreeBSD client will use a different port# when
>> it does a TCP reconnect after no response from the NFS server for
>> a little while.)
>>
>> What do those TCP conversant think?
> I guess you are never calling close() on the socket, for which
> the connection state is CLOSED.
Ok, that makes sense. For this case the Linux client has not done a
BindConnectionToSession to re-assign the back channel.
I'll have to bug them about this. However, I'll bet they'll answer
that I have to tell them the back channel needs re-assignment
or something like that.

I am pretty certain they are broken, in that the client needs to
retry all outstanding RPCs.

For others, here's the long winded version of this that I just
put on the phabricator review:
In the server side kernel RPC, the socket (struct socket *) is in a
structure called SVCXPRT (normally pointed to by "xprt").
These structures are ref counted and the soclose() is done
when the ref. cnt goes to zero. My understanding is that
"struct socket *" is free'd by soclose() so this cannot be done
before the xprt ref. cnt goes to zero.

For NFSv4.1/4.2 there is something called a back channel
which means that an "xprt" is used for server->client RPCs,
although the TCP connection is established by the client
to the server.
--> This back channel holds a ref cnt on "xprt" until the
client re-assigns it to a different TCP connection
via an operation called BindConnectionToSession
and the Linux client is not doing this soon enough,
it appears.

So, the soclose() is delayed, which is why I think the
TCP connection gets stuck in CLOSE_WAIT and that is
why I've added the soshutdown(..SHUT_WR) calls,
which can happen before the client gets around to
re-assigning the back channel.

Thanks for your help with this Michael, rick

Best regards
Michael
>
> rick
> ps: I can capture packets while doing this, if anyone has a use
> for them.
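For anyone who wants to watch those server-side state transitions as they
happen, a small sketch in the style of the logging script quoted further
down (2049 is just the nfsd port; the log path is a placeholder, not
something from the original mails):

#!/bin/sh
# sketch: log the state of the nfsd TCP sessions once per second
while true
do
    date >> /tmp/nfsd-states.log
    netstat -an -p tcp | grep '\.2049 ' >> /tmp/nfsd-states.log
    sleep 1
done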
>
> ________________________________________
> From: owner-freebsd-net@freebsd.org on behalf of Youssef GHORBAL
> Sent: Saturday, March 27, 2021 6:57 PM
> To: Jason Breitman
> Cc: Rick Macklem; freebsd-net@freebsd.org
> Subject: Re: NFS Mount Hangs
>
> CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp@uoguelph.ca
>
> On 27 Mar 2021, at 13:20, Jason Breitman wrote:
>
> The issue happened again so we can say that disabling TSO and LRO on the NIC did not resolve this issue.
> # ifconfig lagg0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso
> # ifconfig lagg0
> lagg0: flags=8943 metric 0 mtu 1500
> options=8100b8
>
> We can also say that the sysctl settings did not resolve this issue.
>
> # sysctl net.inet.tcp.fast_finwait2_recycle=1
> net.inet.tcp.fast_finwait2_recycle: 0 -> 1
>
> # sysctl net.inet.tcp.finwait2_timeout=1000
> net.inet.tcp.finwait2_timeout: 60000 -> 1000
>
> I don't think those will do anything in your case since the FIN_WAIT2 are on the client side and those sysctls are for BSD.
> By the way it seems that Linux recycles automatically TCP sessions in FIN_WAIT2 after 60 seconds (sysctl net.ipv4.tcp_fin_timeout)
>
> tcp_fin_timeout (integer; default: 60; since Linux 2.2)
>        This specifies how many seconds to wait for a final FIN
>        packet before the socket is forcibly closed. This is
>        strictly a violation of the TCP specification, but
>        required to prevent denial-of-service attacks. In Linux
>        2.2, the default value was 180.
>
> So I don't get why it gets stuck in the FIN_WAIT2 state anyway.
>
> You really need to have a packet capture during the outage (client and server side) so you'll get the over-the-wire chat and start speculating from there.
> No need to capture the beginning of the outage for now. All you have to do is run a tcpdump for 10 minutes or so when you notice a client stuck.
>
> * I have not rebooted the NFS Server nor have I restarted nfsd, but do not believe that is required as these settings are at the TCP level and I would expect new sessions to use the updated settings.
>
> The issue occurred after 5 days following a reboot of the client machines.
> I ran the capture information again to make use of the situation.
>
> #!/bin/sh
>
> while true
> do
> /bin/date >> /tmp/nfs-hang.log
> /bin/ps axHl | grep nfsd | grep -v grep >> /tmp/nfs-hang.log
> /usr/bin/procstat -kk 2947 >> /tmp/nfs-hang.log
> /usr/bin/procstat -kk 2944 >> /tmp/nfs-hang.log
> /bin/sleep 60
> done
>
> On the NFS Server
> Active Internet connections (including servers)
> Proto Recv-Q Send-Q Local Address          Foreign Address        (state)
> tcp4       0      0 NFS.Server.IP.X.2049   NFS.Client.IP.X.48286  CLOSE_WAIT
>
> On the NFS Client
> tcp        0      0 NFS.Client.IP.X:48286  NFS.Server.IP.X:2049   FIN_WAIT2
>
> You had also asked for the output below.
>
> # nfsstat -E -s
> BackChannelCtBindConnToSes
>             0            0
>
> # sysctl vfs.nfsd.request_space_throttle_count
> vfs.nfsd.request_space_throttle_count: 0
>
> I see that you are testing a patch and I look forward to seeing the results.
>
> Jason Breitman
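If those two sysctls had helped, the usual way to make them survive a
reboot on the FreeBSD side would be /etc/sysctl.conf entries like the
following (a sketch using the values tried above, not something from
Jason's mail):

# /etc/sysctl.conf
net.inet.tcp.fast_finwait2_recycle=1
net.inet.tcp.finwait2_timeout=1000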
>
> On Mar 21, 2021, at 6:21 PM, Rick Macklem wrote:
>
> Youssef GHORBAL wrote:
>> Hi Jason,
>>
>>> On 17 Mar 2021, at 18:17, Jason Breitman wrote:
>>>
>>> Please review the details below and let me know if there is a setting that I should apply to my FreeBSD NFS Server or if there is a bug fix that I can apply to resolve my issue.
>>> I shared this information with the linux-nfs mailing list and they believe the issue is on the server side.
>>>
>>> Issue
>>> NFSv4 mounts periodically hang on the NFS Client.
>>>
>>> During this time, it is possible to manually mount from another NFS Server on the NFS Client having issues.
>>> Also, other NFS Clients are successfully mounting from the NFS Server in question.
>>> Rebooting the NFS Client appears to be the only solution.
>>
>> I had experienced a similar weird situation with periodically stuck Linux NFS clients mounting Isilon NFS servers (Isilon is FreeBSD based but they seem to have their own nfsd)
> Yes, my understanding is that Isilon uses a proprietary user space nfsd and
> not the kernel based RPC and nfsd in FreeBSD.
>
>> We've had better luck and we did manage to have packet captures on both sides during the issue. The gist of it goes like follows:
>>
>> - Data flows correctly between SERVER and the CLIENT
>> - At some point SERVER starts decreasing its TCP Receive Window until it reaches 0
>> - The client (eager to send data) can only ack data sent by SERVER.
>> - When SERVER was done sending data, the client starts sending TCP Window Probes hoping that the TCP Window opens again so he can flush its buffers.
>> - SERVER responds with a TCP Zero Window to those probes.
> Having the window size drop to zero is not necessarily incorrect.
> If the server is overloaded (has a backlog of NFS requests), it can stop doing
> soreceive() on the socket (so the socket rcv buffer can fill up and the TCP window
> closes). This results in "backpressure" to stop the NFS client from flooding the
> NFS server with requests.
> --> However, once the backlog is handled, the nfsd should start to soreceive()
> again and this should cause the window to open back up.
> --> Maybe this is broken in the socket/TCP code. I quickly got lost in
> tcp_output() when it decides what to do about the rcvwin.
>
>> - After 6 minutes (the NFS server default Idle timeout) SERVER gracefully closes the TCP connection sending a FIN Packet (and still a TCP Window 0)
> This probably does not happen for Jason's case, since the 6minute timeout
> is disabled when the TCP connection is assigned as a backchannel (most likely
> the case for NFSv4.1).
>
>> - CLIENT ACKs that FIN.
>> - SERVER goes in FIN_WAIT_2 state
>> - CLIENT closes its half part of the socket and goes in LAST_ACK state.
>> - FIN is never sent by the client since there is still data in its SendQ and the receiver TCP Window is still 0. At this stage the client starts sending TCP Window Probes again and again, hoping that the server opens its TCP Window so it can flush its buffers and terminate its side of the socket.
>> - SERVER keeps responding with a TCP Zero Window to those probes.
>> => The last two steps go on and on for hours/days, freezing the NFS mount bound to that TCP session.
>>
>> If we had a situation where CLIENT was responsible for closing the TCP Window (and initiating the TCP FIN first) and the server wanting to send data, we'll end up in the same state as you, I think.
>>
>> We've never had the root cause of why the SERVER decided to close the TCP Window and no longer accept data; the fix on the Isilon part was to recycle more aggressively the FIN_WAIT_2 sockets (net.inet.tcp.fast_finwait2_recycle=1 & net.inet.tcp.finwait2_timeout=5000). Once the socket is recycled, at the next occurrence of a CLIENT TCP Window probe SERVER sends a RST, triggering the teardown of the session on the client side, a new TCP handshake, etc. and traffic flows again (NFS starts responding)
>>
>> To avoid rebooting the client (and before the aggressive FIN_WAIT_2 was implemented on the Isilon side) we've added a check script on the client that detects LAST_ACK sockets on the client and through an iptables rule enforces a TCP RST, something like: -A OUTPUT -p tcp -d $nfs_server_addr --sport $local_port -j REJECT --reject-with tcp-reset (the script removes this iptables rule as soon as the LAST_ACK disappears)
>>
>> The bottom line would be to have a packet capture during the outage (client and/or server side), it will show you at least the shape of the TCP exchange when NFS is stuck.
> Interesting story and good work w.r.t. sleuthing, Youssef, thanks.
>
> I looked at Jason's log and it shows everything is ok w.r.t the nfsd threads.
> (They're just waiting for RPC requests.)
> However, I do now think I know why the soclose() does not happen.
> When the TCP connection is assigned as a backchannel, that takes a reference
> cnt on the structure. This refcnt won't be released until the connection is
> replaced by a BindConnectionToSession operation from the client. But that won't
> happen until the client creates a new TCP connection.
> --> No refcnt release-->no refcnt of 0-->no soclose().
>
> I've created the attached patch (completely different from the previous one)
> that adds soshutdown(SHUT_WR) calls in the three places where the TCP
> connection is going away. This seems to get it past CLOSE_WAIT without a
> soclose().
> --> I know you are not comfortable with patching your server, but I do think
> this change will get the socket shutdown to complete.
>
> There are a couple more things you can check on the server...
> # nfsstat -E -s
> --> Look for the count under "BindConnToSes".
> --> If non-zero, backchannels have been assigned
> # sysctl -a | fgrep request_space_throttle_count
> --> If non-zero, the server has been overloaded at some point.
>
> I think the attached patch might work around the problem.
> The code that should open up the receive window needs to be checked.
> I am also looking at enabling the 6minute timeout when a backchannel is
> assigned.
>
> rick
>
> Youssef
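A rough sh sketch of the kind of client-side check script Youssef
describes (Linux side; the server address is a placeholder, the iptables
rule is the one quoted above, and ss(8) is used to spot LAST_ACK sockets;
an illustration only, not the original script):

#!/bin/sh
nfs_server_addr=192.0.2.10    # placeholder
while sleep 10
do
    # local ports of sockets stuck in LAST_ACK towards the server
    for local_port in $(ss -tn state last-ack dst "$nfs_server_addr" |
            awk 'NR > 1 { split($3, a, ":"); print a[2] }')
    do
        # provoke an RST so the client tears the session down
        iptables -A OUTPUT -p tcp -d "$nfs_server_addr" \
            --sport "$local_port" -j REJECT --reject-with tcp-reset
        sleep 5
        # remove the rule again, as the script Youssef describes does
        iptables -D OUTPUT -p tcp -d "$nfs_server_addr" \
            --sport "$local_port" -j REJECT --reject-with tcp-reset
    done
done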
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://urldefense.com/v3/__https://lists.freebsd.org/mailman/listinfo/freebsd-net__;!!JFdNOqOXpB6UZW0!_c2MFNbir59GXudWPVdE5bNBm-qqjXeBuJ2UEmFv5OZciLj4ObR_drJNv5yryaERfIbhKR2d$
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

_______________________________________________
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@freebsd.org Sun Apr 4 16:41:58 2021
From: tuexen@freebsd.org
To: Rick Macklem
Cc: "Scheffenegger, Richard", Youssef GHORBAL, "freebsd-net@freebsd.org"
Subject: Re: NFS Mount Hangs
Date: Sun, 4 Apr 2021 18:41:46 +0200
Message-Id: <765CE1CD-6AAB-4BEF-97C6-C2A1F0FF4AC5@freebsd.org>
References: <3750001D-3F1C-4D9A-A9D9-98BCA6CA65A4@tildenparkcapital.com> <33693DE3-7FF8-4FAB-9A75-75576B88A566@tildenparkcapital.com> <8E745920-1092-4312-B251-B49D11FE8028@pasteur.fr>
List-Id: Networking and TCP/IP with FreeBSD

> On 4. Apr 2021, at 17:27, Rick Macklem wrote:
>
> Well, I'm going to cheat and top post, since this is related info and
> not really part of the discussion...
>
> I've been testing network partitioning between a Linux client (5.2 kernel)
> and a FreeBSD-current NFS server.
> I have not gotten a solid hang, but
> I have had the Linux client doing "battle" with the FreeBSD server for
> several minutes after un-partitioning the connection.
>
> The battle basically consists of the Linux client sending an RST, followed
> by a SYN.
> The FreeBSD server ignores the RST and just replies with the same old ack.
> --> This varies from "just a SYN" that succeeds to 100+ cycles of the above
> over several minutes.
>
> I had thought that an RST was a "pretty heavy hammer", but FreeBSD seems
> pretty good at ignoring it.
>
> A full packet capture of one of these is in /home/rmacklem/linuxtofreenfs.pcap
> in case anyone wants to look at it.
On freefall? I would like to take a look at it...

Best regards
Michael

> [...]
From owner-freebsd-net@freebsd.org Sun Apr 4 17:27:33 2021
From: "Rodney W. Grimes"
Grimes" Message-Id: <202104041727.134HRTbA097115@gndrsh.dnsmgr.net> Subject: Re: NFS Mount Hangs In-Reply-To: To: Rick Macklem Date: Sun, 4 Apr 2021 10:27:28 -0700 (PDT) CC: "Scheffenegger, Richard" , "tuexen@freebsd.org" , Youssef GHORBAL , "freebsd-net@freebsd.org" X-Mailer: ELM [version 2.4ME+ PL121h (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII X-Rspamd-Queue-Id: 4FD12D4vMwz3hgM X-Spamd-Bar: ---- Authentication-Results: mx1.freebsd.org; none X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[] X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Apr 2021 17:27:33 -0000 And I'll follow the lead to top post, as I have been quietly following this thread, trying to only add when I think I have relevant input, and I think I do on a small point... Rick, Your "unplugging" a cable to simulate network partitioning, in my experience this is a bad way to do that, as the host gets to see the link layer go down and knows it can not send. I am actually puzzled that you see arp packets, but I guess those are getting picked off before the interface layer silently tosses them on the ground. IIRC due to this loss of link layer you may be masking some things that would occur in other situations as often an error is returned to the application layer. IE the ONLY packet your likely to see into an unplugged cable is "arp". I can suggest other means to partition, such as configuring a switch port in and out of the correct LAN/VLAN, a physical switch in the TX pair to open it, but leave RX pair intact so carrier is not lost. Both of these simulate partitioning that is more realistic, AND does not have the side effect of allowing upper layers to eat the packets before bpf can grab them, or be told that partitioning has occured. Another side effect of unplugging a cable is that a host should immediately invalidate all ARP entries on that interface... hence your getting into an arp who has situation that should not even start for 5 minutes in the other failure modes. Regards, Rod > Well, I'm going to cheat and top post, since this is elated info. and > not really part of the discussion... > > I've been testing network partitioning between a Linux client (5.2 kernel) > and a FreeBSD-current NFS server. I have not gotten a solid hang, but > I have had the Linux client doing "battle" with the FreeBSD server for > several minutes after un-partitioning the connection. > > The battle basically consists of the Linux client sending an RST, followed > by a SYN. > The FreeBSD server ignores the RST and just replies with the same old ack. > --> This varies from "just a SYN" that succeeds to 100+ cycles of the above > over several minutes. > > I had thought that an RST was a "pretty heavy hammer", but FreeBSD seems > pretty good at ignoring it. > > A full packet capture of one of these is in /home/rmacklem/linuxtofreenfs.pcap > in case anyone wants to look at it. 
>
> Here's a tcpdump snippet of the interesting part (see the *** comments):
> 19:10:09.305775 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 202585:202749, ack 212293, win 29128, options [nop,nop,TS val 2073636037 ecr 2671204825], length 164: NFS reply xid 613153685 reply ok 160 getattr NON 4 ids 0/33554432 sz 0
> 19:10:09.305850 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 202749, win 501, options [nop,nop,TS val 2671204825 ecr 2073636037], length 0
> *** Network is now partitioned...
>
> 19:10:09.407840 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671204927 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
> 19:10:09.615779 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205135 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
> 19:10:09.823780 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205343 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
> *** Lots of lines snipped.
>
>
> 19:13:41.295783 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
> 19:13:42.319767 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
> 19:13:46.351966 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
> 19:13:47.375790 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
> 19:13:48.399786 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
> *** Network is now unpartitioned...
>
> 19:13:48.399990 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
> 19:13:48.400002 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671421871 ecr 0,nop,wscale 7], length 0
> 19:13:48.400185 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073855137 ecr 2671204825], length 0
> 19:13:48.400273 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
> 19:13:49.423833 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671424943 ecr 0,nop,wscale 7], length 0
> 19:13:49.424056 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073856161 ecr 2671204825], length 0
> *** This "battle" goes on for 223sec...
> I snipped out 13 cycles of this "Linux sends an RST, followed by SYN"
> "FreeBSD replies with same old ACK". In another test run I saw this
> cycle continue non-stop for several minutes. This time, the Linux
> client paused for a while (see ARPs below).
>
> 19:13:49.424101 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
> 19:13:53.455867 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671428975 ecr 0,nop,wscale 7], length 0
> 19:13:53.455991 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073860193 ecr 2671204825], length 0
> *** Snipped a bunch of stuff out, mostly ARPs, plus one more RST.
>
> 19:16:57.775780 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
> 19:16:57.775937 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
> 19:16:57.980240 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
> 19:16:58.555663 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
> 19:17:00.104701 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074046846 ecr 2671204825], length 0
> 19:17:15.664354 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074062406 ecr 2671204825], length 0
> 19:17:31.239246 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [R.], seq 202750, ack 212293, win 0, options [nop,nop,TS val 2074077981 ecr 2671204825], length 0
> *** FreeBSD finally acknowledges the RST 38sec after Linux sent the last
> of 13 (100+ for another test run).
>
> 19:17:51.535979 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 4247692373, win 64240, options [mss 1460,sackOK,TS val 2671667055 ecr 0,nop,wscale 7], length 0
> 19:17:51.536130 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [S.], seq 661237469, ack 4247692374, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2074098278 ecr 2671667055], length 0
> *** Now back in business...
>
> 19:17:51.536218 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 1, win 502, options [nop,nop,TS val 2671667055 ecr 2074098278], length 0
> 19:17:51.536295 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 1:233, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
> 19:17:51.536346 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 233:505, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 272: NFS request xid 697039765 132 getattr fh 0,1/53
> 19:17:51.536515 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 0
> 19:17:51.536553 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 505:641, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098279], length 136: NFS request xid 730594197 132 getattr fh 0,1/53
> 19:17:51.536562 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 1:49, ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 48: NFS reply xid 697039765 reply ok 44 getattr ERROR: unk 10063
>
> This error 10063 after the partition heals is also "bad news". It indicates the Session
> (which is supposed to maintain "exactly once" RPC semantics) is broken.
I'll admit I
> suspect a Linux client bug, but will be investigating further.
>
> So, hopefully TCP conversant folk can confirm if the above is correct behaviour
> or if the RST should be ack'd sooner?
>
> I could also see this becoming a "forever" TCP battle for other versions of Linux client.
>
> rick
>
>
> ________________________________________
> From: Scheffenegger, Richard
> Sent: Sunday, April 4, 2021 7:50 AM
> To: Rick Macklem; tuexen@freebsd.org
> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
> Subject: Re: NFS Mount Hangs
>
> CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp@uoguelph.ca
>
>
> For what it's worth, SUSE found two bugs in the Linux nf_conntrack (stateful firewall) and the pfifo-fast scheduler, which could conspire to make TCP sessions hang forever.
>
> One is a missed update when the client is not using the noresvport mount option, which makes the firewall think RSTs are illegal (and drop them);
>
> The pfifo-fast scheduler can run into an issue if only a single packet should be forwarded (note that this is not the default scheduler, but often recommended for perf, as it runs lockless and at lower cpu cost than pfq, the default). If no other/additional packet pushes out that last packet of a flow, it can become stuck forever...
>
> I can try getting the relevant bug info next week...
>
> ________________________________
> From: owner-freebsd-net@freebsd.org on behalf of Rick Macklem
> Sent: Friday, April 2, 2021 11:31:01 PM
> To: tuexen@freebsd.org
> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
> Subject: Re: NFS Mount Hangs
>
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
> tuexen@freebsd.org wrote:
> >> On 2. Apr 2021, at 02:07, Rick Macklem wrote:
> >>
> >> I hope you don't mind a top post...
> >> I've been testing network partitioning between the only Linux client
> >> I have (5.2 kernel) and a FreeBSD server with the xprtdied.patch
> >> (does soshutdown(..SHUT_WR) when it knows the socket is broken)
> >> applied to it.
> >>
> >> I'm not enough of a TCP guy to know if this is useful, but here's what
> >> I see...
> >>
> >> While partitioned:
> >> On the FreeBSD server end, the socket either goes to CLOSED during
> >> the network partition or stays ESTABLISHED.
> >If it goes to CLOSED you called shutdown(, SHUT_WR) and the peer also
> >sent a FIN, but you never called close() on the socket.
> >If the socket stays in ESTABLISHED, there is no communication ongoing,
> >I guess, and therefore the server does not even detect that the peer
> >is not reachable.
> >> On the Linux end, the socket seems to remain ESTABLISHED for a
> >> little while, and then disappears.
> >So how does Linux detect the peer is not reachable?
> Well, here's what I see in a packet capture in the Linux client once
> I partition it (just unplug the net cable):
> - lots of retransmits of the same segment (with ACK) for 54sec
> - then only ARP queries
>
> Once I plug the net cable back in:
> - ARP works
> - one more retransmit of the same segment
> - receives RST from FreeBSD
> ** So, is this now a "new" TCP connection, despite
> using the same port#.
> --> It matters for NFS, since "new connection"
> implies "must retry all outstanding RPCs".
> - sends SYN
> - receives SYN, ACK from FreeBSD
> --> connection starts working again
> Always uses same port#.
>
> On the FreeBSD server end:
> - receives the last retransmit of the segment (with ACK)
> - sends RST
> - receives SYN
> - sends SYN, ACK
>
> I thought that there was no RST in the capture I looked at
> yesterday, so I'm not sure if FreeBSD always sends an RST,
> but the Linux client behaviour was the same. (Sent a SYN, etc).
> The socket disappears from the Linux "netstat -a" and I
> suspect that happens after about 54sec, but I am not sure
> about the timing.
>
> >>
> >> After unpartitioning:
> >> On the FreeBSD server end, you get another socket showing up at
> >> the same port#
> >> Active Internet connections (including servers)
> >> Proto Recv-Q Send-Q Local Address        Foreign Address      (state)
> >> tcp4       0      0 nfsv4-new3.nfsd      nfsv4-linux.678      ESTABLISHED
> >> tcp4       0      0 nfsv4-new3.nfsd      nfsv4-linux.678      CLOSED
> >>
> >> The Linux client shows the same connection ESTABLISHED.
> But it disappears from "netstat -a" for a while during the partitioning.
>
> >> (The mount sometimes reports an error. I haven't looked at packet
> >> traces to see if it retries RPCs or why the errors occur.)
> I have now done so, as above.
>
> >> --> However I never get hangs.
> >> Sometimes it goes to SYN_SENT for a while and the FreeBSD server
> >> shows FIN_WAIT_1, but then both ends go to ESTABLISHED and the
> >> mount starts working again.
> >>
> >> The most obvious thing is that the Linux client always keeps using
> >> the same port#. (The FreeBSD client will use a different port# when
> >> it does a TCP reconnect after no response from the NFS server for
> >> a little while.)
> >>
> >> What do those TCP conversant think?
> >I guess you are never calling close() on the socket for which
> >the connection state is CLOSED.
> Ok, that makes sense. For this case the Linux client has not done a
> BindConnectionToSession to re-assign the back channel.
> I'll have to bug them about this. However, I'll bet they'll answer
> that I have to tell them the back channel needs re-assignment
> or something like that.
>
> I am pretty certain they are broken, in that the client needs to
> retry all outstanding RPCs.
>
> For others, here's the long winded version of this that I just
> put on the phabricator review:
> In the server side kernel RPC, the socket (struct socket *) is in a
> structure called SVCXPRT (normally pointed to by "xprt").
> These structures are ref counted and the soclose() is done
> when the ref. cnt goes to zero. My understanding is that
> "struct socket *" is free'd by soclose() so this cannot be done
> before the xprt ref. cnt goes to zero.
>
> For NFSv4.1/4.2 there is something called a back channel
> which means that an "xprt" is used for server->client RPCs,
> although the TCP connection is established by the client
> to the server.
> --> This back channel holds a ref cnt on "xprt" until the
>     client re-assigns it to a different TCP connection
>     via an operation called BindConnectionToSession,
>     and the Linux client is not doing this soon enough,
>     it appears.
>
> So, the soclose() is delayed, which is why I think the
> TCP connection gets stuck in CLOSE_WAIT and that is
> why I've added the soshutdown(..SHUT_WR) calls,
> which can happen before the client gets around to
> re-assigning the back channel.
>
> Thanks for your help with this Michael, rick
>
> Best regards
> Michael
>
> > rick
> > ps: I can capture packets while doing this, if anyone has a use
> > for them.
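For watching this reconnect battle live rather than from a saved capture, a
one-liner along these lines should do; the interface name, the hostname, and
the zero-window clause are illustrative assumptions:

# Show SYN/RST/FIN segments and zero-window advertisements on the NFS port.
# tcp[14:2] is the raw TCP window field of the header (no scaling applied).
tcpdump -n -i em0 'host nfsv4-linux.home.rick and tcp port 2049 and
    (tcp[tcpflags] & (tcp-syn|tcp-rst|tcp-fin) != 0 or tcp[14:2] = 0)'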
> >
> >
> >
> >
> >
> >
> >
> > ________________________________________
> > From: owner-freebsd-net@freebsd.org on behalf of Youssef GHORBAL
> > Sent: Saturday, March 27, 2021 6:57 PM
> > To: Jason Breitman
> > Cc: Rick Macklem; freebsd-net@freebsd.org
> > Subject: Re: NFS Mount Hangs
> >
> > CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp@uoguelph.ca
> >
> >
> > On 27 Mar 2021, at 13:20, Jason Breitman wrote:
> >
> > The issue happened again so we can say that disabling TSO and LRO on the NIC did not resolve this issue.
> > # ifconfig lagg0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso
> > # ifconfig lagg0
> > lagg0: flags=8943 metric 0 mtu 1500
> > options=8100b8
> >
> > We can also say that the sysctl settings did not resolve this issue.
> >
> > # sysctl net.inet.tcp.fast_finwait2_recycle=1
> > net.inet.tcp.fast_finwait2_recycle: 0 -> 1
> >
> > # sysctl net.inet.tcp.finwait2_timeout=1000
> > net.inet.tcp.finwait2_timeout: 60000 -> 1000
> >
> > I don't think those will do anything in your case since the FIN_WAIT2 are on the client side and those sysctls are for BSD.
> > By the way it seems that Linux recycles automatically TCP sessions in FIN_WAIT2 after 60 seconds (sysctl net.ipv4.tcp_fin_timeout)
> >
> > tcp_fin_timeout (integer; default: 60; since Linux 2.2)
> >     This specifies how many seconds to wait for a final FIN
> >     packet before the socket is forcibly closed. This is
> >     strictly a violation of the TCP specification, but
> >     required to prevent denial-of-service attacks. In Linux
> >     2.2, the default value was 180.
> >
> > So I don't get why it's stuck in the FIN_WAIT2 state anyway.
> >
> > You really need to have a packet capture during the outage (client and server side) so you'll get the over-the-wire chat and can start speculating from there.
> > No need to capture the beginning of the outage for now. All you have to do is run a tcpdump for 10 minutes or so when you notice a client stuck.
> >
> > * I have not rebooted the NFS Server nor have I restarted nfsd, but do not believe that is required as these settings are at the TCP level and I would expect new sessions to use the updated settings.
> >
> > The issue occurred after 5 days following a reboot of the client machines.
> > I ran the capture information again to make use of the situation.
> >
> > #!/bin/sh
> >
> > while true
> > do
> > /bin/date >> /tmp/nfs-hang.log
> > /bin/ps axHl | grep nfsd | grep -v grep >> /tmp/nfs-hang.log
> > /usr/bin/procstat -kk 2947 >> /tmp/nfs-hang.log
> > /usr/bin/procstat -kk 2944 >> /tmp/nfs-hang.log
> > /bin/sleep 60
> > done
> >
> >
> > On the NFS Server
> > Active Internet connections (including servers)
> > Proto Recv-Q Send-Q Local Address          Foreign Address        (state)
> > tcp4       0      0 NFS.Server.IP.X.2049   NFS.Client.IP.X.48286  CLOSE_WAIT
> >
> > On the NFS Client
> > tcp        0      0 NFS.Client.IP.X:48286  NFS.Server.IP.X:2049   FIN_WAIT2
> >
> >
> >
> > You had also asked for the output below.
> >
> > # nfsstat -E -s
> > BackChannelCt  BindConnToSes
> >             0              0
> >
> > # sysctl vfs.nfsd.request_space_throttle_count
> > vfs.nfsd.request_space_throttle_count: 0
> >
> > I see that you are testing a patch and I look forward to seeing the results.
> >
> >
> > Jason Breitman
> >
> >
> > On Mar 21, 2021, at 6:21 PM, Rick Macklem wrote:
> >
> > Youssef GHORBAL wrote:
> >> Hi Jason,
> >>
> >>> On 17 Mar 2021, at 18:17, Jason Breitman wrote:
> >>>
> >>> Please review the details below and let me know if there is a setting that I should apply to my FreeBSD NFS Server or if there is a bug fix that I can apply to resolve my issue.
> >>> I shared this information with the linux-nfs mailing list and they believe the issue is on the server side.
> >>>
> >>> Issue
> >>> NFSv4 mounts periodically hang on the NFS Client.
> >>>
> >>> During this time, it is possible to manually mount from another NFS Server on the NFS Client having issues.
> >>> Also, other NFS Clients are successfully mounting from the NFS Server in question.
> >>> Rebooting the NFS Client appears to be the only solution.
> >>
> >> I had experienced a similar weird situation with periodically stuck Linux NFS clients mounting Isilon NFS servers (Isilon is FreeBSD based but they seem to have their own nfsd)
> > Yes, my understanding is that Isilon uses a proprietary user space nfsd and
> > not the kernel based RPC and nfsd in FreeBSD.
> >
> >> We've had better luck and we did manage to have packet captures on both sides during the issue. The gist of it goes as follows:
> >>
> >> - Data flows correctly between SERVER and the CLIENT
> >> - At some point SERVER starts decreasing its TCP Receive Window until it reaches 0
> >> - The client (eager to send data) can only ack data sent by SERVER.
> >> - When SERVER was done sending data, the client starts sending TCP Window Probes hoping that the TCP Window opens again so it can flush its buffers.
> >> - SERVER responds with a TCP Zero Window to those probes.
> > Having the window size drop to zero is not necessarily incorrect.
> > If the server is overloaded (has a backlog of NFS requests), it can stop doing
> > soreceive() on the socket (so the socket rcv buffer can fill up and the TCP window
> > closes). This results in "backpressure" to stop the NFS client from flooding the
> > NFS server with requests.
> > --> However, once the backlog is handled, the nfsd should start to soreceive()
> > again and this should cause the window to open back up.
> > --> Maybe this is broken in the socket/TCP code. I quickly got lost in
> > tcp_output() when it decides what to do about the rcvwin.
> >
> >> - After 6 minutes (the NFS server default Idle timeout) SERVER gracefully closes the TCP connection, sending a FIN Packet (and still a TCP Window 0)
> > This probably does not happen for Jason's case, since the 6minute timeout
> > is disabled when the TCP connection is assigned as a backchannel (most likely
> > the case for NFSv4.1).
> >
> >> - CLIENT ACKs that FIN.
> >> - SERVER goes into FIN_WAIT_2 state
> >> - CLIENT closes its half of the socket and goes into LAST_ACK state.
> >> - FIN is never sent by the client since there is still data in its SendQ and the receiver's TCP Window is still 0. At this stage the client starts sending TCP Window Probes again and again, hoping that the server opens its TCP Window so it can flush its buffers and terminate its side of the socket.
> >> - SERVER keeps responding with a TCP Zero Window to those probes.
> >> => The last two steps go on and on for hours/days, freezing the NFS mount bound to that TCP session.
> >>
> >> If we had a situation where CLIENT was responsible for closing the TCP Window (and initiating the TCP FIN first) and the server wanting to send data, we'd end up in the same state as you, I think.
> >>
> >> We've never found the root cause of why the SERVER decided to close the TCP Window and no longer accept data; the fix on the Isilon side was to recycle the FIN_WAIT_2 sockets more aggressively (net.inet.tcp.fast_finwait2_recycle=1 & net.inet.tcp.finwait2_timeout=5000). Once the socket is recycled, at the next occurrence of a CLIENT TCP Window probe, SERVER sends a RST, triggering the teardown of the session on the client side, a new TCP handshake, etc., and traffic flows again (NFS starts responding)
> >>
> >> To avoid rebooting the client (and before the aggressive FIN_WAIT_2 recycling was implemented on the Isilon side) we added a check script on the client that detects LAST_ACK sockets on the client and, through an iptables rule, enforces a TCP RST. Something like: -A OUTPUT -p tcp -d $nfs_server_addr --sport $local_port -j REJECT --reject-with tcp-reset (the script removes this iptables rule as soon as the LAST_ACK disappears)
> >>
> >> The bottom line would be to have a packet capture during the outage (client and/or server side); it will show you at least the shape of the TCP exchange when NFS is stuck.
> > Interesting story and good work w.r.t. sleuthing, Youssef, thanks.
> >
> > I looked at Jason's log and it shows everything is ok w.r.t the nfsd threads.
> > (They're just waiting for RPC requests.)
> > However, I do now think I know why the soclose() does not happen.
> > When the TCP connection is assigned as a backchannel, that takes a reference
> > cnt on the structure. This refcnt won't be released until the connection is
> > replaced by a BindConnectionToSession operation from the client. But that won't
> > happen until the client creates a new TCP connection.
> > --> No refcnt release-->no refcnt of 0-->no soclose().
> >
> > I've created the attached patch (completely different from the previous one)
> > that adds soshutdown(SHUT_WR) calls in the three places where the TCP
> > connection is going away. This seems to get it past CLOSE_WAIT without a
> > soclose().
> > --> I know you are not comfortable with patching your server, but I do think
> > this change will get the socket shutdown to complete.
> >
> > There are a couple more things you can check on the server...
> > # nfsstat -E -s
> > --> Look for the count under "BindConnToSes".
> > --> If non-zero, backchannels have been assigned.
> > # sysctl -a | fgrep request_space_throttle_count
> > --> If non-zero, the server has been overloaded at some point.
> >
> > I think the attached patch might work around the problem.
> > The code that should open up the receive window needs to be checked.
> > I am also looking at enabling the 6minute timeout when a backchannel is
> > assigned.
> >
> > rick
> >
> > Youssef
> >
> > _______________________________________________
> > freebsd-net@freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-net
> > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

--
Rod Grimes                                                 rgrimes@freebsd.org
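To make Rod's point concrete: one way to partition without touching cables is
to filter the traffic, which leaves the link up so neither host gets a
link-down hint. A minimal sketch using ipfw on the FreeBSD side (the rule
number and client address are placeholders, and the five-minute window is
arbitrary):

#!/bin/sh
# Simulate a network partition without loss of carrier.
client=192.0.2.20               # placeholder NFS client address

# Partition: silently drop everything to and from the client.
ipfw add 100 deny ip from any to $client
ipfw add 100 deny ip from $client to any

sleep 300                       # stay partitioned for five minutes

# Heal: delete both rules numbered 100.
ipfw delete 100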
From owner-freebsd-net@freebsd.org Sun Apr 4 20:29:04 2021
Delivered-To: freebsd-net@mailman.nyi.freebsd.org
Received: from CAN01-TO1-obe.outbound.protection.outlook.com by mx1.freebsd.org (Postfix) with ESMTPS id 4FD53g23Zwz3wH2; Sun, 4 Apr 2021 20:29:01 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca)
From: Rick Macklem
To: "tuexen@freebsd.org"
CC: "Scheffenegger, Richard", Youssef GHORBAL, "freebsd-net@freebsd.org"
Subject: Re: NFS Mount Hangs
Thread-Topic: NFS Mount Hangs
Date: Sun, 4 Apr 2021 20:28:59 +0000
In-Reply-To: <765CE1CD-6AAB-4BEF-97C6-C2A1F0FF4AC5@freebsd.org>
Content-Type: text/plain; charset="Windows-1252"
MIME-Version: 1.0
List-Id: Networking and TCP/IP with FreeBSD
Oops, yes the packet capture is on freefall (forgot to mention that;-).
You should be able to:
% fetch https://people.freebsd.org/~rmacklem/linuxtofreenfs.pcap

Some useful packet #s are:
1949 - partitioning starts
2005 - partition healed
2060 - last RST
2067 - SYN -> gets going again

This was taken at the Linux end. I have the FreeBSD end too, although I
don't think it tells you anything more.

Have fun with it, rick

________________________________________
From: tuexen@freebsd.org
Sent: Sunday, April 4, 2021 12:41 PM
To: Rick Macklem
Cc: Scheffenegger, Richard; Youssef GHORBAL; freebsd-net@freebsd.org
Subject: Re: NFS Mount Hangs

CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp@uoguelph.ca

> On 4. Apr 2021, at 17:27, Rick Macklem wrote:
>
> [...]
>
> A full packet capture of one of these is in /home/rmacklem/linuxtofreenfs.pcap
> in case anyone wants to look at it.
On freefall? I would like to take a look at it...

Best regards
Michael
> [... remainder of the quoted message snipped; it repeats Rick's 17:27 message, given in full above ...]
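For anyone following along, the capture fetches as shown above, and the span
Rick points at (packets 1949 through 2067) can be pulled out directly; a
small sketch, assuming tshark is available (the display filter is an
illustration, not from the thread):

% fetch https://people.freebsd.org/~rmacklem/linuxtofreenfs.pcap
% tshark -r linuxtofreenfs.pcap -Y 'frame.number >= 1949 && frame.number <= 2070'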
> > 19:10:09.407840 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick= .nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop= ,TS val 2671204927 ecr 2073636037], length 232: NFS request xid 629930901 2= 28 getattr fh 0,1/53 > 19:10:09.615779 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick= .nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop= ,TS val 2671205135 ecr 2073636037], length 232: NFS request xid 629930901 2= 28 getattr fh 0,1/53 > 19:10:09.823780 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick= .nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop= ,TS val 2671205343 ecr 2073636037], length 232: NFS request xid 629930901 2= 28 getattr fh 0,1/53 > *** Lots of lines snipped. > > > 19:13:41.295783 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linu= x.home.rick, length 28 > 19:13:42.319767 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linu= x.home.rick, length 28 > 19:13:46.351966 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linu= x.home.rick, length 28 > 19:13:47.375790 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linu= x.home.rick, length 28 > 19:13:48.399786 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linu= x.home.rick, length 28 > *** Network is now unpartitioned... > > 19:13:48.399990 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (= oui Unknown), length 46 > 19:13:48.400002 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick= .nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val= 2671421871 ecr 0,nop,wscale 7], length 0 > 19:13:48.400185 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex= -mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073855137= ecr 2671204825], length 0 > 19:13:48.400273 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick= .nfsd: Flags [R], seq 964161458, win 0, length 0 > 19:13:49.423833 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick= .nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val= 2671424943 ecr 0,nop,wscale 7], length 0 > 19:13:49.424056 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex= -mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073856161= ecr 2671204825], length 0 > *** This "battle" goes on for 223sec... > I snipped out 13 cycles of this "Linux sends an RST, followed by SYN" > "FreeBSD replies with same old ACK". In another test run I saw this > cycle continue non-stop for several minutes. This time, the Linux > client paused for a while (see ARPs below). > > 19:13:49.424101 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick= .nfsd: Flags [R], seq 964161458, win 0, length 0 > 19:13:53.455867 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick= .nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val= 2671428975 ecr 0,nop,wscale 7], length 0 > 19:13:53.455991 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex= -mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073860193= ecr 2671204825], length 0 > *** Snipped a bunch of stuff out, mostly ARPs, plus one more RST. 
> > 19:16:57.775780 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linu= x.home.rick, length 28 > 19:16:57.775937 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (= oui Unknown), length 46 > 19:16:57.980240 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.= 254, length 46 > 19:16:58.555663 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.= 254, length 46 > 19:17:00.104701 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex= -mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS v= al 2074046846 ecr 2671204825], length 0 > 19:17:15.664354 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex= -mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS v= al 2074062406 ecr 2671204825], length 0 > 19:17:31.239246 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex= -mesh: Flags [R.], seq 202750, ack 212293, win 0, options [nop,nop,TS val 2= 074077981 ecr 2671204825], length 0 > *** FreeBSD finally acknowledges the RST 38sec after Linux sent the last > of 13 (100+ for another test run). > > 19:17:51.535979 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick= .nfsd: Flags [S], seq 4247692373, win 64240, options [mss 1460,sackOK,TS va= l 2671667055 ecr 0,nop,wscale 7], length 0 > 19:17:51.536130 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex= -mesh: Flags [S.], seq 661237469, ack 4247692374, win 65535, options [mss 1= 460,nop,wscale 6,sackOK,TS val 2074098278 ecr 2671667055], length 0 > *** Now back in business... > > 19:17:51.536218 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick= .nfsd: Flags [.], ack 1, win 502, options [nop,nop,TS val 2671667055 ecr 20= 74098278], length 0 > 19:17:51.536295 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick= .nfsd: Flags [P.], seq 1:233, ack 1, win 502, options [nop,nop,TS val 26716= 67056 ecr 2074098278], length 232: NFS request xid 629930901 228 getattr fh= 0,1/53 > 19:17:51.536346 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick= .nfsd: Flags [P.], seq 233:505, ack 1, win 502, options [nop,nop,TS val 267= 1667056 ecr 2074098278], length 272: NFS request xid 697039765 132 getattr = fh 0,1/53 > 19:17:51.536515 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex= -mesh: Flags [.], ack 505, win 29128, options [nop,nop,TS val 2074098279 ec= r 2671667056], length 0 > 19:17:51.536553 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick= .nfsd: Flags [P.], seq 505:641, ack 1, win 502, options [nop,nop,TS val 267= 1667056 ecr 2074098279], length 136: NFS request xid 730594197 132 getattr = fh 0,1/53 > 19:17:51.536562 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex= -mesh: Flags [P.], seq 1:49, ack 505, win 29128, options [nop,nop,TS val 20= 74098279 ecr 2671667056], length 48: NFS reply xid 697039765 reply ok 44 ge= tattr ERROR: unk 10063 > > This error 10063 after the partition heals is also "bad news". It indicat= es the Session > (which is supposed to maintain "exactly once" RPC semantics is broken). I= 'll admit I > suspect a Linux client bug, but will be investigating further. > > So, hopefully TCP conversant folk can confirm if the above is correct beh= aviour > or if the RST should be ack'd sooner? > > I could also see this becoming a "forever" TCP battle for other versions = of Linux client. 
> > rick > > > ________________________________________ > From: Scheffenegger, Richard > Sent: Sunday, April 4, 2021 7:50 AM > To: Rick Macklem; tuexen@freebsd.org > Cc: Youssef GHORBAL; freebsd-net@freebsd.org > Subject: Re: NFS Mount Hangs > > CAUTION: This email originated from outside of the University of Guelph. = Do not click links or open attachments unless you recognize the sender and = know the content is safe. If in doubt, forward suspicious emails to IThelp@= uoguelph.ca > > > For what it=91s worth, suse found two bugs in the linux nfconntrack (stat= eful firewall), and pfifo-fast scheduler, which could conspire to make tcp = sessions hang forever. > > One is a missed updaten when the c=F6ient is not using the noresvport moi= nt option, which makes tje firewall think rsts are illegal (and drop them); > > The fast scheduler can run into an issue if only a single packet should b= e forwarded (note that this is not the default scheduler, but often recomme= nded for perf, as it runs lockless and lower cpu cost that pfq (default). I= f no other/additional packet pushes out that last packet of a flow, it can = become stuck forever... > > I can try getting the relevant bug info next week... > > ________________________________ > Von: owner-freebsd-net@freebsd.org im Auf= trag von Rick Macklem > Gesendet: Friday, April 2, 2021 11:31:01 PM > An: tuexen@freebsd.org > Cc: Youssef GHORBAL ; freebsd-net@freebsd.org= > Betreff: Re: NFS Mount Hangs > > NetApp Security WARNING: This is an external email. Do not click links or= open attachments unless you recognize the sender and know the content is s= afe. > > > > > tuexen@freebsd.org wrote: >>> On 2. Apr 2021, at 02:07, Rick Macklem wrote: >>> >>> I hope you don't mind a top post... >>> I've been testing network partitioning between the only Linux client >>> I have (5.2 kernel) and a FreeBSD server with the xprtdied.patch >>> (does soshutdown(..SHUT_WR) when it knows the socket is broken) >>> applied to it. >>> >>> I'm not enough of a TCP guy to know if this is useful, but here's what >>> I see... >>> >>> While partitioned: >>> On the FreeBSD server end, the socket either goes to CLOSED during >>> the network partition or stays ESTABLISHED. >> If it goes to CLOSED you called shutdown(, SHUT_WR) and the peer also >> sent a FIN, but you never called close() on the socket. >> If the socket stays in ESTABLISHED, there is no communication ongoing, >> I guess, and therefore the server does not even detect that the peer >> is not reachable. >>> On the Linux end, the socket seems to remain ESTABLISHED for a >>> little while, and then disappears. >> So how does Linux detect the peer is not reachable? > Well, here's what I see in a packet capture in the Linux client once > I partition it (just unplug the net cable): > - lots of retransmits of the same segment (with ACK) for 54sec > - then only ARP queries > > Once I plug the net cable back in: > - ARP works > - one more retransmit of the same segement > - receives RST from FreeBSD > ** So, is this now a "new" TCP connection, despite > using the same port#. > --> It matters for NFS, since "new connection" > implies "must retry all outstanding RPCs". > - sends SYN > - receives SYN, ACK from FreeBSD > --> connection starts working again > Always uses same port#. 
> > On the FreeBSD server end: > - receives the last retransmit of the segment (with ACK) > - sends RST > - receives SYN > - sends SYN, ACK > > I thought that there was no RST in the capture I looked at > yesterday, so I'm not sure if FreeBSD always sends an RST, > but the Linux client behaviour was the same. (Sent a SYN, etc). > The socket disappears from the Linux "netstat -a" and I > suspect that happens after about 54sec, but I am not sure > about the timing. > >>> >>> After unpartitioning: >>> On the FreeBSD server end, you get another socket showing up at >>> the same port# >>> Active Internet connections (including servers) >>> Proto Recv-Q Send-Q Local Address Foreign Address (stat= e) >>> tcp4 0 0 nfsv4-new3.nfsd nfsv4-linux.678 ESTAB= LISHED >>> tcp4 0 0 nfsv4-new3.nfsd nfsv4-linux.678 CLOSE= D >>> >>> The Linux client shows the same connection ESTABLISHED. > But disappears from "netstat -a" for a while during the partitioning. > >>> (The mount sometimes reports an error. I haven't looked at packet >>> traces to see if it retries RPCs or why the errors occur.) > I have now done so, as above. > >>> --> However I never get hangs. >>> Sometimes it goes to SYN_SENT for a while and the FreeBSD server >>> shows FIN_WAIT_1, but then both ends go to ESTABLISHED and the >>> mount starts working again. >>> >>> The most obvious thing is that the Linux client always keeps using >>> the same port#. (The FreeBSD client will use a different port# when >>> it does a TCP reconnect after no response from the NFS server for >>> a little while.) >>> >>> What do those TCP conversant think? >> I guess you are you are never calling close() on the socket, for with >> the connection state is CLOSED. > Ok, that makes sense. For this case the Linux client has not done a > BindConnectionToSession to re-assign the back channel. > I'll have to bug them about this. However, I'll bet they'll answer > that I have to tell them the back channel needs re-assignment > or something like that. > > I am pretty certain they are broken, in that the client needs to > retry all outstanding RPCs. > > For others, here's the long winded version of this that I just > put on the phabricator review: > In the server side kernel RPC, the socket (struct socket *) is in a > structure called SVCXPRT (normally pointed to by "xprt"). > These structures a ref counted and the soclose() is done > when the ref. cnt goes to zero. My understanding is that > "struct socket *" is free'd by soclose() so this cannot be done > before the xprt ref. cnt goes to zero. > > For NFSv4.1/4.2 there is something called a back channel > which means that a "xprt" is used for server->client RPCs, > although the TCP connection is established by the client > to the server. > --> This back channel holds a ref cnt on "xprt" until the > > client re-assigns it to a different TCP connection > via an operation called BindConnectionToSession > and the Linux client is not doing this soon enough, > it appears. > > So, the soclose() is delayed, which is why I think the > TCP connection gets stuck in CLOSE_WAIT and that is > why I've added the soshutdown(..SHUT_WR) calls, > which can happen before the client gets around to > re-assigning the back channel. > > Thanks for your help with this Michael, rick > > Best regards > Michael >> >> rick >> ps: I can capture packets while doing this, if anyone has a use >> for them. 
>> >> >> >> >> >> >> ________________________________________ >> From: owner-freebsd-net@freebsd.org on b= ehalf of Youssef GHORBAL >> Sent: Saturday, March 27, 2021 6:57 PM >> To: Jason Breitman >> Cc: Rick Macklem; freebsd-net@freebsd.org >> Subject: Re: NFS Mount Hangs >> >> CAUTION: This email originated from outside of the University of Guelph.= Do not click links or open attachments unless you recognize the sender and= know the content is safe. If in doubt, forward suspicious emails to IThelp= @uoguelph.ca >> >> >> >> >> On 27 Mar 2021, at 13:20, Jason Breitman > wrote: >> >> The issue happened again so we can say that disabling TSO and LRO on the= NIC did not resolve this issue. >> # ifconfig lagg0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso >> # ifconfig lagg0 >> lagg0: flags=3D8943 metr= ic 0 mtu 1500 >> options=3D8100b8 >> >> We can also say that the sysctl settings did not resolve this issue. >> >> # sysctl net.inet.tcp.fast_finwait2_recycle=3D1 >> net.inet.tcp.fast_finwait2_recycle: 0 -> 1 >> >> # sysctl net.inet.tcp.finwait2_timeout=3D1000 >> net.inet.tcp.finwait2_timeout: 60000 -> 1000 >> >> I don=92t think those will do anything in your case since the FIN_WAIT2 = are on the client side and those sysctls are for BSD. >> By the way it seems that Linux recycles automatically TCP sessions in FI= N_WAIT2 after 60 seconds (sysctl net.ipv4.tcp_fin_timeout) >> >> tcp_fin_timeout (integer; default: 60; since Linux 2.2) >> This specifies how many seconds to wait for a final FIN >> packet before the socket is forcibly closed. This is >> strictly a violation of the TCP specification, but >> required to prevent denial-of-service attacks. In Linux >> 2.2, the default value was 180. >> >> So I don=92t get why it stucks in the FIN_WAIT2 state anyway. >> >> You really need to have a packet capture during the outage (client and s= erver side) so you=92ll get over the wire chat and start speculating from t= here. >> No need to capture the beginning of the outage for now. All you have to = do, is run a tcpdump for 10 minutes or so when you notice a client stuck. >> >> * I have not rebooted the NFS Server nor have I restarted nfsd, but do n= ot believe that is required as these settings are at the TCP level and I wo= uld expect new sessions to use the updated settings. >> >> The issue occurred after 5 days following a reboot of the client machine= s. >> I ran the capture information again to make use of the situation. >> >> #!/bin/sh >> >> while true >> do >> /bin/date >> /tmp/nfs-hang.log >> /bin/ps axHl | grep nfsd | grep -v grep >> /tmp/nfs-hang.log >> /usr/bin/procstat -kk 2947 >> /tmp/nfs-hang.log >> /usr/bin/procstat -kk 2944 >> /tmp/nfs-hang.log >> /bin/sleep 60 >> done >> >> >> On the NFS Server >> Active Internet connections (including servers) >> Proto Recv-Q Send-Q Local Address Foreign Address (state= ) >> tcp4 0 0 NFS.Server.IP.X.2049 NFS.Client.IP.X.48286 = CLOSE_WAIT >> >> On the NFS Client >> tcp 0 0 NFS.Client.IP.X:48286 NFS.Server.IP.X:2049 = FIN_WAIT2 >> >> >> >> You had also asked for the output below. >> >> # nfsstat -E -s >> BackChannelCtBindConnToSes >> 0 0 >> >> # sysctl vfs.nfsd.request_space_throttle_count >> vfs.nfsd.request_space_throttle_count: 0 >> >> I see that you are testing a patch and I look forward to seeing the resu= lts. 
>> >> >> Jason Breitman >> >> >> On Mar 21, 2021, at 6:21 PM, Rick Macklem > wrote: >> >> Youssef GHORBAL > wrote: >>> Hi Jason, >>> >>>> On 17 Mar 2021, at 18:17, Jason Breitman > wrote: >>>> >>>> Please review the details below and let me know if there is a setting = that I should apply to my FreeBSD NFS Server or if there is a bug fix that = I can apply to resolve my issue. >>>> I shared this information with the linux-nfs mailing list and they bel= ieve the issue is on the server side. >>>> >>>> Issue >>>> NFSv4 mounts periodically hang on the NFS Client. >>>> >>>> During this time, it is possible to manually mount from another NFS Se= rver on the NFS Client having issues. >>>> Also, other NFS Clients are successfully mounting from the NFS Server = in question. >>>> Rebooting the NFS Client appears to be the only solution. >>> >>> I had experienced a similar weird situation with periodically stuck Lin= ux NFS clients >mounting Isilon NFS servers (Isilon is FreeBSD based but th= ey seem to have there >own nfsd) >> Yes, my understanding is that Isilon uses a proprietary user space nfsd = and >> not the kernel based RPC and nfsd in FreeBSD. >> >>> We=92ve had better luck and we did manage to have packet captures on bo= th sides >during the issue. The gist of it goes like follows: >>> >>> - Data flows correctly between SERVER and the CLIENT >>> - At some point SERVER starts decreasing it's TCP Receive Window until = it reachs 0 >>> - The client (eager to send data) can only ack data sent by SERVER. >>> - When SERVER was done sending data, the client starts sending TCP Wind= ow >Probes hoping that the TCP Window opens again so he can flush its buffe= rs. >>> - SERVER responds with a TCP Zero Window to those probes. >> Having the window size drop to zero is not necessarily incorrect. >> If the server is overloaded (has a backlog of NFS requests), it can stop= doing >> soreceive() on the socket (so the socket rcv buffer can fill up and the = TCP window >> closes). This results in "backpressure" to stop the NFS client from floo= ding the >> NFS server with requests. >> --> However, once the backlog is handled, the nfsd should start to sorec= eive() >> again and this shouls cause the window to open back up. >> --> Maybe this is broken in the socket/TCP code. I quickly got lost in >> tcp_output() when it decides what to do about the rcvwin. >> >>> - After 6 minutes (the NFS server default Idle timeout) SERVER racefull= y closes the >TCP connection sending a FIN Packet (and still a TCP Window 0= ) >> This probably does not happen for Jason's case, since the 6minute timeou= t >> is disabled when the TCP connection is assigned as a backchannel (most l= ikely >> the case for NFSv4.1). >> >>> - CLIENT ACK that FIN. >>> - SERVER goes in FIN_WAIT_2 state >>> - CLIENT closes its half part part of the socket and goes in LAST_ACK s= tate. >>> - FIN is never sent by the client since there still data in its SendQ a= nd receiver TCP >Window is still 0. At this stage the client starts sending= TCP Window Probes again >and again hoping that the server opens its TCP Wi= ndow so it can flush it's buffers >and terminate its side of the socket. >>> - SERVER keeps responding with a TCP Zero Window to those probes. >>> =3D> The last two steps goes on and on for hours/days freezing the NFS = mount bound >to that TCP session. 
>>>
>>> If we had a situation where CLIENT was responsible for closing the TCP Window (and initiating the TCP FIN first) and the server wanted to send data, we'd end up in the same state as you, I think.
>>>
>>> We never found the root cause of why the SERVER decided to close the TCP Window and no longer accept data. The fix on the Isilon side was to recycle the FIN_WAIT_2 sockets more aggressively (net.inet.tcp.fast_finwait2_recycle=1 & net.inet.tcp.finwait2_timeout=5000). Once the socket is recycled, at the next occurrence of a CLIENT TCP Window probe, SERVER sends a RST, triggering the teardown of the session on the client side, a new TCP handshake, etc., and traffic flows again (NFS starts responding).
>>>
>>> To avoid rebooting the client (and before the aggressive FIN_WAIT_2 recycling was implemented on the Isilon side) we added a check script on the client that detects LAST_ACK sockets and, through an iptables rule, enforces a TCP RST. Something like: -A OUTPUT -p tcp -d $nfs_server_addr --sport $local_port -j REJECT --reject-with tcp-reset (the script removes this iptables rule as soon as the LAST_ACK disappears).
>>>
>>> The bottom line would be to have a packet capture during the outage (client and/or server side); it will show you at least the shape of the TCP exchange when NFS is stuck.
>> Interesting story and good work w.r.t. sleuthing, Youssef, thanks.
>>
>> I looked at Jason's log and it shows everything is ok w.r.t the nfsd threads.
>> (They're just waiting for RPC requests.)
>> However, I do now think I know why the soclose() does not happen.
>> When the TCP connection is assigned as a backchannel, that takes a reference
>> cnt on the structure. This refcnt won't be released until the connection is
>> replaced by a BindConnectionToSession operation from the client. But that won't
>> happen until the client creates a new TCP connection.
>> --> No refcnt release --> no refcnt of 0 --> no soclose().
>>
>> I've created the attached patch (completely different from the previous one)
>> that adds soshutdown(SHUT_WR) calls in the three places where the TCP
>> connection is going away. This seems to get it past CLOSE_WAIT without a
>> soclose().
>> --> I know you are not comfortable with patching your server, but I do think
>>     this change will get the socket shutdown to complete.
>>
>> There are a couple more things you can check on the server...
>> # nfsstat -E -s
>> --> Look for the count under "BindConnToSes".
>> --> If non-zero, backchannels have been assigned.
>> # sysctl -a | fgrep request_space_throttle_count
>> --> If non-zero, the server has been overloaded at some point.
>>
>> I think the attached patch might work around the problem.
>> The code that should open up the receive window needs to be checked.
>> I am also looking at enabling the 6-minute timeout when a backchannel is
>> assigned.
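The two server-side checks listed above can be run together; a trivial wrapper, sketched under the assumption that nfsstat output labels match the thread (they may differ slightly between FreeBSD versions):

#!/bin/sh
# Non-zero BindConnToSes  => backchannels have been assigned.
# Non-zero throttle count => the server has been overloaded at some point.
nfsstat -E -s | grep -B1 -A1 BindConnToSes
sysctl vfs.nfsd.request_space_throttle_count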
>>
>> rick
>>
>> Youssef
>>
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@freebsd.org Sun Apr 4 20:47:19 2021
From: Rick Macklem
To: "Rodney W. Grimes"
Cc: "Scheffenegger, Richard", "tuexen@freebsd.org", Youssef GHORBAL, "freebsd-net@freebsd.org"
Subject: Re: NFS Mount Hangs
Date: Sun, 4 Apr 2021 20:47:16 +0000
In-Reply-To: <202104041727.134HRTbA097115@gndrsh.dnsmgr.net>
Rodney W. Grimes wrote:
>And I'll follow the lead to top post, as I have been quietly following
>this thread, trying to only add when I think I have relevant input, and
>I think I do on a small point...
>
>Rick,
>    Your "unplugging" a cable to simulate network partitioning,
>in my experience this is a bad way to do that, as the host gets to
>see the link layer go down and knows it can not send.
I unplug the server end and normally capture packets at the
Linux client end.

> I am actually
>puzzled that you see arp packets, but I guess those are getting
>picked off before the interface layer silently tosses them on
>the ground.
>IIRC due to this loss of link layer you may be
>masking some things that would occur in other situations, as often
>an error is returned to the application layer. IE the ONLY packet
>you're likely to see on an unplugged cable is "arp".
The FreeBSD server end, where I unplug, does not seem to notice
at the link level (thanks to the intel net driver).
I do not even see a "loss of carrier" type message when I do it.

>I can suggest other means to partition, such as configuring a switch
>port in and out of the correct LAN/VLAN, or a physical switch in the TX
>pair to open it, but leave the RX pair intact so carrier is not lost.
My switch is just the nat gateway the phone company provides.
I can log into it with a web browser, but have only done so once
in 2.5 years since I got it.
It is also doing very important stuff during the testing, like streaming
the Mandalorian.

>Both of these simulate partitioning that is more realistic, AND do
>not have the side effect of allowing upper layers to eat the packets
>before bpf can grab them, or be told that partitioning has occurred.
Well, if others feel that something like the above will be useful, I might try.

>Another side effect of unplugging a cable is that a host should
>immediately invalidate all ARP entries on that interface... hence
>your getting into an "arp who-has" situation that should not even
>start for 5 minutes in the other failure modes.
The Linux client keeps spitting out arp queries, so that gets fixed
almost instantly when the cable gets plugged back in.

In general, the kernel RPC sees very little of what is going on.
Sometimes an EPIPE when it tries to use a socket after it has
closed down.

rick

Regards,
Rod

> Well, I'm going to cheat and top post, since this is related info. and
> not really part of the discussion...
>
> I've been testing network partitioning between a Linux client (5.2 kernel)
> and a FreeBSD-current NFS server.
> I have not gotten a solid hang, but
> I have had the Linux client doing "battle" with the FreeBSD server for
> several minutes after un-partitioning the connection.
>
> The battle basically consists of the Linux client sending an RST, followed
> by a SYN.
> The FreeBSD server ignores the RST and just replies with the same old ack.
> --> This varies from "just a SYN" that succeeds to 100+ cycles of the above
>     over several minutes.
>
> I had thought that an RST was a "pretty heavy hammer", but FreeBSD seems
> pretty good at ignoring it.
>
> A full packet capture of one of these is in /home/rmacklem/linuxtofreenfs.pcap
> in case anyone wants to look at it.
>
> Here's a tcpdump snippet of the interesting part (see the *** comments):
> 19:10:09.305775 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 202585:202749, ack 212293, win 29128, options [nop,nop,TS val 2073636037 ecr 2671204825], length 164: NFS reply xid 613153685 reply ok 160 getattr NON 4 ids 0/33554432 sz 0
> 19:10:09.305850 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 202749, win 501, options [nop,nop,TS val 2671204825 ecr 2073636037], length 0
> *** Network is now partitioned...
>
> 19:10:09.407840 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671204927 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
> 19:10:09.615779 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205135 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
> 19:10:09.823780 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205343 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
> *** Lots of lines snipped.
>
>
> 19:13:41.295783 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
> 19:13:42.319767 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
> 19:13:46.351966 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
> 19:13:47.375790 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
> 19:13:48.399786 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
> *** Network is now unpartitioned...
>
> 19:13:48.399990 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
> 19:13:48.400002 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671421871 ecr 0,nop,wscale 7], length 0
> 19:13:48.400185 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073855137 ecr 2671204825], length 0
> 19:13:48.400273 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
> 19:13:49.423833 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671424943 ecr 0,nop,wscale 7], length 0
> 19:13:49.424056 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073856161 ecr 2671204825], length 0
> *** This "battle" goes on for 223sec...
>     I snipped out 13 cycles of this "Linux sends an RST, followed by SYN"
>     "FreeBSD replies with same old ACK". In another test run I saw this
>     cycle continue non-stop for several minutes. This time, the Linux
>     client paused for a while (see ARPs below).
>
> 19:13:49.424101 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
> 19:13:53.455867 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671428975 ecr 0,nop,wscale 7], length 0
> 19:13:53.455991 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073860193 ecr 2671204825], length 0
> *** Snipped a bunch of stuff out, mostly ARPs, plus one more RST.
>
> 19:16:57.775780 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
> 19:16:57.775937 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
> 19:16:57.980240 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
> 19:16:58.555663 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
> 19:17:00.104701 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074046846 ecr 2671204825], length 0
> 19:17:15.664354 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074062406 ecr 2671204825], length 0
> 19:17:31.239246 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [R.], seq 202750, ack 212293, win 0, options [nop,nop,TS val 2074077981 ecr 2671204825], length 0
> *** FreeBSD finally acknowledges the RST 38sec after Linux sent the last
>     of 13 (100+ for another test run).
>
> 19:17:51.535979 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 4247692373, win 64240, options [mss 1460,sackOK,TS val 2671667055 ecr 0,nop,wscale 7], length 0
> 19:17:51.536130 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [S.], seq 661237469, ack 4247692374, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2074098278 ecr 2671667055], length 0
> *** Now back in business...
>
> 19:17:51.536218 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 1, win 502, options [nop,nop,TS val 2671667055 ecr 2074098278], length 0
> 19:17:51.536295 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 1:233, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
> 19:17:51.536346 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 233:505, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 272: NFS request xid 697039765 132 getattr fh 0,1/53
> 19:17:51.536515 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 0
> 19:17:51.536553 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 505:641, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098279], length 136: NFS request xid 730594197 132 getattr fh 0,1/53
> 19:17:51.536562 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 1:49, ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 48: NFS reply xid 697039765 reply ok 44 getattr ERROR: unk 10063
>
> This error 10063 after the partition heals is also "bad news". It indicates the Session
> (which is supposed to maintain "exactly once" RPC semantics) is broken. I'll admit I
> suspect a Linux client bug, but will be investigating further.
>
> So, hopefully TCP conversant folk can confirm if the above is correct behaviour
> or if the RST should be ack'd sooner?
>
> I could also see this becoming a "forever" TCP battle for other versions of Linux client.
>
> rick
>
>
> ________________________________________
> From: Scheffenegger, Richard
> Sent: Sunday, April 4, 2021 7:50 AM
> To: Rick Macklem; tuexen@freebsd.org
> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
> Subject: Re: NFS Mount Hangs
>
> For what it's worth, SUSE found two bugs in the Linux nf_conntrack (stateful firewall) and the pfifo-fast scheduler, which could conspire to make TCP sessions hang forever.
>
> One is a missed update when the client is not using the noresvport mount option, which makes the firewall think RSTs are illegal (and drop them);
>
> The fast scheduler can run into an issue if only a single packet should be forwarded (note that this is not the default scheduler, but often recommended for perf, as it runs lockless and at lower cpu cost than pfq (the default)). If no other/additional packet pushes out that last packet of a flow, it can become stuck forever...
>
> I can try getting the relevant bug info next week...
>
> ________________________________
> From: owner-freebsd-net@freebsd.org on behalf of Rick Macklem
> Sent: Friday, April 2, 2021 11:31:01 PM
> To: tuexen@freebsd.org
> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
> Subject: Re: NFS Mount Hangs
>
> tuexen@freebsd.org wrote:
> >> On 2. Apr 2021, at 02:07, Rick Macklem wrote:
> >>
> >> I hope you don't mind a top post...
> >> I've been testing network partitioning between the only Linux client
> >> I have (5.2 kernel) and a FreeBSD server with the xprtdied.patch
> >> (does soshutdown(..SHUT_WR) when it knows the socket is broken)
> >> applied to it.
> >>
> >> I'm not enough of a TCP guy to know if this is useful, but here's what
> >> I see...
> >>
> >> While partitioned:
> >> On the FreeBSD server end, the socket either goes to CLOSED during
> >> the network partition or stays ESTABLISHED.
> >If it goes to CLOSED you called shutdown(, SHUT_WR) and the peer also
> >sent a FIN, but you never called close() on the socket.
> >If the socket stays in ESTABLISHED, there is no communication ongoing,
> >I guess, and therefore the server does not even detect that the peer
> >is not reachable.
> >> On the Linux end, the socket seems to remain ESTABLISHED for a
> >> little while, and then disappears.
> >So how does Linux detect the peer is not reachable?
> Well, here's what I see in a packet capture in the Linux client once
> I partition it (just unplug the net cable):
> - lots of retransmits of the same segment (with ACK) for 54sec
> - then only ARP queries
>
> Once I plug the net cable back in:
> - ARP works
> - one more retransmit of the same segment
> - receives RST from FreeBSD
> ** So, is this now a "new" TCP connection, despite
>    using the same port#.
> --> It matters for NFS, since "new connection"
>     implies "must retry all outstanding RPCs".
> - sends SYN
> - receives SYN, ACK from FreeBSD
> --> connection starts working again
>     Always uses same port#.
>
> On the FreeBSD server end:
> - receives the last retransmit of the segment (with ACK)
> - sends RST
> - receives SYN
> - sends SYN, ACK
>
> I thought that there was no RST in the capture I looked at
> yesterday, so I'm not sure if FreeBSD always sends an RST,
> but the Linux client behaviour was the same. (Sent a SYN, etc).
> The socket disappears from the Linux "netstat -a" and I
> suspect that happens after about 54sec, but I am not sure
> about the timing.
>
> >>
> >> After unpartitioning:
> >> On the FreeBSD server end, you get another socket showing up at
> >> the same port#
> >> Active Internet connections (including servers)
> >> Proto Recv-Q Send-Q Local Address    Foreign Address   (state)
> >> tcp4       0      0 nfsv4-new3.nfsd  nfsv4-linux.678   ESTABLISHED
> >> tcp4       0      0 nfsv4-new3.nfsd  nfsv4-linux.678   CLOSED
> >>
> >> The Linux client shows the same connection ESTABLISHED.
> But disappears from "netstat -a" for a while during the partitioning.
>
> >> (The mount sometimes reports an error. I haven't looked at packet
> >> traces to see if it retries RPCs or why the errors occur.)
> I have now done so, as above.
>
> >> --> However I never get hangs.
> >> Sometimes it goes to SYN_SENT for a while and the FreeBSD server
> >> shows FIN_WAIT_1, but then both ends go to ESTABLISHED and the
> >> mount starts working again.
> >>
> >> The most obvious thing is that the Linux client always keeps using
> >> the same port#. (The FreeBSD client will use a different port# when
> >> it does a TCP reconnect after no response from the NFS server for
> >> a little while.)
> >>
> >> What do those TCP conversant think?
> >I guess you are never calling close() on the socket, for which
> >the connection state is CLOSED.
> Ok, that makes sense. For this case the Linux client has not done a
> BindConnectionToSession to re-assign the back channel.
> I'll have to bug them about this. However, I'll bet they'll answer
> that I have to tell them the back channel needs re-assignment
> or something like that.
>
> I am pretty certain they are broken, in that the client needs to
> retry all outstanding RPCs.
>
> For others, here's the long-winded version of this that I just
> put on the phabricator review:
> In the server side kernel RPC, the socket (struct socket *) is in a
> structure called SVCXPRT (normally pointed to by "xprt").
> These structures are ref counted and the soclose() is done
> when the ref. cnt goes to zero. My understanding is that
> "struct socket *" is free'd by soclose() so this cannot be done
> before the xprt ref. cnt goes to zero.
>
> For NFSv4.1/4.2 there is something called a back channel
> which means that a "xprt" is used for server->client RPCs,
> although the TCP connection is established by the client
> to the server.
> --> This back channel holds a ref cnt on "xprt" until the
>     client re-assigns it to a different TCP connection
>     via an operation called BindConnectionToSession,
>     and the Linux client is not doing this soon enough,
>     it appears.
>
> So, the soclose() is delayed, which is why I think the
> TCP connection gets stuck in CLOSE_WAIT and that is
> why I've added the soshutdown(..SHUT_WR) calls,
> which can happen before the client gets around to
> re-assigning the back channel.
>
> Thanks for your help with this Michael, rick
>
> Best regards
> Michael
> >
> > rick
> > ps: I can capture packets while doing this, if anyone has a use
> > for them.
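As a quick way to spot the delayed-soclose() symptom Rick describes, one could watch the server for NFS connections lingering in CLOSE_WAIT; a sketch only, assuming NFS on port 2049 and the usual FreeBSD netstat column layout:

#!/bin/sh
# List NFS (port 2049) TCP connections stuck in CLOSE_WAIT, i.e.
# sockets the peer has closed but that nfsd has not soclose()'d yet.
netstat -an -p tcp | awk '$4 ~ /\.2049$/ && $6 == "CLOSE_WAIT"'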
If in doubt, forward suspicious emails to IThel= p@uoguelph.ca=0A= > >=0A= > >=0A= > >=0A= > >=0A= > > On 27 Mar 2021, at 13:20, Jason Breitman > wrote:=0A= > >=0A= > > The issue happened again so we can say that disabling TSO and LRO on th= e NIC did not resolve this issue.=0A= > > # ifconfig lagg0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso= =0A= > > # ifconfig lagg0=0A= > > lagg0: flags=3D8943 met= ric 0 mtu 1500=0A= > > options=3D8100b8=0A= > >=0A= > > We can also say that the sysctl settings did not resolve this issue.=0A= > >=0A= > > # sysctl net.inet.tcp.fast_finwait2_recycle=3D1=0A= > > net.inet.tcp.fast_finwait2_recycle: 0 -> 1=0A= > >=0A= > > # sysctl net.inet.tcp.finwait2_timeout=3D1000=0A= > > net.inet.tcp.finwait2_timeout: 60000 -> 1000=0A= > >=0A= > > I don?t think those will do anything in your case since the FIN_WAIT2 a= re on the client side and those sysctls are for BSD.=0A= > > By the way it seems that Linux recycles automatically TCP sessions in F= IN_WAIT2 after 60 seconds (sysctl net.ipv4.tcp_fin_timeout)=0A= > >=0A= > > tcp_fin_timeout (integer; default: 60; since Linux 2.2)=0A= > > This specifies how many seconds to wait for a final FIN=0A= > > packet before the socket is forcibly closed. This is=0A= > > strictly a violation of the TCP specification, but=0A= > > required to prevent denial-of-service attacks. In Linux= =0A= > > 2.2, the default value was 180.=0A= > >=0A= > > So I don?t get why it stucks in the FIN_WAIT2 state anyway.=0A= > >=0A= > > You really need to have a packet capture during the outage (client and = server side) so you?ll get over the wire chat and start speculating from th= ere.=0A= > > No need to capture the beginning of the outage for now. All you have to= do, is run a tcpdump for 10 minutes or so when you notice a client stuck.= =0A= > >=0A= > > * I have not rebooted the NFS Server nor have I restarted nfsd, but do = not believe that is required as these settings are at the TCP level and I w= ould expect new sessions to use the updated settings.=0A= > >=0A= > > The issue occurred after 5 days following a reboot of the client machin= es.=0A= > > I ran the capture information again to make use of the situation.=0A= > >=0A= > > #!/bin/sh=0A= > >=0A= > > while true=0A= > > do=0A= > > /bin/date >> /tmp/nfs-hang.log=0A= > > /bin/ps axHl | grep nfsd | grep -v grep >> /tmp/nfs-hang.log=0A= > > /usr/bin/procstat -kk 2947 >> /tmp/nfs-hang.log=0A= > > /usr/bin/procstat -kk 2944 >> /tmp/nfs-hang.log=0A= > > /bin/sleep 60=0A= > > done=0A= > >=0A= > >=0A= > > On the NFS Server=0A= > > Active Internet connections (including servers)=0A= > > Proto Recv-Q Send-Q Local Address Foreign Address (stat= e)=0A= > > tcp4 0 0 NFS.Server.IP.X.2049 NFS.Client.IP.X.48286 = CLOSE_WAIT=0A= > >=0A= > > On the NFS Client=0A= > > tcp 0 0 NFS.Client.IP.X:48286 NFS.Server.IP.X:2049 = FIN_WAIT2=0A= > >=0A= > >=0A= > >=0A= > > You had also asked for the output below.=0A= > >=0A= > > # nfsstat -E -s=0A= > > BackChannelCtBindConnToSes=0A= > > 0 0=0A= > >=0A= > > # sysctl vfs.nfsd.request_space_throttle_count=0A= > > vfs.nfsd.request_space_throttle_count: 0=0A= > >=0A= > > I see that you are testing a patch and I look forward to seeing the res= ults.=0A= > >=0A= > >=0A= > > Jason Breitman=0A= > >=0A= > >=0A= > > On Mar 21, 2021, at 6:21 PM, Rick Macklem > wrote:=0A= > >=0A= > > Youssef GHORBAL > wrote:=0A= > >> Hi Jason,=0A= > >>=0A= > >>> On 17 Mar 2021, at 18:17, Jason Breitman > wrote:=0A= > >>>=0A= > >>> Please review the details below and let me know if there is a 
setting= that I should apply to my FreeBSD NFS Server or if there is a bug fix that= I can apply to resolve my issue.=0A= > >>> I shared this information with the linux-nfs mailing list and they be= lieve the issue is on the server side.=0A= > >>>=0A= > >>> Issue=0A= > >>> NFSv4 mounts periodically hang on the NFS Client.=0A= > >>>=0A= > >>> During this time, it is possible to manually mount from another NFS S= erver on the NFS Client having issues.=0A= > >>> Also, other NFS Clients are successfully mounting from the NFS Server= in question.=0A= > >>> Rebooting the NFS Client appears to be the only solution.=0A= > >>=0A= > >> I had experienced a similar weird situation with periodically stuck Li= nux NFS clients >mounting Isilon NFS servers (Isilon is FreeBSD based but t= hey seem to have there >own nfsd)=0A= > > Yes, my understanding is that Isilon uses a proprietary user space nfsd= and=0A= > > not the kernel based RPC and nfsd in FreeBSD.=0A= > >=0A= > >> We?ve had better luck and we did manage to have packet captures on bot= h sides >during the issue. The gist of it goes like follows:=0A= > >>=0A= > >> - Data flows correctly between SERVER and the CLIENT=0A= > >> - At some point SERVER starts decreasing it's TCP Receive Window until= it reachs 0=0A= > >> - The client (eager to send data) can only ack data sent by SERVER.=0A= > >> - When SERVER was done sending data, the client starts sending TCP Win= dow >Probes hoping that the TCP Window opens again so he can flush its buff= ers.=0A= > >> - SERVER responds with a TCP Zero Window to those probes.=0A= > > Having the window size drop to zero is not necessarily incorrect.=0A= > > If the server is overloaded (has a backlog of NFS requests), it can sto= p doing=0A= > > soreceive() on the socket (so the socket rcv buffer can fill up and the= TCP window=0A= > > closes). This results in "backpressure" to stop the NFS client from flo= oding the=0A= > > NFS server with requests.=0A= > > --> However, once the backlog is handled, the nfsd should start to sore= ceive()=0A= > > again and this shouls cause the window to open back up.=0A= > > --> Maybe this is broken in the socket/TCP code. I quickly got lost in= =0A= > > tcp_output() when it decides what to do about the rcvwin.=0A= > >=0A= > >> - After 6 minutes (the NFS server default Idle timeout) SERVER raceful= ly closes the >TCP connection sending a FIN Packet (and still a TCP Window = 0)=0A= > > This probably does not happen for Jason's case, since the 6minute timeo= ut=0A= > > is disabled when the TCP connection is assigned as a backchannel (most = likely=0A= > > the case for NFSv4.1).=0A= > >=0A= > >> - CLIENT ACK that FIN.=0A= > >> - SERVER goes in FIN_WAIT_2 state=0A= > >> - CLIENT closes its half part part of the socket and goes in LAST_ACK = state.=0A= > >> - FIN is never sent by the client since there still data in its SendQ = and receiver TCP >Window is still 0. 
At this stage the client starts sendin= g TCP Window Probes again >and again hoping that the server opens its TCP W= indow so it can flush it's buffers >and terminate its side of the socket.= =0A= > >> - SERVER keeps responding with a TCP Zero Window to those probes.=0A= > >> =3D> The last two steps goes on and on for hours/days freezing the NFS= mount bound >to that TCP session.=0A= > >>=0A= > >> If we had a situation where CLIENT was responsible for closing the TCP= Window (and >initiating the TCP FIN first) and server wanting to send data= we?ll end up in the same >state as you I think.=0A= > >>=0A= > >> We?ve never had the root cause of why the SERVER decided to close the = TCP >Window and no more acccept data, the fix on the Isilon part was to rec= ycle more >aggressively the FIN_WAIT_2 sockets (net.inet.tcp.fast_finwait2_= recycle=3D1 & >net.inet.tcp.finwait2_timeout=3D5000). Once the socket recyc= led and at the next >occurence of CLIENT TCP Window probe, SERVER sends a R= ST, triggering the >teardown of the session on the client side, a new TCP h= andchake, etc and traffic >flows again (NFS starts responding)=0A= > >>=0A= > >> To avoid rebooting the client (and before the aggressive FIN_WAIT_2 wa= s >implemented on the Isilon side) we?ve added a check script on the client= that detects >LAST_ACK sockets on the client and through iptables rule enf= orces a TCP RST, >Something like: -A OUTPUT -p tcp -d $nfs_server_addr --sp= ort $local_port -j REJECT >--reject-with tcp-reset (the script removes this= iptables rule as soon as the LAST_ACK >disappears)=0A= > >>=0A= > >> The bottom line would be to have a packet capture during the outage (c= lient and/or >server side), it will show you at least the shape of the TCP = exchange when NFS is >stuck.=0A= > > Interesting story and good work w.r.t. sluething, Youssef, thanks.=0A= > >=0A= > > I looked at Jason's log and it shows everything is ok w.r.t the nfsd th= reads.=0A= > > (They're just waiting for RPC requests.)=0A= > > However, I do now think I know why the soclose() does not happen.=0A= > > When the TCP connection is assigned as a backchannel, that takes a refe= rence=0A= > > cnt on the structure. This refcnt won't be released until the connectio= n is=0A= > > replaced by a BindConnectiotoSession operation from the client. But tha= t won't=0A= > > happen until the client creates a new TCP connection.=0A= > > --> No refcnt release-->no refcnt of 0-->no soclose().=0A= > >=0A= > > I've created the attached patch (completely different from the previous= one)=0A= > > that adds soshutdown(SHUT_WR) calls in the three places where the TCP= =0A= > > connection is going away. 
This seems to get it past CLOSE_WAIT without = a=0A= > > soclose().=0A= > > --> I know you are not comfortable with patching your server, but I do = think=0A= > > this change will get the socket shutdown to complete.=0A= > >=0A= > > There are a couple more things you can check on the server...=0A= > > # nfsstat -E -s=0A= > > --> Look for the count under "BindConnToSes".=0A= > > --> If non-zero, backchannels have been assigned=0A= > > # sysctl -a | fgrep request_space_throttle_count=0A= > > --> If non-zero, the server has been overloaded at some point.=0A= > >=0A= > > I think the attached patch might work around the problem.=0A= > > The code that should open up the receive window needs to be checked.=0A= > > I am also looking at enabling the 6minute timeout when a backchannel is= =0A= > > assigned.=0A= > >=0A= > > rick=0A= > >=0A= > > Youssef=0A= > >=0A= > > _______________________________________________=0A= > > freebsd-net@freebsd.org mailing list=0A= > > https://urldefense.com/v3/__https://lists.freebsd.org/mailman/listinfo/= freebsd-net__;!!JFdNOqOXpB6UZW0!_c2MFNbir59GXudWPVdE5bNBm-qqjXeBuJ2UEmFv5OZ= ciLj4ObR_drJNv5yryaERfIbhKR2d$=0A= > > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"=0A= > > =0A= > >=0A= > > =0A= > >=0A= > > _______________________________________________=0A= > > freebsd-net@freebsd.org mailing list=0A= > > https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= > > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"= =0A= > > _______________________________________________=0A= > > freebsd-net@freebsd.org mailing list=0A= > > https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= > > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"= =0A= >=0A= > _______________________________________________=0A= > freebsd-net@freebsd.org mailing list=0A= > https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"=0A= > _______________________________________________=0A= > freebsd-net@freebsd.org mailing list=0A= > https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"=0A= >=0A= >=0A= =0A= --=0A= Rod Grimes rgrimes@freebsd.= org=0A= _______________________________________________=0A= freebsd-net@freebsd.org mailing list=0A= https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"=0A= =0A= From owner-freebsd-net@freebsd.org Sun Apr 4 21:00:03 2021 Return-Path: Delivered-To: freebsd-net@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 61D905B4C29 for ; Sun, 4 Apr 2021 21:00:03 +0000 (UTC) (envelope-from bugzilla-noreply@FreeBSD.org) Received: from mailman.nyi.freebsd.org (mailman.nyi.freebsd.org [IPv6:2610:1c1:1:606c::50:13]) by mx1.freebsd.org (Postfix) with ESMTP id 4FD5lR22dwz4SFs for ; Sun, 4 Apr 2021 21:00:03 +0000 (UTC) (envelope-from bugzilla-noreply@FreeBSD.org) Received: by mailman.nyi.freebsd.org (Postfix) id 427675B4D1A; Sun, 4 Apr 2021 21:00:03 +0000 (UTC) Delivered-To: net@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 421BB5B4D19 for ; Sun, 4 Apr 2021 21:00:03 +0000 (UTC) (envelope-from bugzilla-noreply@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org 
From: bugzilla-noreply@FreeBSD.org
To: net@FreeBSD.org
Subject: Problem reports for net@FreeBSD.org that need special attention
Date: Sun, 4 Apr 2021 21:00:03 +0000

To view an individual PR, use:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=(Bug Id).

The following is a listing of current problems submitted by FreeBSD users,
which need special attention. These represent problem reports covering
all versions including experimental development code and obsolete releases.

Status      |    Bug Id | Description
------------+-----------+---------------------------------------------------
In Progress |    221146 | [ixgbe] Problem with second laggport
New         |    204438 | setsockopt() handling of kern.ipc.maxsockbuf limi
New         |    213410 | [carp] service netif restart causes hang only whe
Open        |      7556 | ppp: sl_compress_init() will fail if called anyth
Open        |    166724 | if_re(4): watchdog timeout
Open        |    193452 | Dell PowerEdge 210 II -- Kernel panic bce (broadc
Open        |    194453 | dummynet(4): pipe config bw parameter limited to
Open        |    200319 | Bridge+CARP crashes/freezes
Open        |    202510 | [CARP] advertisements sourced from CARP IP cause
Open        |    207261 | netmap: Doesn't do TX sync with kqueue
Open        |    217978 | dhclient: Support supersede statement for option
Open        |    222273 | igb(4): Kernel panic (fatal trap 12) due to netwo
Open        |    225438 | panic in6_unlink_ifa() due to race
Open        |    227720 | Kernel panic in ppp server
Open        |    230807 | if_alc(4): Driver not working for Killer Networki
Open        |    236888 | ppp daemon: Allow MTU to be overridden for PPPoE
Open        |    236983 | bnxt(4) VLAN not operational unless explicit "ifc
Open        |    237072 | netgraph(4): performance issue [on HardenedBSD]?
Open        |    237840 | Removed dummynet dependency on ipfw
Open        |    238324 | Add XG-C100C/AQtion AQC107 10GbE NIC driver
Open        |    240944 | em(4): Crash with Intel 82571EB NIC with AMD Pile
Open        |    240969 | netinet6: Neighbour reachability detection broken
Open        |    241106 | tun/ppp: panic: vm_fault: fault on nofault entry
Open        |    241162 | Panic in closefp() triggered by nginx (uwsgi with
Open        |    243463 | ix0: Watchdog timeout
Open        |    244066 | divert: Add sysctls for divert socket send and re
Open        |    118111 | rc: network.subr Add MAC address based interface

27 problems total for which you should take action.

From owner-freebsd-net@freebsd.org Sun Apr 4 22:08:48 2021
From: tuexen@freebsd.org
To: Rick Macklem
Cc: "Scheffenegger, Richard", Youssef GHORBAL, "freebsd-net@freebsd.org"
Subject: Re: NFS Mount Hangs
Date: Mon, 5 Apr 2021 00:08:40 +0200

> On 4. Apr 2021, at 22:28, Rick Macklem wrote:
>
> Oops, yes the packet capture is on freefall (forgot to mention that ;-).
> You should be able to:
> % fetch https://people.freebsd.org/~rmacklem/linuxtofreenfs.pcap
>
> Some useful packet #s are:
> 1949 - partitioning starts
> 2005 - partition healed
> 2060 - last RST
> 2067 - SYN -> gets going again
>
> This was taken at the Linux end. I have the FreeBSD end too, although I
> don't think it tells you anything more.
Hi Rick,

I would like to look at the FreeBSD side, too. Do you also know what
state the TCP connection was in when the SYN / ACK / RST game was
going on?
I would like to understand why the reestablishment of the connection did not work... Best regards Michael >=20 > Have fun with it, rick >=20 >=20 > ________________________________________ > From: tuexen@freebsd.org > Sent: Sunday, April 4, 2021 12:41 PM > To: Rick Macklem > Cc: Scheffenegger, Richard; Youssef GHORBAL; freebsd-net@freebsd.org > Subject: Re: NFS Mount Hangs >=20 > CAUTION: This email originated from outside of the University of = Guelph. Do not click links or open attachments unless you recognize the = sender and know the content is safe. If in doubt, forward suspicious = emails to IThelp@uoguelph.ca >=20 >=20 >> On 4. Apr 2021, at 17:27, Rick Macklem wrote: >>=20 >> Well, I'm going to cheat and top post, since this is elated info. and >> not really part of the discussion... >>=20 >> I've been testing network partitioning between a Linux client (5.2 = kernel) >> and a FreeBSD-current NFS server. I have not gotten a solid hang, but >> I have had the Linux client doing "battle" with the FreeBSD server = for >> several minutes after un-partitioning the connection. >>=20 >> The battle basically consists of the Linux client sending an RST, = followed >> by a SYN. >> The FreeBSD server ignores the RST and just replies with the same old = ack. >> --> This varies from "just a SYN" that succeeds to 100+ cycles of the = above >> over several minutes. >>=20 >> I had thought that an RST was a "pretty heavy hammer", but FreeBSD = seems >> pretty good at ignoring it. >>=20 >> A full packet capture of one of these is in = /home/rmacklem/linuxtofreenfs.pcap >> in case anyone wants to look at it. > On freefall? I would like to take a look at it... >=20 > Best regards > Michael >>=20 >> Here's a tcpdump snippet of the interesting part (see the *** = comments): >> 19:10:09.305775 IP nfsv4-new3.home.rick.nfsd > = nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 202585:202749, ack = 212293, win 29128, options [nop,nop,TS val 2073636037 ecr 2671204825], = length 164: NFS reply xid 613153685 reply ok 160 getattr NON 4 ids = 0/33554432 sz 0 >> 19:10:09.305850 IP nfsv4-linux.home.rick.apex-mesh > = nfsv4-new3.home.rick.nfsd: Flags [.], ack 202749, win 501, options = [nop,nop,TS val 2671204825 ecr 2073636037], length 0 >> *** Network is now partitioned... >>=20 >> 19:10:09.407840 IP nfsv4-linux.home.rick.apex-mesh > = nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, = win 501, options [nop,nop,TS val 2671204927 ecr 2073636037], length 232: = NFS request xid 629930901 228 getattr fh 0,1/53 >> 19:10:09.615779 IP nfsv4-linux.home.rick.apex-mesh > = nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, = win 501, options [nop,nop,TS val 2671205135 ecr 2073636037], length 232: = NFS request xid 629930901 228 getattr fh 0,1/53 >> 19:10:09.823780 IP nfsv4-linux.home.rick.apex-mesh > = nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, = win 501, options [nop,nop,TS val 2671205343 ecr 2073636037], length 232: = NFS request xid 629930901 228 getattr fh 0,1/53 >> *** Lots of lines snipped. 
>>=20 >>=20 >> 19:13:41.295783 ARP, Request who-has nfsv4-new3.home.rick tell = nfsv4-linux.home.rick, length 28 >> 19:13:42.319767 ARP, Request who-has nfsv4-new3.home.rick tell = nfsv4-linux.home.rick, length 28 >> 19:13:46.351966 ARP, Request who-has nfsv4-new3.home.rick tell = nfsv4-linux.home.rick, length 28 >> 19:13:47.375790 ARP, Request who-has nfsv4-new3.home.rick tell = nfsv4-linux.home.rick, length 28 >> 19:13:48.399786 ARP, Request who-has nfsv4-new3.home.rick tell = nfsv4-linux.home.rick, length 28 >> *** Network is now unpartitioned... >>=20 >> 19:13:48.399990 ARP, Reply nfsv4-new3.home.rick is-at = d4:be:d9:07:81:72 (oui Unknown), length 46 >> 19:13:48.400002 IP nfsv4-linux.home.rick.apex-mesh > = nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options = [mss 1460,sackOK,TS val 2671421871 ecr 0,nop,wscale 7], length 0 >> 19:13:48.400185 IP nfsv4-new3.home.rick.nfsd > = nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, = options [nop,nop,TS val 2073855137 ecr 2671204825], length 0 >> 19:13:48.400273 IP nfsv4-linux.home.rick.apex-mesh > = nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0 >> 19:13:49.423833 IP nfsv4-linux.home.rick.apex-mesh > = nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options = [mss 1460,sackOK,TS val 2671424943 ecr 0,nop,wscale 7], length 0 >> 19:13:49.424056 IP nfsv4-new3.home.rick.nfsd > = nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, = options [nop,nop,TS val 2073856161 ecr 2671204825], length 0 >> *** This "battle" goes on for 223sec... >> I snipped out 13 cycles of this "Linux sends an RST, followed by = SYN" >> "FreeBSD replies with same old ACK". In another test run I saw this >> cycle continue non-stop for several minutes. This time, the Linux >> client paused for a while (see ARPs below). >>=20 >> 19:13:49.424101 IP nfsv4-linux.home.rick.apex-mesh > = nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0 >> 19:13:53.455867 IP nfsv4-linux.home.rick.apex-mesh > = nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options = [mss 1460,sackOK,TS val 2671428975 ecr 0,nop,wscale 7], length 0 >> 19:13:53.455991 IP nfsv4-new3.home.rick.nfsd > = nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, = options [nop,nop,TS val 2073860193 ecr 2671204825], length 0 >> *** Snipped a bunch of stuff out, mostly ARPs, plus one more RST. >>=20 >> 19:16:57.775780 ARP, Request who-has nfsv4-new3.home.rick tell = nfsv4-linux.home.rick, length 28 >> 19:16:57.775937 ARP, Reply nfsv4-new3.home.rick is-at = d4:be:d9:07:81:72 (oui Unknown), length 46 >> 19:16:57.980240 ARP, Request who-has nfsv4-new3.home.rick tell = 192.168.1.254, length 46 >> 19:16:58.555663 ARP, Request who-has nfsv4-new3.home.rick tell = 192.168.1.254, length 46 >> 19:17:00.104701 IP nfsv4-new3.home.rick.nfsd > = nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win = 29128, options [nop,nop,TS val 2074046846 ecr 2671204825], length 0 >> 19:17:15.664354 IP nfsv4-new3.home.rick.nfsd > = nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win = 29128, options [nop,nop,TS val 2074062406 ecr 2671204825], length 0 >> 19:17:31.239246 IP nfsv4-new3.home.rick.nfsd > = nfsv4-linux.home.rick.apex-mesh: Flags [R.], seq 202750, ack 212293, win = 0, options [nop,nop,TS val 2074077981 ecr 2671204825], length 0 >> *** FreeBSD finally acknowledges the RST 38sec after Linux sent the = last >> of 13 (100+ for another test run). 
>>
>> 19:17:51.535979 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 4247692373, win 64240, options [mss 1460,sackOK,TS val 2671667055 ecr 0,nop,wscale 7], length 0
>> 19:17:51.536130 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [S.], seq 661237469, ack 4247692374, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2074098278 ecr 2671667055], length 0
>> *** Now back in business...
>>
>> 19:17:51.536218 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 1, win 502, options [nop,nop,TS val 2671667055 ecr 2074098278], length 0
>> 19:17:51.536295 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 1:233, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>> 19:17:51.536346 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 233:505, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 272: NFS request xid 697039765 132 getattr fh 0,1/53
>> 19:17:51.536515 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 0
>> 19:17:51.536553 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 505:641, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098279], length 136: NFS request xid 730594197 132 getattr fh 0,1/53
>> 19:17:51.536562 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 1:49, ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 48: NFS reply xid 697039765 reply ok 44 getattr ERROR: unk 10063
>>
>> This error 10063 after the partition heals is also "bad news". It indicates the Session
>> (which is supposed to maintain "exactly once" RPC semantics) is broken. I'll admit I
>> suspect a Linux client bug, but will be investigating further.
>>
>> So, hopefully TCP conversant folk can confirm if the above is correct behaviour
>> or if the RST should be ack'd sooner?
>>
>> I could also see this becoming a "forever" TCP battle for other versions of the Linux client.
>>
>> rick
>>
>>
>> ________________________________________
>> From: Scheffenegger, Richard
>> Sent: Sunday, April 4, 2021 7:50 AM
>> To: Rick Macklem; tuexen@freebsd.org
>> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
>> Subject: Re: NFS Mount Hangs
>>
>> CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp@uoguelph.ca
>>
>> For what it's worth, SUSE found two bugs in the Linux nf_conntrack (stateful firewall) and the pfifo-fast scheduler, which could conspire to make TCP sessions hang forever.
>>
>> One is a missed update when the client is not using the noresvport mount option, which makes the firewall think RSTs are illegal (and drop them);
>>
>> The fast scheduler can run into an issue if only a single packet should be forwarded (note that this is not the default scheduler, but often recommended for perf, as it runs lockless and at lower CPU cost than pfq (the default)). If no other/additional packet pushes out that last packet of a flow, it can become stuck forever...
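To make Richard's scheduler point concrete, here is a minimal sketch of checking for, and moving off, pfifo-fast on a Linux client. The interface name eth0 is a placeholder, and fq_codel as the replacement default is an assumption about the distribution:

    # Show the root queueing discipline; pfifo_fast is the one Richard flags.
    tc qdisc show dev eth0
    # Replace it with fq_codel (the usual modern default) so a lone final
    # packet is not left queued behind a stalled band.
    tc qdisc replace dev eth0 root fq_codel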
>>
>> I can try getting the relevant bug info next week...
>>
>> ________________________________
>> From: owner-freebsd-net@freebsd.org on behalf of Rick Macklem
>> Sent: Friday, April 2, 2021 11:31:01 PM
>> To: tuexen@freebsd.org
>> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
>> Subject: Re: NFS Mount Hangs
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>> tuexen@freebsd.org wrote:
>>>> On 2. Apr 2021, at 02:07, Rick Macklem wrote:
>>>>
>>>> I hope you don't mind a top post...
>>>> I've been testing network partitioning between the only Linux client
>>>> I have (5.2 kernel) and a FreeBSD server with the xprtdied.patch
>>>> (does soshutdown(..SHUT_WR) when it knows the socket is broken)
>>>> applied to it.
>>>>
>>>> I'm not enough of a TCP guy to know if this is useful, but here's what
>>>> I see...
>>>>
>>>> While partitioned:
>>>> On the FreeBSD server end, the socket either goes to CLOSED during
>>>> the network partition or stays ESTABLISHED.
>>> If it goes to CLOSED you called shutdown(, SHUT_WR) and the peer also
>>> sent a FIN, but you never called close() on the socket.
>>> If the socket stays in ESTABLISHED, there is no communication ongoing,
>>> I guess, and therefore the server does not even detect that the peer
>>> is not reachable.
>>>> On the Linux end, the socket seems to remain ESTABLISHED for a
>>>> little while, and then disappears.
>>> So how does Linux detect the peer is not reachable?
>> Well, here's what I see in a packet capture in the Linux client once
>> I partition it (just unplug the net cable):
>> - lots of retransmits of the same segment (with ACK) for 54 sec
>> - then only ARP queries
>>
>> Once I plug the net cable back in:
>> - ARP works
>> - one more retransmit of the same segment
>> - receives RST from FreeBSD
>> ** So, is this now a "new" TCP connection, despite
>> using the same port#?
>> --> It matters for NFS, since "new connection"
>> implies "must retry all outstanding RPCs".
>> - sends SYN
>> - receives SYN, ACK from FreeBSD
>> --> connection starts working again
>> Always uses same port#.
>>
>> On the FreeBSD server end:
>> - receives the last retransmit of the segment (with ACK)
>> - sends RST
>> - receives SYN
>> - sends SYN, ACK
>>
>> I thought that there was no RST in the capture I looked at
>> yesterday, so I'm not sure if FreeBSD always sends an RST,
>> but the Linux client behaviour was the same. (Sent a SYN, etc.)
>> The socket disappears from the Linux "netstat -a" and I
>> suspect that happens after about 54 sec, but I am not sure
>> about the timing.
>>
>>>>
>>>> After unpartitioning:
>>>> On the FreeBSD server end, you get another socket showing up at
>>>> the same port#
>>>> Active Internet connections (including servers)
>>>> Proto Recv-Q Send-Q Local Address          Foreign Address        (state)
>>>> tcp4       0      0 nfsv4-new3.nfsd        nfsv4-linux.678        ESTABLISHED
>>>> tcp4       0      0 nfsv4-new3.nfsd        nfsv4-linux.678        CLOSED
>>>>
>>>> The Linux client shows the same connection ESTABLISHED.
>> But it disappears from "netstat -a" for a while during the partitioning.
>>
>>>> (The mount sometimes reports an error. I haven't looked at packet
>>>> traces to see if it retries RPCs or why the errors occur.)
>> I have now done so, as above.
>>
>>>> --> However I never get hangs.
>>>> Sometimes it goes to SYN_SENT for a while and the FreeBSD server
>>>> shows FIN_WAIT_1, but then both ends go to ESTABLISHED and the
>>>> mount starts working again.
>>>>
>>>> The most obvious thing is that the Linux client always keeps using
>>>> the same port#. (The FreeBSD client will use a different port# when
>>>> it does a TCP reconnect after no response from the NFS server for
>>>> a little while.)
>>>>
>>>> What do those TCP conversant folk think?
>>> I guess you are never calling close() on the socket for which
>>> the connection state is CLOSED.
>> Ok, that makes sense. For this case the Linux client has not done a
>> BindConnectionToSession to re-assign the back channel.
>> I'll have to bug them about this. However, I'll bet they'll answer
>> that I have to tell them the back channel needs re-assignment
>> or something like that.
>>
>> I am pretty certain they are broken, in that the client needs to
>> retry all outstanding RPCs.
>>
>> For others, here's the long-winded version of this that I just
>> put on the phabricator review:
>> In the server side kernel RPC, the socket (struct socket *) is in a
>> structure called SVCXPRT (normally pointed to by "xprt").
>> These structures are ref counted and the soclose() is done
>> when the ref. cnt goes to zero. My understanding is that
>> "struct socket *" is free'd by soclose() so this cannot be done
>> before the xprt ref. cnt goes to zero.
>>
>> For NFSv4.1/4.2 there is something called a back channel
>> which means that an "xprt" is used for server->client RPCs,
>> although the TCP connection is established by the client
>> to the server.
>> --> This back channel holds a ref cnt on "xprt" until the
>> client re-assigns it to a different TCP connection
>> via an operation called BindConnectionToSession,
>> and the Linux client is not doing this soon enough,
>> it appears.
>>
>> So, the soclose() is delayed, which is why I think the
>> TCP connection gets stuck in CLOSE_WAIT and that is
>> why I've added the soshutdown(..SHUT_WR) calls,
>> which can happen before the client gets around to
>> re-assigning the back channel.
>>
>> Thanks for your help with this, Michael, rick
>>
>> Best regards
>> Michael
>>>
>>> rick
>>> ps: I can capture packets while doing this, if anyone has a use
>>> for them.
>>>
>>> ________________________________________
>>> From: owner-freebsd-net@freebsd.org on behalf of Youssef GHORBAL
>>> Sent: Saturday, March 27, 2021 6:57 PM
>>> To: Jason Breitman
>>> Cc: Rick Macklem; freebsd-net@freebsd.org
>>> Subject: Re: NFS Mount Hangs
>>>
>>> CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp@uoguelph.ca
>>>
>>> On 27 Mar 2021, at 13:20, Jason Breitman wrote:
>>>
>>> The issue happened again so we can say that disabling TSO and LRO on the NIC did not resolve this issue.
>>> # ifconfig lagg0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso
>>> # ifconfig lagg0
>>> lagg0: flags=8943 metric 0 mtu 1500
>>> options=8100b8
>>>
>>> We can also say that the sysctl settings did not resolve this issue.
>>>
>>> # sysctl net.inet.tcp.fast_finwait2_recycle=1
>>> net.inet.tcp.fast_finwait2_recycle: 0 -> 1
>>>
>>> # sysctl net.inet.tcp.finwait2_timeout=1000
>>> net.inet.tcp.finwait2_timeout: 60000 -> 1000
>>>
>>> I don't think those will do anything in your case, since the FIN_WAIT2 sockets are on the client side and those sysctls are for BSD.
>>> By the way, it seems that Linux automatically recycles TCP sessions in FIN_WAIT2 after 60 seconds (sysctl net.ipv4.tcp_fin_timeout):
>>>
>>> tcp_fin_timeout (integer; default: 60; since Linux 2.2)
>>>     This specifies how many seconds to wait for a final FIN
>>>     packet before the socket is forcibly closed. This is
>>>     strictly a violation of the TCP specification, but
>>>     required to prevent denial-of-service attacks. In Linux
>>>     2.2, the default value was 180.
>>>
>>> So I don't get why it gets stuck in the FIN_WAIT2 state anyway.
>>>
>>> You really need to have a packet capture during the outage (client and server side) so you'll get the over-the-wire chat and can start speculating from there.
>>> No need to capture the beginning of the outage for now. All you have to do is run a tcpdump for 10 minutes or so when you notice a client stuck.
>>>
>>> * I have not rebooted the NFS Server nor have I restarted nfsd, but do not believe that is required, as these settings are at the TCP level and I would expect new sessions to use the updated settings.
>>>
>>> The issue occurred after 5 days following a reboot of the client machines.
>>> I ran the capture information again to make use of the situation.
>>>
>>> #!/bin/sh
>>>
>>> while true
>>> do
>>> /bin/date >> /tmp/nfs-hang.log
>>> /bin/ps axHl | grep nfsd | grep -v grep >> /tmp/nfs-hang.log
>>> /usr/bin/procstat -kk 2947 >> /tmp/nfs-hang.log
>>> /usr/bin/procstat -kk 2944 >> /tmp/nfs-hang.log
>>> /bin/sleep 60
>>> done
>>>
>>> On the NFS Server
>>> Active Internet connections (including servers)
>>> Proto Recv-Q Send-Q Local Address          Foreign Address        (state)
>>> tcp4       0      0 NFS.Server.IP.X.2049   NFS.Client.IP.X.48286  CLOSE_WAIT
>>>
>>> On the NFS Client
>>> tcp        0      0 NFS.Client.IP.X:48286  NFS.Server.IP.X:2049   FIN_WAIT2
>>>
>>> You had also asked for the output below.
>>>
>>> # nfsstat -E -s
>>> BackChannelCt  BindConnToSes
>>>             0              0
>>>
>>> # sysctl vfs.nfsd.request_space_throttle_count
>>> vfs.nfsd.request_space_throttle_count: 0
>>>
>>> I see that you are testing a patch and I look forward to seeing the results.
>>>
>>> Jason Breitman
>>>
>>> On Mar 21, 2021, at 6:21 PM, Rick Macklem wrote:
>>>
>>> Youssef GHORBAL wrote:
>>>> Hi Jason,
>>>>
>>>>> On 17 Mar 2021, at 18:17, Jason Breitman wrote:
>>>>>
>>>>> Please review the details below and let me know if there is a setting that I should apply to my FreeBSD NFS Server or if there is a bug fix that I can apply to resolve my issue.
>>>>> I shared this information with the linux-nfs mailing list and they believe the issue is on the server side.
>>>>>
>>>>> Issue
>>>>> NFSv4 mounts periodically hang on the NFS Client.
>>>>>
>>>>> During this time, it is possible to manually mount from another NFS Server on the NFS Client having issues.
>>>>> Also, other NFS Clients are successfully mounting from the NFS Server in question.
>>>>> Rebooting the NFS Client appears to be the only solution.
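As a concrete form of the capture Youssef suggests above, here is a minimal sketch. The interface em0 is a placeholder, the client address uses the same placeholder notation as the output above, and port 2049 assumes NFS on the standard port:

    # Capture 10 minutes of NFS traffic once a client is noticed stuck;
    # -G 600 rotates after 600 seconds and -W 1 exits after one file.
    tcpdump -i em0 -s 0 -G 600 -W 1 -w /tmp/nfs-outage.pcap \
        host NFS.Client.IP.X and port 2049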
>>>>
>>>> I had experienced a similar weird situation with periodically stuck Linux NFS clients mounting Isilon NFS servers (Isilon is FreeBSD based but they seem to have their own nfsd).
>>> Yes, my understanding is that Isilon uses a proprietary user space nfsd and
>>> not the kernel based RPC and nfsd in FreeBSD.
>>>
>>>> We had better luck and we did manage to get packet captures on both sides during the issue. The gist of it goes as follows:
>>>>
>>>> - Data flows correctly between SERVER and the CLIENT
>>>> - At some point SERVER starts decreasing its TCP Receive Window until it reaches 0
>>>> - The client (eager to send data) can only ack data sent by SERVER.
>>>> - When SERVER was done sending data, the client starts sending TCP Window Probes hoping that the TCP Window opens again so it can flush its buffers.
>>>> - SERVER responds with a TCP Zero Window to those probes.
>>> Having the window size drop to zero is not necessarily incorrect.
>>> If the server is overloaded (has a backlog of NFS requests), it can stop doing
>>> soreceive() on the socket (so the socket rcv buffer can fill up and the TCP window
>>> closes). This results in "backpressure" to stop the NFS client from flooding the
>>> NFS server with requests.
>>> --> However, once the backlog is handled, the nfsd should start to soreceive()
>>> again and this should cause the window to open back up.
>>> --> Maybe this is broken in the socket/TCP code. I quickly got lost in
>>> tcp_output() when it decides what to do about the rcvwin.
>>>
>>>> - After 6 minutes (the NFS server default idle timeout) SERVER gracefully closes the TCP connection, sending a FIN Packet (and still a TCP Window of 0)
>>> This probably does not happen for Jason's case, since the 6-minute timeout
>>> is disabled when the TCP connection is assigned as a backchannel (most likely
>>> the case for NFSv4.1).
>>>
>>>> - CLIENT ACKs that FIN.
>>>> - SERVER goes into FIN_WAIT_2 state
>>>> - CLIENT closes its half of the socket and goes into LAST_ACK state.
>>>> - FIN is never sent by the client since there is still data in its SendQ and the receiver's TCP Window is still 0. At this stage the client starts sending TCP Window Probes again and again, hoping that the server opens its TCP Window so it can flush its buffers and terminate its side of the socket.
>>>> - SERVER keeps responding with a TCP Zero Window to those probes.
>>>> => The last two steps go on and on for hours/days, freezing the NFS mount bound to that TCP session.
>>>>
>>>> If we had a situation where CLIENT was responsible for closing the TCP Window (and initiating the TCP FIN first) and the server wanted to send data, we'll end up in the same state as you, I think.
>>>>
>>>> We've never had the root cause of why the SERVER decided to close the TCP Window and no longer accept data; the fix on the Isilon part was to recycle the FIN_WAIT_2 sockets more aggressively (net.inet.tcp.fast_finwait2_recycle=1 & net.inet.tcp.finwait2_timeout=5000).
>>>> Once the socket is recycled, at the next occurrence of a CLIENT TCP Window probe SERVER sends a RST, triggering the teardown of the session on the client side, a new TCP handshake, etc., and traffic flows again (NFS starts responding).
>>>>
>>>> To avoid rebooting the client (and before the aggressive FIN_WAIT_2 recycling was implemented on the Isilon side) we've added a check script on the client that detects LAST_ACK sockets on the client and through an iptables rule enforces a TCP RST. Something like: -A OUTPUT -p tcp -d $nfs_server_addr --sport $local_port -j REJECT --reject-with tcp-reset (the script removes this iptables rule as soon as the LAST_ACK disappears).
>>>>
>>>> The bottom line would be to have a packet capture during the outage (client and/or server side); it will show you at least the shape of the TCP exchange when NFS is stuck.
>>> Interesting story and good work w.r.t. sleuthing, Youssef, thanks.
>>>
>>> I looked at Jason's log and it shows everything is ok w.r.t. the nfsd threads.
>>> (They're just waiting for RPC requests.)
>>> However, I do now think I know why the soclose() does not happen.
>>> When the TCP connection is assigned as a backchannel, that takes a reference
>>> cnt on the structure. This refcnt won't be released until the connection is
>>> replaced by a BindConnectionToSession operation from the client. But that won't
>>> happen until the client creates a new TCP connection.
>>> --> No refcnt release --> no refcnt of 0 --> no soclose().
>>>
>>> I've created the attached patch (completely different from the previous one)
>>> that adds soshutdown(SHUT_WR) calls in the three places where the TCP
>>> connection is going away. This seems to get it past CLOSE_WAIT without a
>>> soclose().
>>> --> I know you are not comfortable with patching your server, but I do think
>>> this change will get the socket shutdown to complete.
>>>
>>> There are a couple more things you can check on the server...
>>> # nfsstat -E -s
>>> --> Look for the count under "BindConnToSes".
>>> --> If non-zero, backchannels have been assigned.
>>> # sysctl -a | fgrep request_space_throttle_count
>>> --> If non-zero, the server has been overloaded at some point.
>>>
>>> I think the attached patch might work around the problem.
>>> The code that should open up the receive window needs to be checked.
>>> I am also looking at enabling the 6-minute timeout when a backchannel is
>>> assigned.
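For reference, a rough sketch of the client-side check script Youssef describes, built around exactly the iptables rule he quotes. The server address is a placeholder and Linux netstat output format is assumed:

    #!/bin/sh
    # Rough sketch of the LAST_ACK watchdog described above.  While a
    # socket to the NFS server sits in LAST_ACK, install the quoted
    # iptables rule, which rejects that connection's outgoing packets
    # with a TCP RST so the stuck socket gets torn down; remove the rule
    # again once the LAST_ACK socket is gone.
    nfs_server_addr=192.0.2.1    # placeholder; set to the real server IP
    rule_port=""
    while true
    do
        port=$(netstat -tn | awk -v srv="$nfs_server_addr:2049" \
            '$6 == "LAST_ACK" && $5 == srv { split($4, a, ":"); print a[2]; exit }')
        if [ -n "$port" ] && [ -z "$rule_port" ]
        then
            iptables -A OUTPUT -p tcp -d "$nfs_server_addr" \
                --sport "$port" -j REJECT --reject-with tcp-reset
            rule_port=$port
        elif [ -z "$port" ] && [ -n "$rule_port" ]
        then
            iptables -D OUTPUT -p tcp -d "$nfs_server_addr" \
                --sport "$rule_port" -j REJECT --reject-with tcp-reset
            rule_port=""
        fi
        sleep 60
    done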
>>>
>>> rick
>>>
>>> Youssef
>>>
>>> _______________________________________________
>>> freebsd-net@freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@freebsd.org Sun Apr 4 22:28:25 2021
From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 238324] Add XG-C100C/AQtion AQC107 10GbE NIC driver
Date: Sun, 04 Apr 2021 22:28:24 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=238324

Dutchman01 changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dutchman01@quicknet.nl

--- Comment #16 from Dutchman01 ---
The driver has also been in use, stable, in XigmaNAS for some months.
Support was also added for the AQC100:

https://sourceforge.net/p/xigmanas/code/HEAD/tree/trunk/build/kernel-patches/aquantia-atlantic-kmod/files/patch-aqtion__aq_main.c

--
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-net@freebsd.org Sun Apr 4 23:12:53 2021
From: Rick Macklem
To: "tuexen@freebsd.org"
CC: "Scheffenegger, Richard", Youssef GHORBAL, "freebsd-net@freebsd.org"
Subject: Re: NFS Mount Hangs
Date: Sun, 4 Apr 2021 23:12:49 +0000
tuexen@freebsd.org wrote:
>> On 4. Apr 2021, at 22:28, Rick Macklem wrote:
>>
>> Oops, yes the packet capture is on freefall (forgot to mention that;-).
>> You should be able to:
>> % fetch https://people.freebsd.org/~rmacklem/linuxtofreenfs.pcap
>>
>> Some useful packet #s are:
>> 1949 - partitioning starts
>> 2005 - partition healed
>> 2060 - last RST
>> 2067 - SYN -> gets going again
>>
>> This was taken at the Linux end. I have the FreeBSD end too, although I
>> don't think it tells you anything more.
> Hi Rick,
>
> I would like to look at the FreeBSD side, too.
fetch https://people.freebsd.org/~rmacklem/freetolinuxnfs.pcap

> Do you also know what
> the state of the TCP connection was when the SYN / ACK / RST game was
> going on?
Just ESTABLISHED when the battle goes on.
And it happens when the Send-Q is 0.
(If the Send-Q is not empty, it finds its way to CLOSED.)

If I wait long enough before healing the partition, it will
go to FIN_WAIT_1, and then if I plug it back in, it does not
do battle (at least not for long).

Btw, I have one running now that seems stuck really good.
It has been 20 minutes since I plugged the net cable back in.
(Unfortunately, I didn't have tcpdump running until after
I saw it was not progressing after healing.)
--> There is one difference.
There was a 6-minute timeout
enabled on the server krpc for "no activity", which is
now disabled like it is for NFSv4.1 in freebsd-current.
I had forgotten to re-disable it.
So, when it does battle, it might have been the 6-minute
timeout, which would then do the soshutdown(..SHUT_WR)
which kept it from getting "stuck" forever.
--> This time I had to reboot the FreeBSD NFS server to
get the Linux client unstuck, so this one looked a lot
like what has been reported.
The pcap for this one, started after the network was plugged
back in and I noticed it was stuck for quite a while, is here:
fetch https://people.freebsd.org/~rmacklem/stuck.pcap

In it, there is just a bunch of RSTs followed by SYNs sent
from client->FreeBSD, and FreeBSD just keeps sending
acks for the old segment back.
--> It looks like FreeBSD did the "RST, ACK" after the
krpc did a soshutdown(..SHUT_WR) on the socket,
for the one you've been looking at.
I'll test some more...

> I would like to understand why the reestablishment of the connection
> did not work...
It is looking like it takes either a non-empty send-q or a
soshutdown(..SHUT_WR) to get the FreeBSD socket
out of ESTABLISHED, where it just ignores the RSTs and
SYN packets.

Thanks for looking at it, rick
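Since the Send-Q value turns out to matter here, a trivial way to watch the nfsd connection's state and queue depths on the FreeBSD server while reproducing (a sketch; port 2049 assumes nfsd on the standard port):

    # Print the nfsd TCP connection's state, Recv-Q and Send-Q once a second.
    while true; do netstat -an -p tcp | grep '\.2049 '; sleep 1; done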
I would like to take a look at it...=0A= >=0A= > Best regards=0A= > Michael=0A= >>=0A= >> Here's a tcpdump snippet of the interesting part (see the *** comments):= =0A= >> 19:10:09.305775 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ape= x-mesh: Flags [P.], seq 202585:202749, ack 212293, win 29128, options [nop,= nop,TS val 2073636037 ecr 2671204825], length 164: NFS reply xid 613153685 = reply ok 160 getattr NON 4 ids 0/33554432 sz 0=0A= >> 19:10:09.305850 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ric= k.nfsd: Flags [.], ack 202749, win 501, options [nop,nop,TS val 2671204825 = ecr 2073636037], length 0=0A= >> *** Network is now partitioned...=0A= >>=0A= >> 19:10:09.407840 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ric= k.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,no= p,TS val 2671204927 ecr 2073636037], length 232: NFS request xid 629930901 = 228 getattr fh 0,1/53=0A= >> 19:10:09.615779 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ric= k.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,no= p,TS val 2671205135 ecr 2073636037], length 232: NFS request xid 629930901 = 228 getattr fh 0,1/53=0A= >> 19:10:09.823780 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ric= k.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,no= p,TS val 2671205343 ecr 2073636037], length 232: NFS request xid 629930901 = 228 getattr fh 0,1/53=0A= >> *** Lots of lines snipped.=0A= >>=0A= >>=0A= >> 19:13:41.295783 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-lin= ux.home.rick, length 28=0A= >> 19:13:42.319767 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-lin= ux.home.rick, length 28=0A= >> 19:13:46.351966 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-lin= ux.home.rick, length 28=0A= >> 19:13:47.375790 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-lin= ux.home.rick, length 28=0A= >> 19:13:48.399786 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-lin= ux.home.rick, length 28=0A= >> *** Network is now unpartitioned...=0A= >>=0A= >> 19:13:48.399990 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 = (oui Unknown), length 46=0A= >> 19:13:48.400002 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ric= k.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS va= l 2671421871 ecr 0,nop,wscale 7], length 0=0A= >> 19:13:48.400185 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ape= x-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 207385513= 7 ecr 2671204825], length 0=0A= >> 19:13:48.400273 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ric= k.nfsd: Flags [R], seq 964161458, win 0, length 0=0A= >> 19:13:49.423833 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ric= k.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS va= l 2671424943 ecr 0,nop,wscale 7], length 0=0A= >> 19:13:49.424056 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ape= x-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 207385616= 1 ecr 2671204825], length 0=0A= >> *** This "battle" goes on for 223sec...=0A= >> I snipped out 13 cycles of this "Linux sends an RST, followed by SYN"= =0A= >> "FreeBSD replies with same old ACK". In another test run I saw this=0A= >> cycle continue non-stop for several minutes. 
This time, the Linux=0A= >> client paused for a while (see ARPs below).=0A= >>=0A= >> 19:13:49.424101 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ric= k.nfsd: Flags [R], seq 964161458, win 0, length 0=0A= >> 19:13:53.455867 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ric= k.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS va= l 2671428975 ecr 0,nop,wscale 7], length 0=0A= >> 19:13:53.455991 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ape= x-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 207386019= 3 ecr 2671204825], length 0=0A= >> *** Snipped a bunch of stuff out, mostly ARPs, plus one more RST.=0A= >>=0A= >> 19:16:57.775780 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-lin= ux.home.rick, length 28=0A= >> 19:16:57.775937 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 = (oui Unknown), length 46=0A= >> 19:16:57.980240 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1= .254, length 46=0A= >> 19:16:58.555663 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1= .254, length 46=0A= >> 19:17:00.104701 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ape= x-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS = val 2074046846 ecr 2671204825], length 0=0A= >> 19:17:15.664354 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ape= x-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS = val 2074062406 ecr 2671204825], length 0=0A= >> 19:17:31.239246 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ape= x-mesh: Flags [R.], seq 202750, ack 212293, win 0, options [nop,nop,TS val = 2074077981 ecr 2671204825], length 0=0A= >> *** FreeBSD finally acknowledges the RST 38sec after Linux sent the last= =0A= >> of 13 (100+ for another test run).=0A= >>=0A= >> 19:17:51.535979 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ric= k.nfsd: Flags [S], seq 4247692373, win 64240, options [mss 1460,sackOK,TS v= al 2671667055 ecr 0,nop,wscale 7], length 0=0A= >> 19:17:51.536130 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ape= x-mesh: Flags [S.], seq 661237469, ack 4247692374, win 65535, options [mss = 1460,nop,wscale 6,sackOK,TS val 2074098278 ecr 2671667055], length 0=0A= >> *** Now back in business...=0A= >>=0A= >> 19:17:51.536218 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ric= k.nfsd: Flags [.], ack 1, win 502, options [nop,nop,TS val 2671667055 ecr 2= 074098278], length 0=0A= >> 19:17:51.536295 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ric= k.nfsd: Flags [P.], seq 1:233, ack 1, win 502, options [nop,nop,TS val 2671= 667056 ecr 2074098278], length 232: NFS request xid 629930901 228 getattr f= h 0,1/53=0A= >> 19:17:51.536346 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ric= k.nfsd: Flags [P.], seq 233:505, ack 1, win 502, options [nop,nop,TS val 26= 71667056 ecr 2074098278], length 272: NFS request xid 697039765 132 getattr= fh 0,1/53=0A= >> 19:17:51.536515 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ape= x-mesh: Flags [.], ack 505, win 29128, options [nop,nop,TS val 2074098279 e= cr 2671667056], length 0=0A= >> 19:17:51.536553 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ric= k.nfsd: Flags [P.], seq 505:641, ack 1, win 502, options [nop,nop,TS val 26= 71667056 ecr 2074098279], length 136: NFS request xid 730594197 132 getattr= fh 0,1/53=0A= >> 19:17:51.536562 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ape= x-mesh: Flags [P.], seq 1:49, ack 505, win 29128, options [nop,nop,TS val 2= 074098279 ecr 
2671667056], length 48: NFS reply xid 697039765 reply ok 44 g= etattr ERROR: unk 10063=0A= >>=0A= >> This error 10063 after the partition heals is also "bad news". It indica= tes the Session=0A= >> (which is supposed to maintain "exactly once" RPC semantics is broken). = I'll admit I=0A= >> suspect a Linux client bug, but will be investigating further.=0A= >>=0A= >> So, hopefully TCP conversant folk can confirm if the above is correct be= haviour=0A= >> or if the RST should be ack'd sooner?=0A= >>=0A= >> I could also see this becoming a "forever" TCP battle for other versions= of Linux client.=0A= >>=0A= >> rick=0A= >>=0A= >>=0A= >> ________________________________________=0A= >> From: Scheffenegger, Richard =0A= >> Sent: Sunday, April 4, 2021 7:50 AM=0A= >> To: Rick Macklem; tuexen@freebsd.org=0A= >> Cc: Youssef GHORBAL; freebsd-net@freebsd.org=0A= >> Subject: Re: NFS Mount Hangs=0A= >>=0A= >> CAUTION: This email originated from outside of the University of Guelph.= Do not click links or open attachments unless you recognize the sender and= know the content is safe. If in doubt, forward suspicious emails to IThelp= @uoguelph.ca=0A= >>=0A= >>=0A= >> For what it=91s worth, suse found two bugs in the linux nfconntrack (sta= teful firewall), and pfifo-fast scheduler, which could conspire to make tcp= sessions hang forever.=0A= >>=0A= >> One is a missed updaten when the c=F6ient is not using the noresvport mo= int option, which makes tje firewall think rsts are illegal (and drop them)= ;=0A= >>=0A= >> The fast scheduler can run into an issue if only a single packet should = be forwarded (note that this is not the default scheduler, but often recomm= ended for perf, as it runs lockless and lower cpu cost that pfq (default). = If no other/additional packet pushes out that last packet of a flow, it can= become stuck forever...=0A= >>=0A= >> I can try getting the relevant bug info next week...=0A= >>=0A= >> ________________________________=0A= >> Von: owner-freebsd-net@freebsd.org im Au= ftrag von Rick Macklem =0A= >> Gesendet: Friday, April 2, 2021 11:31:01 PM=0A= >> An: tuexen@freebsd.org =0A= >> Cc: Youssef GHORBAL ; freebsd-net@freebsd.or= g =0A= >> Betreff: Re: NFS Mount Hangs=0A= >>=0A= >> NetApp Security WARNING: This is an external email. Do not click links o= r open attachments unless you recognize the sender and know the content is = safe.=0A= >>=0A= >>=0A= >>=0A= >>=0A= >> tuexen@freebsd.org wrote:=0A= >>>> On 2. 
Apr 2021, at 02:07, Rick Macklem wrote:= =0A= >>>>=0A= >>>> I hope you don't mind a top post...=0A= >>>> I've been testing network partitioning between the only Linux client= =0A= >>>> I have (5.2 kernel) and a FreeBSD server with the xprtdied.patch=0A= >>>> (does soshutdown(..SHUT_WR) when it knows the socket is broken)=0A= >>>> applied to it.=0A= >>>>=0A= >>>> I'm not enough of a TCP guy to know if this is useful, but here's what= =0A= >>>> I see...=0A= >>>>=0A= >>>> While partitioned:=0A= >>>> On the FreeBSD server end, the socket either goes to CLOSED during=0A= >>>> the network partition or stays ESTABLISHED.=0A= >>> If it goes to CLOSED you called shutdown(, SHUT_WR) and the peer also= =0A= >>> sent a FIN, but you never called close() on the socket.=0A= >>> If the socket stays in ESTABLISHED, there is no communication ongoing,= =0A= >>> I guess, and therefore the server does not even detect that the peer=0A= >>> is not reachable.=0A= >>>> On the Linux end, the socket seems to remain ESTABLISHED for a=0A= >>>> little while, and then disappears.=0A= >>> So how does Linux detect the peer is not reachable?=0A= >> Well, here's what I see in a packet capture in the Linux client once=0A= >> I partition it (just unplug the net cable):=0A= >> - lots of retransmits of the same segment (with ACK) for 54sec=0A= >> - then only ARP queries=0A= >>=0A= >> Once I plug the net cable back in:=0A= >> - ARP works=0A= >> - one more retransmit of the same segement=0A= >> - receives RST from FreeBSD=0A= >> ** So, is this now a "new" TCP connection, despite=0A= >> using the same port#.=0A= >> --> It matters for NFS, since "new connection"=0A= >> implies "must retry all outstanding RPCs".=0A= >> - sends SYN=0A= >> - receives SYN, ACK from FreeBSD=0A= >> --> connection starts working again=0A= >> Always uses same port#.=0A= >>=0A= >> On the FreeBSD server end:=0A= >> - receives the last retransmit of the segment (with ACK)=0A= >> - sends RST=0A= >> - receives SYN=0A= >> - sends SYN, ACK=0A= >>=0A= >> I thought that there was no RST in the capture I looked at=0A= >> yesterday, so I'm not sure if FreeBSD always sends an RST,=0A= >> but the Linux client behaviour was the same. (Sent a SYN, etc).=0A= >> The socket disappears from the Linux "netstat -a" and I=0A= >> suspect that happens after about 54sec, but I am not sure=0A= >> about the timing.=0A= >>=0A= >>>>=0A= >>>> After unpartitioning:=0A= >>>> On the FreeBSD server end, you get another socket showing up at=0A= >>>> the same port#=0A= >>>> Active Internet connections (including servers)=0A= >>>> Proto Recv-Q Send-Q Local Address Foreign Address (sta= te)=0A= >>>> tcp4 0 0 nfsv4-new3.nfsd nfsv4-linux.678 ESTA= BLISHED=0A= >>>> tcp4 0 0 nfsv4-new3.nfsd nfsv4-linux.678 CLOS= ED=0A= >>>>=0A= >>>> The Linux client shows the same connection ESTABLISHED.=0A= >> But disappears from "netstat -a" for a while during the partitioning.=0A= >>=0A= >>>> (The mount sometimes reports an error. I haven't looked at packet=0A= >>>> traces to see if it retries RPCs or why the errors occur.)=0A= >> I have now done so, as above.=0A= >>=0A= >>>> --> However I never get hangs.=0A= >>>> Sometimes it goes to SYN_SENT for a while and the FreeBSD server=0A= >>>> shows FIN_WAIT_1, but then both ends go to ESTABLISHED and the=0A= >>>> mount starts working again.=0A= >>>>=0A= >>>> The most obvious thing is that the Linux client always keeps using=0A= >>>> the same port#. 
(The FreeBSD client will use a different port# when=0A= >>>> it does a TCP reconnect after no response from the NFS server for=0A= >>>> a little while.)=0A= >>>>=0A= >>>> What do those TCP conversant think?=0A= >>> I guess you are you are never calling close() on the socket, for with= =0A= >>> the connection state is CLOSED.=0A= >> Ok, that makes sense. For this case the Linux client has not done a=0A= >> BindConnectionToSession to re-assign the back channel.=0A= >> I'll have to bug them about this. However, I'll bet they'll answer=0A= >> that I have to tell them the back channel needs re-assignment=0A= >> or something like that.=0A= >>=0A= >> I am pretty certain they are broken, in that the client needs to=0A= >> retry all outstanding RPCs.=0A= >>=0A= >> For others, here's the long winded version of this that I just=0A= >> put on the phabricator review:=0A= >> In the server side kernel RPC, the socket (struct socket *) is in a=0A= >> structure called SVCXPRT (normally pointed to by "xprt").=0A= >> These structures a ref counted and the soclose() is done=0A= >> when the ref. cnt goes to zero. My understanding is that=0A= >> "struct socket *" is free'd by soclose() so this cannot be done=0A= >> before the xprt ref. cnt goes to zero.=0A= >>=0A= >> For NFSv4.1/4.2 there is something called a back channel=0A= >> which means that a "xprt" is used for server->client RPCs,=0A= >> although the TCP connection is established by the client=0A= >> to the server.=0A= >> --> This back channel holds a ref cnt on "xprt" until the=0A= >>=0A= >> client re-assigns it to a different TCP connection=0A= >> via an operation called BindConnectionToSession=0A= >> and the Linux client is not doing this soon enough,=0A= >> it appears.=0A= >>=0A= >> So, the soclose() is delayed, which is why I think the=0A= >> TCP connection gets stuck in CLOSE_WAIT and that is=0A= >> why I've added the soshutdown(..SHUT_WR) calls,=0A= >> which can happen before the client gets around to=0A= >> re-assigning the back channel.=0A= >>=0A= >> Thanks for your help with this Michael, rick=0A= >>=0A= >> Best regards=0A= >> Michael=0A= >>>=0A= >>> rick=0A= >>> ps: I can capture packets while doing this, if anyone has a use=0A= >>> for them.=0A= >>>=0A= >>>=0A= >>>=0A= >>>=0A= >>>=0A= >>>=0A= >>> ________________________________________=0A= >>> From: owner-freebsd-net@freebsd.org on = behalf of Youssef GHORBAL =0A= >>> Sent: Saturday, March 27, 2021 6:57 PM=0A= >>> To: Jason Breitman=0A= >>> Cc: Rick Macklem; freebsd-net@freebsd.org=0A= >>> Subject: Re: NFS Mount Hangs=0A= >>>=0A= >>> CAUTION: This email originated from outside of the University of Guelph= . Do not click links or open attachments unless you recognize the sender an= d know the content is safe. 
If in doubt, forward suspicious emails to IThel= p@uoguelph.ca=0A= >>>=0A= >>>=0A= >>>=0A= >>>=0A= >>> On 27 Mar 2021, at 13:20, Jason Breitman > wrote:=0A= >>>=0A= >>> The issue happened again so we can say that disabling TSO and LRO on th= e NIC did not resolve this issue.=0A= >>> # ifconfig lagg0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso= =0A= >>> # ifconfig lagg0=0A= >>> lagg0: flags=3D8943 met= ric 0 mtu 1500=0A= >>> options=3D8100b8=0A= >>>=0A= >>> We can also say that the sysctl settings did not resolve this issue.=0A= >>>=0A= >>> # sysctl net.inet.tcp.fast_finwait2_recycle=3D1=0A= >>> net.inet.tcp.fast_finwait2_recycle: 0 -> 1=0A= >>>=0A= >>> # sysctl net.inet.tcp.finwait2_timeout=3D1000=0A= >>> net.inet.tcp.finwait2_timeout: 60000 -> 1000=0A= >>>=0A= >>> I don=92t think those will do anything in your case since the FIN_WAIT2= are on the client side and those sysctls are for BSD.=0A= >>> By the way it seems that Linux recycles automatically TCP sessions in F= IN_WAIT2 after 60 seconds (sysctl net.ipv4.tcp_fin_timeout)=0A= >>>=0A= >>> tcp_fin_timeout (integer; default: 60; since Linux 2.2)=0A= >>> This specifies how many seconds to wait for a final FIN=0A= >>> packet before the socket is forcibly closed. This is=0A= >>> strictly a violation of the TCP specification, but=0A= >>> required to prevent denial-of-service attacks. In Linux=0A= >>> 2.2, the default value was 180.=0A= >>>=0A= >>> So I don=92t get why it stucks in the FIN_WAIT2 state anyway.=0A= >>>=0A= >>> You really need to have a packet capture during the outage (client and = server side) so you=92ll get over the wire chat and start speculating from = there.=0A= >>> No need to capture the beginning of the outage for now. All you have to= do, is run a tcpdump for 10 minutes or so when you notice a client stuck.= =0A= >>>=0A= >>> * I have not rebooted the NFS Server nor have I restarted nfsd, but do = not believe that is required as these settings are at the TCP level and I w= ould expect new sessions to use the updated settings.=0A= >>>=0A= >>> The issue occurred after 5 days following a reboot of the client machin= es.=0A= >>> I ran the capture information again to make use of the situation.=0A= >>>=0A= >>> #!/bin/sh=0A= >>>=0A= >>> while true=0A= >>> do=0A= >>> /bin/date >> /tmp/nfs-hang.log=0A= >>> /bin/ps axHl | grep nfsd | grep -v grep >> /tmp/nfs-hang.log=0A= >>> /usr/bin/procstat -kk 2947 >> /tmp/nfs-hang.log=0A= >>> /usr/bin/procstat -kk 2944 >> /tmp/nfs-hang.log=0A= >>> /bin/sleep 60=0A= >>> done=0A= >>>=0A= >>>=0A= >>> On the NFS Server=0A= >>> Active Internet connections (including servers)=0A= >>> Proto Recv-Q Send-Q Local Address Foreign Address (stat= e)=0A= >>> tcp4 0 0 NFS.Server.IP.X.2049 NFS.Client.IP.X.48286 = CLOSE_WAIT=0A= >>>=0A= >>> On the NFS Client=0A= >>> tcp 0 0 NFS.Client.IP.X:48286 NFS.Server.IP.X:2049 = FIN_WAIT2=0A= >>>=0A= >>>=0A= >>>=0A= >>> You had also asked for the output below.=0A= >>>=0A= >>> # nfsstat -E -s=0A= >>> BackChannelCtBindConnToSes=0A= >>> 0 0=0A= >>>=0A= >>> # sysctl vfs.nfsd.request_space_throttle_count=0A= >>> vfs.nfsd.request_space_throttle_count: 0=0A= >>>=0A= >>> I see that you are testing a patch and I look forward to seeing the res= ults.=0A= >>>=0A= >>>=0A= >>> Jason Breitman=0A= >>>=0A= >>>=0A= >>> On Mar 21, 2021, at 6:21 PM, Rick Macklem > wrote:=0A= >>>=0A= >>> Youssef GHORBAL > wrote:=0A= >>>> Hi Jason,=0A= >>>>=0A= >>>>> On 17 Mar 2021, at 18:17, Jason Breitman > wrote:=0A= >>>>>=0A= >>>>> Please review the details below and let me know if there is 
>>>
>>> On Mar 21, 2021, at 6:21 PM, Rick Macklem wrote:
>>>
>>> Youssef GHORBAL wrote:
>>>> Hi Jason,
>>>>
>>>>> On 17 Mar 2021, at 18:17, Jason Breitman wrote:
>>>>>
>>>>> Please review the details below and let me know if there is a setting that I should apply to my FreeBSD NFS Server, or if there is a bug fix that I can apply to resolve my issue.
>>>>> I shared this information with the linux-nfs mailing list and they believe the issue is on the server side.
>>>>>
>>>>> Issue
>>>>> NFSv4 mounts periodically hang on the NFS Client.
>>>>>
>>>>> During this time, it is possible to manually mount from another NFS Server on the NFS Client having issues.
>>>>> Also, other NFS Clients are successfully mounting from the NFS Server in question.
>>>>> Rebooting the NFS Client appears to be the only solution.
>>>>
>>>> I had experienced a similar weird situation with periodically stuck Linux NFS clients mounting Isilon NFS servers (Isilon is FreeBSD based, but they seem to have their own nfsd).
>>> Yes, my understanding is that Isilon uses a proprietary user-space nfsd and
>>> not the kernel-based RPC and nfsd in FreeBSD.
>>>
>>>> We've had better luck and we did manage to have packet captures on both sides during the issue. The gist of it goes as follows:
>>>>
>>>> - Data flows correctly between SERVER and the CLIENT
>>>> - At some point SERVER starts decreasing its TCP Receive Window until it reaches 0
>>>> - The client (eager to send data) can only ack data sent by SERVER.
>>>> - When SERVER was done sending data, the client starts sending TCP Window Probes hoping that the TCP Window opens again so it can flush its buffers.
>>>> - SERVER responds with a TCP Zero Window to those probes.
>>> Having the window size drop to zero is not necessarily incorrect.
>>> If the server is overloaded (has a backlog of NFS requests), it can stop doing
>>> soreceive() on the socket (so the socket rcv buffer can fill up and the TCP window
>>> closes). This results in "backpressure" to stop the NFS client from flooding the
>>> NFS server with requests.
>>> --> However, once the backlog is handled, the nfsd should start to soreceive()
>>>     again and this should cause the window to open back up.
>>> --> Maybe this is broken in the socket/TCP code. I quickly got lost in
>>>     tcp_output() when it decides what to do about the rcvwin.
>>>
>>>> - After 6 minutes (the NFS server default Idle timeout) SERVER gracefully closes the TCP connection, sending a FIN packet (and still a TCP Window of 0)
>>> This probably does not happen for Jason's case, since the 6-minute timeout
>>> is disabled when the TCP connection is assigned as a backchannel (most likely
>>> the case for NFSv4.1).
>>>
>>>> - CLIENT ACKs that FIN.
>>>> - SERVER goes into FIN_WAIT_2 state.
>>>> - CLIENT closes its half of the socket and goes into LAST_ACK state.
>>>> - FIN is never sent by the client, since there is still data in its SendQ and the receiver's TCP Window is still 0.
>>>>   At this stage the client starts sending TCP Window Probes again and again, hoping that the server opens its TCP Window so it can flush its buffers and terminate its side of the socket.
>>>> - SERVER keeps responding with a TCP Zero Window to those probes.
>>>> => The last two steps go on and on for hours/days, freezing the NFS mount bound to that TCP session.
>>>>
>>>> If we had a situation where CLIENT was responsible for closing the TCP Window (and initiating the TCP FIN first) and the server wanted to send data, we'd end up in the same state as you, I think.
>>>>
>>>> We've never had the root cause of why the SERVER decided to close the TCP Window and no longer accept data; the fix on the Isilon side was to recycle the FIN_WAIT_2 sockets more aggressively (net.inet.tcp.fast_finwait2_recycle=1 & net.inet.tcp.finwait2_timeout=5000). Once the socket is recycled, at the next occurrence of a CLIENT TCP Window probe, SERVER sends a RST, triggering the teardown of the session on the client side, a new TCP handshake, etc., and traffic flows again (NFS starts responding).
>>>>
>>>> To avoid rebooting the client (and before the aggressive FIN_WAIT_2 recycling was implemented on the Isilon side), we added a check script on the client that detects LAST_ACK sockets and, through an iptables rule, enforces a TCP RST. Something like: -A OUTPUT -p tcp -d $nfs_server_addr --sport $local_port -j REJECT --reject-with tcp-reset (the script removes this iptables rule as soon as the LAST_ACK disappears).
>>>>
>>>> The bottom line would be to have a packet capture during the outage (client and/or server side); it will show you at least the shape of the TCP exchange when NFS is stuck.
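[Editorial sketch: a rough reconstruction of the client-side LAST_ACK watchdog Youssef describes above — not the original script. The Linux netstat column parsing, variable values, and polling intervals are assumptions:

#!/bin/sh
# While a socket to the NFS server sits in LAST_ACK, force a TCP RST via
# an iptables REJECT rule, then remove the rule once the state clears.
nfs_server_addr=NFS.Server.IP.X    # placeholder, as in the thread
while true; do
    # Foreign address is field 5, state is field 6 in `netstat -tn`;
    # field 4 is the local IP:port, whose port we need for --sport.
    port=$(netstat -tn | awk -v s="$nfs_server_addr" \
        '$6 == "LAST_ACK" && $5 ~ s {split($4, a, ":"); print a[2]; exit}')
    if [ -n "$port" ]; then
        iptables -A OUTPUT -p tcp -d "$nfs_server_addr" --sport "$port" \
            -j REJECT --reject-with tcp-reset
        # Wait for the stuck socket to disappear, then undo the rule.
        while netstat -tn | grep -q LAST_ACK; do sleep 5; done
        iptables -D OUTPUT -p tcp -d "$nfs_server_addr" --sport "$port" \
            -j REJECT --reject-with tcp-reset
    fi
    sleep 60
done

The REJECT rule makes the client's own stack answer the server's next window probe with a RST, which tears the session down and lets a fresh TCP handshake replace it.]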
>>> Interesting story and good work w.r.t. sleuthing, Youssef, thanks.
>>>
>>> I looked at Jason's log and it shows everything is ok w.r.t. the nfsd threads.
>>> (They're just waiting for RPC requests.)
>>> However, I do now think I know why the soclose() does not happen.
>>> When the TCP connection is assigned as a backchannel, that takes a reference
>>> count on the structure. This refcnt won't be released until the connection is
>>> replaced by a BindConnectionToSession operation from the client. But that won't
>>> happen until the client creates a new TCP connection.
>>> --> No refcnt release --> no refcnt of 0 --> no soclose().
>>>
>>> I've created the attached patch (completely different from the previous one)
>>> that adds soshutdown(SHUT_WR) calls in the three places where the TCP
>>> connection is going away. This seems to get it past CLOSE_WAIT without a
>>> soclose().
>>> --> I know you are not comfortable with patching your server, but I do think
>>>     this change will get the socket shutdown to complete.
>>>
>>> There are a couple more things you can check on the server...
>>> # nfsstat -E -s
>>> --> Look for the count under "BindConnToSes".
>>> --> If non-zero, backchannels have been assigned.
>>> # sysctl -a | fgrep request_space_throttle_count
>>> --> If non-zero, the server has been overloaded at some point.
>>>
>>> I think the attached patch might work around the problem.
>>> The code that should open up the receive window needs to be checked.
>>> I am also looking at enabling the 6-minute timeout when a backchannel is
>>> assigned.
>>>
>>> rick
>>>
>>> Youssef
>>>
>>> _______________________________________________
>>> freebsd-net@freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@freebsd.org Mon Apr 5 01:48:41 2021
From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 193953] vlan(4) on LACP lagg(4) do not update if_baudrate value and thus SNMP daemons do not provide high capacity counters
Date: Mon, 05 Apr 2021 01:48:40 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=193953

Kubilay Kocak changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |koobs@FreeBSD.org,
                   |                            |net@FreeBSD.org
              Flags|                            |maintainer-feedback?(ae@FreeBSD.org)

--- Comment #11 from Kubilay Kocak ---
^Triage: Can we identify who and where (commit(s), merges) resolved this
issue, either specifically or by way of framework improvement (e.g. 64-bit
counters)? Would be great to attribute it.

--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.
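[Editorial note: one hedged way to check whether if_baudrate propagates to the SNMP side, assuming an agent that exports the standard IF-MIB (e.g. bsnmpd(1) with its mibII module, or net-snmp's snmpd) and the net-snmp command-line tools; community string and host are placeholders:

# ifHighSpeed is derived from the interface baudrate, in units of 1 Mb/s;
# a vlan(4) on a 2x10GbE LACP lagg(4) should report 20000 here, not 0.
snmpwalk -v 2c -c public localhost IF-MIB::ifHighSpeed

A value of 0 is what breaks the high-capacity (64-bit) counters this bug is about.]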
From owner-freebsd-net@freebsd.org Mon Apr 5 09:03:38 2021
From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 254725] [tcp] 13.0-RC4 crash tcp_lro
Date: Mon, 05 Apr 2021 09:03:36 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254725

Richard Scheffenegger changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|13.0-RC4 crash tcp_lro      |[tcp] 13.0-RC4 crash
                   |                            |tcp_lro

--- Comment #14 from Richard Scheffenegger ---
Extracted a more complete set of packet headers belonging to the problematic
session from the privately provided core.

The session is ECN-enabled. At the time of the panic, SENTFIN was set.

Based on the Timestamp option of the incoming ACKs, serious reordering and
spurious retransmissions were going on.

The final packet with FIN originally has a payload of 1 byte (TSopt val
..5625), but that is apparently lost and not received by the client.
Subsequently (based on TSopt val), just the FIN is retransmitted twice, with
TSopt val ..5861 and ..5979 (e.g. when a transmission opportunity would be
there, but no new data is available).

The RTT appears to be nearly 100 ms from the very last round; sRTT is
averaged at 275 ms. At the panic, TSval would have been ..5988. This is for
retransmitting the final payload byte, as the client only SACKed the 1st FIN
retransmission. However, for some reason that byte is no longer available in
the send socket buffer, causing the crash.

Srv  -> Clnt  F.  9999:10000(1)   // dropped
Clnt -> Srv   E.  1:1(0) ack -26seg  (unobserved retransmission Srv -> Clnt)
Clnt -> Srv   E.  1:1(0) ack 9999
Srv  -> Clnt  F.  10000:10000(0)
Clnt -> Srv   E.  1:1(0) ack 9999
attempt to retransmit 10000:10001(1) -> crash

However, current attempts to recreate this misbehavior were unsuccessful in
recreating the panic.

--
You are receiving this mail because:
You are the assignee for the bug.
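[Editorial sketch: one way to reproduce the timestamp-based analysis above from a capture, assuming wireshark's tshark(1) is available; the capture file name is a placeholder:

# Print relative time, sequence number, the FIN bit, and TS val/ecr per
# segment, so reordering and spurious retransmissions stand out.
tshark -r session.pcap -T fields -e frame.time_relative -e tcp.seq \
    -e tcp.flags.fin -e tcp.options.timestamp.tsval \
    -e tcp.options.timestamp.tsecr

Monotonically increasing TSval with non-monotonic sequence numbers is the signature of genuine retransmission rather than network duplication.]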
From owner-freebsd-net@freebsd.org Mon Apr 5 09:46:58 2021
From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 254735] [tcp] rack and bbr panic
Date: Mon, 05 Apr 2021 09:46:57 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254735

--- Comment #6 from rozhuk.im@gmail.com ---
(In reply to Michael Tuexen from comment #5)
a) yes
b) I do not know now. I did the tests on CURRENT more than a year ago.

--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.

From owner-freebsd-net@freebsd.org Mon Apr 5 09:51:46 2021
From: Rozhuk Ivan
Date: Mon, 5 Apr 2021 12:44:50 +0300
To: freebsd-current@freebsd.org, freebsd-net
Subject: TCP Connection hang - MSS again

Hi!

A TCP connection hangs when I try to open
https://online.sberbank.ru/CSAFront/index.do#/
from a FreeBSD 13 desktop behind a FreeBSD 13 router (pf).
http://www.netlab.linkpc.net/download/software/os_cfg/FBSD/13/base/etc/sysctl.conf

The FreeBSD 13 desktop has no known problems with other websites, only with
one remote FreeBSD 12 host with the same sysctl.conf and MTU 9k.
If I set the MTU to 1500 on the desktop, the issue is gone.
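[Editorial note: the per-host workaround mentioned above amounts to the following; the interface name is an assumption, not from the mail:

# Lower the desktop NIC's MTU so outgoing SYNs advertise MSS 1460
# (1500 minus 40 bytes of IPv4+TCP headers) instead of 8960.
ifconfig igb0 mtu 1500
]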
The router's pf.conf contains:
scrub out on $ext_v4_if0 all random-id min-ttl 128 max-mss 1400
scrub in on $ext_v4_if0 all max-mss 1400

Android 9 and FreeBSD 12.2 do not have this issue (both on WiFi).

As I understand it, in some cases the remote host does not reply with an MSS
option, and the host behind the router continues to use MSS 8960, which is
dropped by the router.

(pf scrub rules were disabled for this log) tcpdump | grep mss:

176.99.179.102.60903 > 194.54.14.131.443: Flags [S], cksum 0xd0a2 (correct), seq 3696980106, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 3763275954 ecr 0], length 0
176.99.179.102.60719 > 194.54.14.131.443: Flags [S], cksum 0xd796 (correct), seq 232307963, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 1963519951 ecr 0], length 0
176.99.179.102.50146 > 194.54.14.131.443: Flags [S], cksum 0x1aa9 (correct), seq 3968469659, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 3417199378 ecr 0], length 0
176.99.179.102.50646 > 194.54.14.131.443: Flags [S], cksum 0xb3ba (correct), seq 3774081696, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 1089629786 ecr 0], length 0
176.99.179.102.56843 > 194.54.14.131.443: Flags [S], cksum 0xc4dd (correct), seq 647662718, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 4054756545 ecr 0], length 0
194.54.14.131.443 > 176.99.179.102.56843: Flags [S.], cksum 0x35dd (correct), seq 186241788, ack 647662719, win 65535, options [mss 1380,nop,wscale 3,nop,nop,sackOK,nop,nop,TS val 2541298941 ecr 4054756545], length 0
176.99.179.102.65364 > 194.54.14.131.443: Flags [S], cksum 0x17a0 (correct), seq 1603248650, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 1794142451 ecr 0], length 0
176.99.179.102.59862 > 194.54.14.131.443: Flags [S], cksum 0x2736 (correct), seq 4000339086, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 4084903147 ecr 0], length 0
176.99.179.102.60915 > 194.54.14.131.443: Flags [S], cksum 0xd964 (correct), seq 95236311, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 1297197380 ecr 0], length 0
176.99.179.102.58717 > 194.54.14.131.443: Flags [S], cksum 0xf92e (correct), seq 1785704794, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 1392944917 ecr 0], length 0
194.54.14.131.443 > 176.99.179.102.58717: Flags [S.], cksum 0xe020 (correct), seq 2800465814, ack 1785704795, win 65535, options [mss 1380,nop,wscale 3,nop,nop,sackOK,nop,nop,TS val 2541366941 ecr 1392944917], length 0
176.99.179.102.53377 > 194.54.14.131.443: Flags [S], cksum 0x8fdd (correct), seq 3235103847, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 1359134165 ecr 0], length 0

Is it possible to force FreeBSD to always send the TCP MSS option?
Are there any other options to work around this?
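[Editorial sketch: to watch the MSS negotiation itself, it is enough to capture only SYN segments, so the advertised MSS (or its absence) on each side is obvious; the interface name is an assumption:

# Show only SYNs and SYN/ACKs to or from the problem host; the mss
# option, when present, appears in the bracketed options list.
tcpdump -ni igb0 'tcp[tcpflags] & tcp-syn != 0 and host 194.54.14.131'
]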
Full connections log: 12:06:44.205766 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 60) 176.99.179.102.54064 > 194.54.14.131.443: Flags [S], cksum 0xca3f (correct), seq 980200339, win 65535, options [mss 1400,nop,wscale 10,sackOK,TS val 1268859625 ecr 0], length 0 12:06:44.206997 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 59, id 57535, offset 0, flags [none], proto TCP (6), length 40) 194.54.14.131.443 > 176.99.179.102.54064: Flags [S.], cksum 0x5d05 (correct), seq 2754330417, ack 980200340, win 0, length 0 12:06:44.207126 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x5d06 (correct), seq 1, ack 1, win 65535, length 0 12:06:44.210824 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 59, id 45037, offset 0, flags [DF], proto TCP (6), length 40) 194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x5d06 (correct), seq 1, ack 1, win 65535, length 0 12:06:44.211130 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 571: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 557) 176.99.179.102.54064 > 194.54.14.131.443: Flags [P.], cksum 0x7667 (correct), seq 1:518, ack 1, win 65535, length 517 12:06:44.214366 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 53, id 45320, offset 0, flags [DF], proto TCP (6), length 40) 194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x5d06 (correct), seq 1, ack 518, win 65018, length 0 12:06:44.216025 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 1078: (tos 0x0, ttl 53, id 45321, offset 0, flags [DF], proto TCP (6), length 1064) 194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x7c79 (correct), seq 1:1025, ack 518, win 65018, length 1024 12:06:44.216109 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 1078: (tos 0x0, ttl 53, id 45322, offset 0, flags [DF], proto TCP (6), length 1064) 194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x2c23 (correct), seq 1025:2049, ack 518, win 65018, length 1024 12:06:44.216207 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 1078: (tos 0x0, ttl 53, id 45323, offset 0, flags [DF], proto TCP (6), length 1064) 194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x3633 (correct), seq 2049:3073, ack 518, win 65018, length 1024 12:06:44.216220 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x5301 (correct), seq 518, ack 2049, win 65535, length 0 12:06:44.216312 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 1078: (tos 0x0, ttl 53, id 45324, offset 0, flags [DF], proto TCP (6), length 1064) 194.54.14.131.443 > 176.99.179.102.54064: Flags [P.], cksum 0xa2d7 (correct), seq 3073:4097, ack 518, win 65018, length 1024 12:06:44.216315 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 308: (tos 0x0, ttl 53, id 45325, offset 0, flags [DF], proto TCP (6), length 294) 194.54.14.131.443 > 176.99.179.102.54064: Flags [P.], cksum 0xee3e (correct), seq 4097:4351, ack 518, win 65018, length 254 12:06:44.216429 
70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x4b01 (correct), seq 518, ack 4097, win 65535, length 0 12:06:44.218606 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 180: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 166) 176.99.179.102.54064 > 194.54.14.131.443: Flags [P.], cksum 0x9a34 (correct), seq 518:644, ack 4351, win 65535, length 126 12:06:44.221796 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 53, id 45326, offset 0, flags [DF], proto TCP (6), length 40) 194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x4c08 (correct), seq 4351, ack 644, win 64892, length 0 12:06:44.222418 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 105: (tos 0x0, ttl 53, id 45327, offset 0, flags [DF], proto TCP (6), length 91) 194.54.14.131.443 > 176.99.179.102.54064: Flags [P.], cksum 0xc06a (correct), seq 4351:4402, ack 644, win 64892, length 51 12:06:44.232616 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x3b93 (correct), seq 4163, ack 4402, win 65535, length 0 12:06:54.222862 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x4953 (correct), seq 643, ack 4402, win 65535, length 0 12:06:54.226087 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 53, id 63692, offset 0, flags [DF], proto TCP (6), length 40) 194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x4bd5 (correct), seq 4402, ack 644, win 64892, length 0 12:07:09.229361 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x4953 (correct), seq 643, ack 4402, win 65535, length 0 12:07:09.232522 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 53, id 23671, offset 0, flags [DF], proto TCP (6), length 40) 194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x4bd5 (correct), seq 4402, ack 644, win 64892, length 0 12:07:19.236480 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x4953 (correct), seq 643, ack 4402, win 65535, length 0 12:07:19.239595 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 53, id 38859, offset 0, flags [DF], proto TCP (6), length 40) 194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x4bd5 (correct), seq 4402, ack 644, win 64892, length 0 12:07:29.245054 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x4953 (correct), seq 643, ack 4402, win 65535, length 0 12:07:29.248105 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 53, id 57047, offset 0, flags [DF], proto TCP (6), length 40) 
194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x4bd5 (correct), seq 4402, ack 644, win 64892, length 0 12:07:39.255329 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x4953 (correct), seq 643, ack 4402, win 65535, length 0 12:07:39.259131 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 53, id 9053, offset 0, flags [DF], proto TCP (6), length 40) 194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x4bd5 (correct), seq 4402, ack 644, win 64892, length 0 12:07:49.260393 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x4953 (correct), seq 643, ack 4402, win 65535, length 0 12:07:49.263502 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 53, id 23797, offset 0, flags [DF], proto TCP (6), length 40) 194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x4bd5 (correct), seq 4402, ack 644, win 64892, length 0 12:07:59.272445 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x4953 (correct), seq 643, ack 4402, win 65535, length 0 12:07:59.275522 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 53, id 39984, offset 0, flags [DF], proto TCP (6), length 40) 194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x4bd5 (correct), seq 4402, ack 644, win 64892, length 0 12:08:03.643990 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 176.99.179.102.54064 > 194.54.14.131.443: Flags [R.], cksum 0x3b8f (correct), seq 4163, ack 4402, win 0, length 0 On lan: 12:19:58.237255 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 60) 172.16.0.3.6890 > 194.54.14.131.443: Flags [S], cksum 0x6428 (correct), seq 1450364111, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 4221439542 ecr 0], length 0 12:19:58.238404 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 58, id 2055, offset 0, flags [none], proto TCP (6), length 40) 194.54.14.131.443 > 172.16.0.3.6890: Flags [S.], cksum 0x2779 (correct), seq 3562010452, ack 1450364112, win 0, length 0 12:19:58.238460 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x277a (correct), seq 1, ack 1, win 65535, length 0 12:19:58.242210 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 58, id 59736, offset 0, flags [DF], proto TCP (6), length 40) 194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x277a (correct), seq 1, ack 1, win 65535, length 0 12:19:58.242320 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 571: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 557) 172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0x927e (correct), seq 1:518, ack 1, win 65535, length 517 12:19:58.245317 
70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 52, id 31249, offset 0, flags [DF], proto TCP (6), length 40) 194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x277a (correct), seq 1, ack 518, win 65018, length 0 12:19:58.246908 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 1078: (tos 0x0, ttl 52, id 31250, offset 0, flags [DF], proto TCP (6), length 1064) 194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x35ac (correct), seq 1:1025, ack 518, win 65018, length 1024 12:19:58.246998 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 1078: (tos 0x0, ttl 52, id 31251, offset 0, flags [DF], proto TCP (6), length 1064) 194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0xf696 (correct), seq 1025:2049, ack 518, win 65018, length 1024 12:19:58.247088 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x1d75 (correct), seq 518, ack 2049, win 65535, length 0 12:19:58.247092 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 1078: (tos 0x0, ttl 52, id 31252, offset 0, flags [DF], proto TCP (6), length 1064) 194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x00a7 (correct), seq 2049:3073, ack 518, win 65018, length 1024 12:19:58.247161 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 1078: (tos 0x0, ttl 52, id 31253, offset 0, flags [DF], proto TCP (6), length 1064) 194.54.14.131.443 > 172.16.0.3.6890: Flags [P.], cksum 0xefb1 (correct), seq 3073:4097, ack 518, win 65018, length 1024 12:19:58.247191 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 308: (tos 0x0, ttl 52, id 31254, offset 0, flags [DF], proto TCP (6), length 294) 194.54.14.131.443 > 172.16.0.3.6890: Flags [P.], cksum 0xf9d5 (correct), seq 4097:4351, ack 518, win 65018, length 254 12:19:58.247256 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x1575 (correct), seq 518, ack 4097, win 65535, length 0 12:19:58.249623 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 180: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 166) 172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xe897 (correct), seq 518:644, ack 4351, win 65535, length 126 12:19:58.251076 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559) 172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde33 (correct), seq 644:4163, ack 4351, win 65535, length 3519 12:19:58.252582 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 52, id 31255, offset 0, flags [DF], proto TCP (6), length 40) 194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x167c (correct), seq 4351, ack 644, win 64892, length 0 12:19:58.253245 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 105: (tos 0x0, ttl 52, id 31256, offset 0, flags [DF], proto TCP (6), length 91) 194.54.14.131.443 > 172.16.0.3.6890: Flags [P.], cksum 0xb7e8 (correct), seq 4351:4402, ack 644, win 64892, length 51 12:19:58.263513 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 
172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x0607 (correct), seq 4163, ack 4402, win 65535, length 0 12:19:58.581446 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559) 172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519 12:19:59.049194 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559) 172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519 12:19:59.768280 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559) 172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519 12:20:01.004359 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559) 172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519 12:20:03.268259 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559) 172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519 12:20:07.596005 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559) 172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519 12:20:08.253855 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x13c7 (correct), seq 643, ack 4402, win 65535, length 0 12:20:08.256975 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 52, id 48760, offset 0, flags [DF], proto TCP (6), length 40) 194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x1649 (correct), seq 4402, ack 644, win 64892, length 0 12:20:16.064202 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559) 172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519 12:20:18.260096 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x13c7 (correct), seq 643, ack 4402, win 65535, length 0 12:20:18.263312 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 52, id 3861, offset 0, flags [DF], proto TCP (6), length 40) 194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x1649 (correct), seq 4402, ack 644, win 64892, length 0 12:20:28.264898 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x13c7 (correct), seq 643, ack 4402, win 65535, length 0 12:20:28.268240 70:85:c2:43:67:5b > 
70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 52, id 20944, offset 0, flags [DF], proto TCP (6), length 40) 194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x1649 (correct), seq 4402, ack 644, win 64892, length 0 12:20:32.784190 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559) 172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519 12:20:38.268181 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x13c7 (correct), seq 643, ack 4402, win 65535, length 0 12:20:38.271373 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 52, id 38465, offset 0, flags [DF], proto TCP (6), length 40) 194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x1649 (correct), seq 4402, ack 644, win 64892, length 0 12:20:48.275277 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x13c7 (correct), seq 643, ack 4402, win 65535, length 0 12:20:48.279208 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 52, id 56414, offset 0, flags [DF], proto TCP (6), length 40) 194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x1649 (correct), seq 4402, ack 644, win 64892, length 0 12:20:58.287132 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40) 172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x13c7 (correct), seq 643, ack 4402, win 65535, length 0 12:20:58.290278 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 52, id 10709, offset 0, flags [DF], proto TCP (6), length 40) 194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x1649 (correct), seq 4402, ack 644, win 64892, length 0 12:20:58.290453 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559) 172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519 12:21:31.555198 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559) 172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519 12:22:35.565122 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559) 172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519 12:23:39.568298 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559) 172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519 12:24:43.568318 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559) 172.16.0.3.6890 > 194.54.14.131.443: Flags 
[P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519
...

From owner-freebsd-net@freebsd.org Mon Apr 5 09:59:56 2021
From: Rozhuk Ivan
Date: Mon, 5 Apr 2021 12:59:49 +0300
To: freebsd-current@freebsd.org, freebsd-net
Subject: Re: TCP Connection hang - MSS again

On Mon, 5 Apr 2021 12:44:50 +0300, Rozhuk Ivan wrote:

> The FreeBSD 13 desktop has no known problems with other websites, only
> with one remote FreeBSD 12 host with the same sysctl.conf and MTU 9k.

I forgot: the FreeBSD 12 host does reply with MSS 8960, and I fix that in pf with:

> scrub in on $ext_v4_if0 all max-mss 1400

so only https://online.sberbank.ru/CSAFront/index.do#/ from time to time
replies without the MSS option.
From owner-freebsd-net@freebsd.org Mon Apr 5 10:24:02 2021
From: Eugene Grosbein
Date: Mon, 5 Apr 2021 17:23:39 +0700
To: Rozhuk Ivan, freebsd-current@freebsd.org, freebsd-net
Subject: Re: TCP Connection hang - MSS again
05.04.2021 16:44, Rozhuk Ivan wrote:

> Are there any other options to work around this?

Yes. Each entry in the routing table has an "mtu" attribute that limits the TCP MSS, too. You should add the default route with a -mtu 1500 attribute. For example, in /etc/rc.conf:

defaultrouter="X.X.X.X -mtu 1500"

From owner-freebsd-net@freebsd.org Mon Apr 5 10:32:29 2021
Subject: Re: TCP Connection hang - MSS again
From: Rozhuk Ivan <rozhuk.im@gmail.com>
To: Eugene Grosbein
Cc: freebsd-current@freebsd.org, freebsd-net
Date: Mon, 5 Apr 2021 13:32:24 +0300
Message-ID: <20210405133224.469bd865@rimwks.local>
On Mon, 5 Apr 2021 17:23:39 +0700, Eugene Grosbein wrote:

> > Are there any other options to work around this?
>
> Yes. Each entry in the routing table has an "mtu" attribute that limits
> the TCP MSS, too. You should add the default route with a -mtu 1500
> attribute. For example, in /etc/rc.conf:
>
> defaultrouter="X.X.X.X -mtu 1500"

This helps only if I do it on every host in the network, which is too much management.
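The same route attribute can also be tried on a live system without editing /etc/rc.conf; a minimal sketch of the approach Eugene describes, using route(8):

  # clamp the MSS derived from the default route to a 1500-byte MTU
  route change default -mtu 1500

  # verify that the attribute took effect
  route -n get default

As noted above, the attribute only helps hosts whose own routing table carries it, which is exactly the per-host management burden being objected to.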
From owner-freebsd-net@freebsd.org Mon Apr 5 11:04:32 2021
Subject: Re: TCP Connection hang - MSS again
From: tuexen@freebsd.org
To: Rozhuk Ivan
Cc: freebsd-current@freebsd.org, freebsd-net
Date: Mon, 5 Apr 2021 13:04:19 +0200
Message-Id: <0D7C52FC-DA37-41B6-A05C-F49ECEFE51FC@freebsd.org>
In-Reply-To: <20210405124450.7505b43c@rimwks.local>

> On 5. Apr 2021, at 11:44, Rozhuk Ivan wrote:
>
> Hi!
>
> A TCP connection hangs when I try to open https://online.sberbank.ru/CSAFront/index.do#/
>
> FreeBSD 13 desktop + FreeBSD 13 router (pf).
> http://www.netlab.linkpc.net/download/software/os_cfg/FBSD/13/base/etc/sysctl.conf
> The FreeBSD 13 desktop has no known problems with other websites.
> Only with one remote FreeBSD 12 host with the same sysctl.conf and an MTU of 9k.
> If I set the MTU to 1500 on the desktop, the issue is gone.
>
> The router pf.conf contains:
> scrub out on $ext_v4_if0 all random-id min-ttl 128 max-mss 1400
> scrub in on $ext_v4_if0 all max-mss 1400
>
> Android 9 and FreeBSD 12.2 do not have this issue (both over WiFi).
>
> As I understand it, in some cases the remote host does not reply with an MSS option, and the host behind the router continues to use MSS 8960, which is then dropped by the router.

If the peer does not provide an MSS option, your local FreeBSD-based host should use an MSS of net.inet.tcp.mssdflt bytes. The default is 536. So I don't think this should be a problem.
Best regards
Michael

> (pf scrub rules disabled for this log), tcpdump | grep mss:
> 176.99.179.102.60903 > 194.54.14.131.443: Flags [S], cksum 0xd0a2 (correct), seq 3696980106, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 3763275954 ecr 0], length 0
> 176.99.179.102.60719 > 194.54.14.131.443: Flags [S], cksum 0xd796 (correct), seq 232307963, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 1963519951 ecr 0], length 0
> 176.99.179.102.50146 > 194.54.14.131.443: Flags [S], cksum 0x1aa9 (correct), seq 3968469659, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 3417199378 ecr 0], length 0
> 176.99.179.102.50646 > 194.54.14.131.443: Flags [S], cksum 0xb3ba (correct), seq 3774081696, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 1089629786 ecr 0], length 0
> 176.99.179.102.56843 > 194.54.14.131.443: Flags [S], cksum 0xc4dd (correct), seq 647662718, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 4054756545 ecr 0], length 0
> 194.54.14.131.443 > 176.99.179.102.56843: Flags [S.], cksum 0x35dd (correct), seq 186241788, ack 647662719, win 65535, options [mss 1380,nop,wscale 3,nop,nop,sackOK,nop,nop,TS val 2541298941 ecr 4054756545], length 0
> 176.99.179.102.65364 > 194.54.14.131.443: Flags [S], cksum 0x17a0 (correct), seq 1603248650, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 1794142451 ecr 0], length 0
> 176.99.179.102.59862 > 194.54.14.131.443: Flags [S], cksum 0x2736 (correct), seq 4000339086, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 4084903147 ecr 0], length 0
> 176.99.179.102.60915 > 194.54.14.131.443: Flags [S], cksum 0xd964 (correct), seq 95236311, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 1297197380 ecr 0], length 0
> 176.99.179.102.58717 > 194.54.14.131.443: Flags [S], cksum 0xf92e (correct), seq 1785704794, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 1392944917 ecr 0], length 0
> 194.54.14.131.443 > 176.99.179.102.58717: Flags [S.], cksum 0xe020 (correct), seq 2800465814, ack 1785704795, win 65535, options [mss 1380,nop,wscale 3,nop,nop,sackOK,nop,nop,TS val 2541366941 ecr 1392944917], length 0
> 176.99.179.102.53377 > 194.54.14.131.443: Flags [S], cksum 0x8fdd (correct), seq 3235103847, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 1359134165 ecr 0], length 0
>
> Is it possible to force FreeBSD to always send the TCP MSS option?
> Are there any other options to work around this?
>
> Full connections log:
>
> 12:06:44.205766 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 60)
>     176.99.179.102.54064 > 194.54.14.131.443: Flags [S], cksum 0xca3f (correct), seq 980200339, win 65535, options [mss 1400,nop,wscale 10,sackOK,TS val 1268859625 ecr 0], length 0
> 12:06:44.206997 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 59, id 57535, offset 0, flags [none], proto TCP (6), length 40)
>     194.54.14.131.443 > 176.99.179.102.54064: Flags [S.], cksum 0x5d05 (correct), seq 2754330417, ack 980200340, win 0, length 0
> 12:06:44.207126 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x5d06 (correct), seq 1, ack 1, win 65535, length 0
> 12:06:44.210824 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 59, id 45037, offset 0, flags [DF], proto TCP (6), length 40)
>     194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x5d06 (correct), seq 1, ack 1, win 65535, length 0
> 12:06:44.211130 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 571: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 557)
>     176.99.179.102.54064 > 194.54.14.131.443: Flags [P.], cksum 0x7667 (correct), seq 1:518, ack 1, win 65535, length 517
> 12:06:44.214366 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 53, id 45320, offset 0, flags [DF], proto TCP (6), length 40)
>     194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x5d06 (correct), seq 1, ack 518, win 65018, length 0
> 12:06:44.216025 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 1078: (tos 0x0, ttl 53, id 45321, offset 0, flags [DF], proto TCP (6), length 1064)
>     194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x7c79 (correct), seq 1:1025, ack 518, win 65018, length 1024
> 12:06:44.216109 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 1078: (tos 0x0, ttl 53, id 45322, offset 0, flags [DF], proto TCP (6), length 1064)
>     194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x2c23 (correct), seq 1025:2049, ack 518, win 65018, length 1024
> 12:06:44.216207 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 1078: (tos 0x0, ttl 53, id 45323, offset 0, flags [DF], proto TCP (6), length 1064)
>     194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x3633 (correct), seq 2049:3073, ack 518, win 65018, length 1024
> 12:06:44.216220 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x5301 (correct), seq 518, ack 2049, win 65535, length 0
> 12:06:44.216312 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 1078: (tos 0x0, ttl 53, id 45324, offset 0, flags [DF], proto TCP (6), length 1064)
>     194.54.14.131.443 > 176.99.179.102.54064: Flags [P.], cksum 0xa2d7 (correct), seq 3073:4097, ack 518, win 65018, length 1024
> 12:06:44.216315 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 308: (tos 0x0, ttl 53, id 45325, offset 0, flags [DF], proto TCP (6), length 294)
>     194.54.14.131.443 > 176.99.179.102.54064: Flags [P.], cksum 0xee3e (correct), seq 4097:4351, ack 518, win 65018, length 254
> 12:06:44.216429 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x4b01 (correct), seq 518, ack 4097, win 65535, length 0
> 12:06:44.218606 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 180: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 166)
>     176.99.179.102.54064 > 194.54.14.131.443: Flags [P.], cksum 0x9a34 (correct), seq 518:644, ack 4351, win 65535, length 126
> 12:06:44.221796 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 53, id 45326, offset 0, flags [DF], proto TCP (6), length 40)
>     194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x4c08 (correct), seq 4351, ack 644, win 64892, length 0
> 12:06:44.222418 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 105: (tos 0x0, ttl 53, id 45327, offset 0, flags [DF], proto TCP (6), length 91)
>     194.54.14.131.443 > 176.99.179.102.54064: Flags [P.], cksum 0xc06a (correct), seq 4351:4402, ack 644, win 64892, length 51
> 12:06:44.232616 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x3b93 (correct), seq 4163, ack 4402, win 65535, length 0
> 12:06:54.222862 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x4953 (correct), seq 643, ack 4402, win 65535, length 0
> 12:06:54.226087 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 53, id 63692, offset 0, flags [DF], proto TCP (6), length 40)
>     194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x4bd5 (correct), seq 4402, ack 644, win 64892, length 0
> 12:07:09.229361 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x4953 (correct), seq 643, ack 4402, win 65535, length 0
> 12:07:09.232522 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 53, id 23671, offset 0, flags [DF], proto TCP (6), length 40)
>     194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x4bd5 (correct), seq 4402, ack 644, win 64892, length 0
> 12:07:19.236480 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x4953 (correct), seq 643, ack 4402, win 65535, length 0
> 12:07:19.239595 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 53, id 38859, offset 0, flags [DF], proto TCP (6), length 40)
>     194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x4bd5 (correct), seq 4402, ack 644, win 64892, length 0
> 12:07:29.245054 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x4953 (correct), seq 643, ack 4402, win 65535, length 0
> 12:07:29.248105 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 53, id 57047, offset 0, flags [DF], proto TCP (6), length 40)
>     194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x4bd5 (correct), seq 4402, ack 644, win 64892, length 0
> 12:07:39.255329 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x4953 (correct), seq 643, ack 4402, win 65535, length 0
> 12:07:39.259131 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 53, id 9053, offset 0, flags [DF], proto TCP (6), length 40)
>     194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x4bd5 (correct), seq 4402, ack 644, win 64892, length 0
> 12:07:49.260393 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x4953 (correct), seq 643, ack 4402, win 65535, length 0
> 12:07:49.263502 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 53, id 23797, offset 0, flags [DF], proto TCP (6), length 40)
>     194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x4bd5 (correct), seq 4402, ack 644, win 64892, length 0
> 12:07:59.272445 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     176.99.179.102.54064 > 194.54.14.131.443: Flags [.], cksum 0x4953 (correct), seq 643, ack 4402, win 65535, length 0
> 12:07:59.275522 78:da:6e:28:c9:c0 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 53, id 39984, offset 0, flags [DF], proto TCP (6), length 40)
>     194.54.14.131.443 > 176.99.179.102.54064: Flags [.], cksum 0x4bd5 (correct), seq 4402, ack 644, win 64892, length 0
> 12:08:03.643990 70:85:c2:43:67:5b > 78:da:6e:28:c9:c0, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     176.99.179.102.54064 > 194.54.14.131.443: Flags [R.], cksum 0x3b8f (correct), seq 4163, ack 4402, win 0, length 0
>
> On lan:
> 12:19:58.237255 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 60)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [S], cksum 0x6428 (correct), seq 1450364111, win 65535, options [mss 8960,nop,wscale 10,sackOK,TS val 4221439542 ecr 0], length 0
> 12:19:58.238404 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 58, id 2055, offset 0, flags [none], proto TCP (6), length 40)
>     194.54.14.131.443 > 172.16.0.3.6890: Flags [S.], cksum 0x2779 (correct), seq 3562010452, ack 1450364112, win 0, length 0
> 12:19:58.238460 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x277a (correct), seq 1, ack 1, win 65535, length 0
> 12:19:58.242210 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 58, id 59736, offset 0, flags [DF], proto TCP (6), length 40)
>     194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x277a (correct), seq 1, ack 1, win 65535, length 0
> 12:19:58.242320 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 571: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 557)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0x927e (correct), seq 1:518, ack 1, win 65535, length 517
> 12:19:58.245317 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 52, id 31249, offset 0, flags [DF], proto TCP (6), length 40)
>     194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x277a (correct), seq 1, ack 518, win 65018, length 0
> 12:19:58.246908 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 1078: (tos 0x0, ttl 52, id 31250, offset 0, flags [DF], proto TCP (6), length 1064)
>     194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x35ac (correct), seq 1:1025, ack 518, win 65018, length 1024
> 12:19:58.246998 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 1078: (tos 0x0, ttl 52, id 31251, offset 0, flags [DF], proto TCP (6), length 1064)
>     194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0xf696 (correct), seq 1025:2049, ack 518, win 65018, length 1024
> 12:19:58.247088 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x1d75 (correct), seq 518, ack 2049, win 65535, length 0
> 12:19:58.247092 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 1078: (tos 0x0, ttl 52, id 31252, offset 0, flags [DF], proto TCP (6), length 1064)
>     194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x00a7 (correct), seq 2049:3073, ack 518, win 65018, length 1024
> 12:19:58.247161 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 1078: (tos 0x0, ttl 52, id 31253, offset 0, flags [DF], proto TCP (6), length 1064)
>     194.54.14.131.443 > 172.16.0.3.6890: Flags [P.], cksum 0xefb1 (correct), seq 3073:4097, ack 518, win 65018, length 1024
> 12:19:58.247191 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 308: (tos 0x0, ttl 52, id 31254, offset 0, flags [DF], proto TCP (6), length 294)
>     194.54.14.131.443 > 172.16.0.3.6890: Flags [P.], cksum 0xf9d5 (correct), seq 4097:4351, ack 518, win 65018, length 254
> 12:19:58.247256 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x1575 (correct), seq 518, ack 4097, win 65535, length 0
> 12:19:58.249623 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 180: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 166)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xe897 (correct), seq 518:644, ack 4351, win 65535, length 126
> 12:19:58.251076 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde33 (correct), seq 644:4163, ack 4351, win 65535, length 3519
> 12:19:58.252582 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 52, id 31255, offset 0, flags [DF], proto TCP (6), length 40)
>     194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x167c (correct), seq 4351, ack 644, win 64892, length 0
> 12:19:58.253245 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 105: (tos 0x0, ttl 52, id 31256, offset 0, flags [DF], proto TCP (6), length 91)
>     194.54.14.131.443 > 172.16.0.3.6890: Flags [P.], cksum 0xb7e8 (correct), seq 4351:4402, ack 644, win 64892, length 51
> 12:19:58.263513 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x0607 (correct), seq 4163, ack 4402, win 65535, length 0
> 12:19:58.581446 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519
> 12:19:59.049194 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519
> 12:19:59.768280 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519
> 12:20:01.004359 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519
> 12:20:03.268259 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519
> 12:20:07.596005 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519
> 12:20:08.253855 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x13c7 (correct), seq 643, ack 4402, win 65535, length 0
> 12:20:08.256975 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 52, id 48760, offset 0, flags [DF], proto TCP (6), length 40)
>     194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x1649 (correct), seq 4402, ack 644, win 64892, length 0
> 12:20:16.064202 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519
> 12:20:18.260096 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x13c7 (correct), seq 643, ack 4402, win 65535, length 0
> 12:20:18.263312 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 52, id 3861, offset 0, flags [DF], proto TCP (6), length 40)
>     194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x1649 (correct), seq 4402, ack 644, win 64892, length 0
> 12:20:28.264898 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x13c7 (correct), seq 643, ack 4402, win 65535, length 0
> 12:20:28.268240 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 52, id 20944, offset 0, flags [DF], proto TCP (6), length 40)
>     194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x1649 (correct), seq 4402, ack 644, win 64892, length 0
> 12:20:32.784190 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519
> 12:20:38.268181 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x13c7 (correct), seq 643, ack 4402, win 65535, length 0
> 12:20:38.271373 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 52, id 38465, offset 0, flags [DF], proto TCP (6), length 40)
>     194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x1649 (correct), seq 4402, ack 644, win 64892, length 0
> 12:20:48.275277 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x13c7 (correct), seq 643, ack 4402, win 65535, length 0
> 12:20:48.279208 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 52, id 56414, offset 0, flags [DF], proto TCP (6), length 40)
>     194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x1649 (correct), seq 4402, ack 644, win 64892, length 0
> 12:20:58.287132 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [.], cksum 0x13c7 (correct), seq 643, ack 4402, win 65535, length 0
> 12:20:58.290278 70:85:c2:43:67:5b > 70:85:c2:37:57:22, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 52, id 10709, offset 0, flags [DF], proto TCP (6), length 40)
>     194.54.14.131.443 > 172.16.0.3.6890: Flags [.], cksum 0x1649 (correct), seq 4402, ack 644, win 64892, length 0
> 12:20:58.290453 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519
> 12:21:31.555198 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519
> 12:22:35.565122 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519
> 12:23:39.568298 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519
> 12:24:43.568318 70:85:c2:37:57:22 > 70:85:c2:43:67:5b, ethertype IPv4 (0x0800), length 3573: (tos 0x0, ttl 128, id 0, offset 0, flags [DF], proto TCP (6), length 3559)
>     172.16.0.3.6890 > 194.54.14.131.443: Flags [P.], cksum 0xde00 (correct), seq 644:4163, ack 4402, win 65535, length 3519
> ...

From owner-freebsd-net@freebsd.org Mon Apr 5 12:44:55 2021
Subject: Re: TCP Connection hang - MSS again
From: Rozhuk Ivan <rozhuk.im@gmail.com>
To: tuexen@freebsd.org
Cc: freebsd-current@freebsd.org, freebsd-net
Date: Mon, 5 Apr 2021 15:44:49 +0300
Message-ID: <20210405154449.2d267589@rimwks.local>
In-Reply-To: <0D7C52FC-DA37-41B6-A05C-F49ECEFE51FC@freebsd.org>

On Mon, 5 Apr 2021 13:04:19 +0200, tuexen@freebsd.org wrote:

> > As I understand it, in some cases the remote host does not reply with an MSS
> > option, and the host behind the router continues to use MSS 8960, which is
> > then dropped by the router.
> If the peer does not provide an MSS option, your local FreeBSD-based
> host should use an MSS of net.inet.tcp.mssdflt bytes. The default is
> 536. So I don't think this should be a problem.

That's it! Thanks, it was set to ~64k in my config.
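The knob in question can be inspected and reset at runtime; a minimal sketch (536 is the stock default mentioned above; an entry in /etc/sysctl.conf makes the value persistent):

  # show the MSS used when the peer sends no MSS option
  sysctl net.inet.tcp.mssdflt

  # restore the conservative default
  sysctl net.inet.tcp.mssdflt=536

Raising it far above the path MTU, as the ~64k setting here did, makes the host emit oversized segments whenever a peer omits the MSS option, and in that case there is no MSS option on the wire for the router's scrub rule to clamp.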
From owner-freebsd-net@freebsd.org Mon Apr 5 13:25:57 2021
From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 220468] libfetch: Does not handle 407 (proxy auth) when connecting to HTTPS using connect tunnel
Date: Mon, 05 Apr 2021 13:25:55 +0000
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220468

Renato Botelho changed:

           What    |Removed    |Added
----------------------------------------------------------------------------
           Status  |Open       |In Progress

--- Comment #17 from Renato Botelho ---
(In reply to Kubilay Kocak from comment #16)

It's too late for releng/13.0; we already have an RC5. I plan to merge it to all supported stable branches.

--
You are receiving this mail because:
You are on the CC list for the bug.

From owner-freebsd-net@freebsd.org Mon Apr 5 13:52:48 2021
From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 254623] traceroute6: ICMP6 no longer works due to Capsicum'ization: data too short (-1 bytes) from invalid
Date: Mon, 05 Apr 2021 13:52:47 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254623

--- Comment #9 from commit-hook@FreeBSD.org ---
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=d3f2c31b43b726ffbb180a42cee4b9f00c5ad5ed

commit d3f2c31b43b726ffbb180a42cee4b9f00c5ad5ed
Author:     Mark Johnston
AuthorDate: 2021-04-01 13:58:32 +0000
Commit:     Mark Johnston
CommitDate: 2021-04-05 13:51:56 +0000

    traceroute6: Fix Capsicum rights for rcvsock

    - Always use distinct sockets for send and recv
    - Limit rights on the recv socket

    For ICMP6 we were using the same socket for both send and receive,
    and we limited rights on the socket such that it's impossible to
    receive anything.

    PR:             254623
    Diagnosed by:   Zhenlei Huang
    Reviewed by:    oshogbo
    Differential Revision: https://reviews.freebsd.org/D29523

    (cherry picked from commit b8ae450f05e62a851f444edaf7db2506ff99aa37)

 usr.sbin/traceroute6/traceroute6.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--
You are receiving this mail because:
You are on the CC list for the bug.

From owner-freebsd-net@freebsd.org Mon Apr 5 14:46:26 2021
Subject: Re: NFS Mount Hangs
From: tuexen@freebsd.org
To: Rick Macklem
Cc: "Scheffenegger, Richard", Youssef GHORBAL, freebsd-net@freebsd.org
Date: Mon, 5 Apr 2021 16:46:18 +0200
> On 5. Apr 2021, at 01:12, Rick Macklem wrote:
>
> tuexen@freebsd.org wrote:
>>> On 4. Apr 2021, at 22:28, Rick Macklem wrote:
>>>
>>> Oops, yes the packet capture is on freefall (forgot to mention that ;-).
>>> You should be able to:
>>> % fetch https://people.freebsd.org/~rmacklem/linuxtofreenfs.pcap
>>>
>>> Some useful packet #s are:
>>> 1949 - partitioning starts
>>> 2005 - partition healed
>>> 2060 - last RST
>>> 2067 - SYN -> gets going again
>>>
>>> This was taken at the Linux end. I have the FreeBSD end too, although I
>>> don't think it tells you anything more.
>> Hi Rick,
>>
>> I would like to look at the FreeBSD side, too.
> fetch https://people.freebsd.org/~rmacklem/freetolinuxnfs.pcap
>
>> Do you also know what
>> the state of the TCP connection was when the SYN / ACK / RST game was
>> going on?
> Just ESTABLISHED while the battle goes on.
> And it happens when the Send-Q is 0.
> (If the Send-Q is not empty, it finds its way to CLOSED.)
OK. What is the FreeBSD version you are using?

It seems that the TCP connection on the FreeBSD side is still alive, and Linux has decided to start a new TCP connection using the old port numbers. So it sends a SYN. The response is a challenge ACK, and Linux responds with a RST. This looks good so far. However, FreeBSD should accept the RST and kill the TCP connection. The next SYN from the Linux side would then establish a new TCP connection.

So I'm wondering why the RST is not accepted. I made the timestamp checking stricter, but introduced a bug where RST segments without timestamps were ignored. This was fixed.

Introduced in main on 2020/11/09: https://svnweb.freebsd.org/changeset/base/367530
Introduced in stable/12 on 2020/11/30: https://svnweb.freebsd.org/changeset/base/36818
Fix in main on 2021/01/13: https://cgit.FreeBSD.org/src/commit/?id=cc3c34859eab1b317d0f38731355b53f7d978c97
Fix in stable/12 on 2021/01/24: https://cgit.FreeBSD.org/src/commit/?id=d05d908d6d3c85479c84c707f931148439ae826b

Are you using a version which is affected by this bug?

Best regards
Michael

> If I wait long enough before healing the partition, it will
> go to FIN_WAIT_1, and then if I plug it back in, it does not
> do battle (at least not for long).
>
> Btw, I have one running now that seems stuck really good.
> It has been 20 minutes since I plugged the net cable back in.
> (Unfortunately, I didn't have tcpdump running until after
> I saw it was not progressing after healing.)
> --> There is one difference. There was a 6-minute timeout
> enabled on the server krpc for "no activity", which is
> now disabled like it is for NFSv4.1 in freebsd-current.
> I had forgotten to re-disable it.
> So, when it does battle, it might have been the 6-minute
> timeout, which would then do the soshutdown(..SHUT_WR),
> which kept it from getting "stuck" forever.
> --> This time I had to reboot the FreeBSD NFS server to
> get the Linux client unstuck, so this one looked a lot
> like what has been reported.
> The pcap for this one, started after the network was plugged
> back in and I noticed it was stuck for quite a while, is here:
> fetch https://people.freebsd.org/~rmacklem/stuck.pcap
>
> In it, there is just a bunch of RSTs followed by SYNs sent
> from client->FreeBSD, and FreeBSD just keeps sending
> acks for the old segment back.
> --> It looks like FreeBSD did the "RST, ACK" after the
> krpc did a soshutdown(..SHUT_WR) on the socket,
> for the one you've been looking at.
> I'll test some more...
>
>> I would like to understand why the reestablishment of the connection
>> did not work...
> It is looking like it takes either a non-empty send-q or a
> soshutdown(..SHUT_WR) to get the FreeBSD socket
> out of ESTABLISHED, where it just ignores the RSTs and
> SYN packets.
>
> Thanks for looking at it, rick
>
> Best regards
> Michael
>>
>> Have fun with it, rick
>>
>> ________________________________________
>> From: tuexen@freebsd.org
>> Sent: Sunday, April 4, 2021 12:41 PM
>> To: Rick Macklem
>> Cc: Scheffenegger, Richard; Youssef GHORBAL; freebsd-net@freebsd.org
>> Subject: Re: NFS Mount Hangs
>>
>>> On 4. Apr 2021, at 17:27, Rick Macklem wrote:
>>>
>>> Well, I'm going to cheat and top post, since this is related info and
>>> not really part of the discussion...
>>>
>>> I've been testing network partitioning between a Linux client (5.2 kernel)
>>> and a FreeBSD-current NFS server. I have not gotten a solid hang, but
>>> I have had the Linux client doing "battle" with the FreeBSD server for
>>> several minutes after un-partitioning the connection.
>>>
>>> The battle basically consists of the Linux client sending an RST, followed
>>> by a SYN.
>>> The FreeBSD server ignores the RST and just replies with the same old ack.
>>> --> This varies from "just a SYN" that succeeds to 100+ cycles of the above
>>> over several minutes.
>>>
>>> I had thought that an RST was a "pretty heavy hammer", but FreeBSD seems
>>> pretty good at ignoring it.
>>>
>>> A full packet capture of one of these is in /home/rmacklem/linuxtofreenfs.pcap
>>> in case anyone wants to look at it.
>> On freefall? I would like to take a look at it...
>>
>> Best regards
>> Michael
>>>
>>> Here's a tcpdump snippet of the interesting part (see the *** comments):
>>> 19:10:09.305775 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 202585:202749, ack 212293, win 29128, options [nop,nop,TS val 2073636037 ecr 2671204825], length 164: NFS reply xid 613153685 reply ok 160 getattr NON 4 ids 0/33554432 sz 0
>>> 19:10:09.305850 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 202749, win 501, options [nop,nop,TS val 2671204825 ecr 2073636037], length 0
>>> *** Network is now partitioned...
>>>
>>> 19:10:09.407840 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671204927 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>> 19:10:09.615779 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205135 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>> 19:10:09.823780 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205343 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>> *** Lots of lines snipped.
>>>
>>> 19:13:41.295783 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>> 19:13:42.319767 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>> 19:13:46.351966 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>> 19:13:47.375790 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>> 19:13:48.399786 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>> *** Network is now unpartitioned...
>>>
>>> 19:13:48.399990 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
>>> 19:13:48.400002 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671421871 ecr 0,nop,wscale 7], length 0
>>> 19:13:48.400185 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073855137 ecr 2671204825], length 0
>>> 19:13:48.400273 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
>>> 19:13:49.423833 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671424943 ecr 0,nop,wscale 7], length 0
>>> 19:13:49.424056 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073856161 ecr 2671204825], length 0
>>> *** This "battle" goes on for 223 sec...
>>> I snipped out 13 cycles of this "Linux sends an RST, followed by SYN" /
>>> "FreeBSD replies with same old ACK". In another test run I saw this
>>> cycle continue non-stop for several minutes. This time, the Linux
>>> client paused for a while (see ARPs below).
>>>
>>> 19:13:49.424101 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
>>> 19:13:53.455867 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671428975 ecr 0,nop,wscale 7], length 0
>>> 19:13:53.455991 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073860193 ecr 2671204825], length 0
>>> *** Snipped a bunch of stuff out, mostly ARPs, plus one more RST.
>>>
>>> 19:16:57.775780 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>> 19:16:57.775937 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
>>> 19:16:57.980240 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
>>> 19:16:58.555663 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
>>> 19:17:00.104701 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074046846 ecr 2671204825], length 0
>>> 19:17:15.664354 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074062406 ecr 2671204825], length 0
>>> 19:17:31.239246 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [R.], seq 202750, ack 212293, win 0, options [nop,nop,TS val 2074077981 ecr 2671204825], length 0
>>> *** FreeBSD finally acknowledges the RST 38sec after Linux sent the last
>>>     of 13 (100+ for another test run).
>>>
>>> 19:17:51.535979 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 4247692373, win 64240, options [mss 1460,sackOK,TS val 2671667055 ecr 0,nop,wscale 7], length 0
>>> 19:17:51.536130 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [S.], seq 661237469, ack 4247692374, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2074098278 ecr 2671667055], length 0
>>> *** Now back in business...
>>>
>>> 19:17:51.536218 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 1, win 502, options [nop,nop,TS val 2671667055 ecr 2074098278], length 0
>>> 19:17:51.536295 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 1:233, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>> 19:17:51.536346 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 233:505, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 272: NFS request xid 697039765 132 getattr fh 0,1/53
>>> 19:17:51.536515 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 0
>>> 19:17:51.536553 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 505:641, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098279], length 136: NFS request xid 730594197 132 getattr fh 0,1/53
>>> 19:17:51.536562 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 1:49, ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 48: NFS reply xid 697039765 reply ok 44 getattr ERROR: unk 10063
>>>
>>> This error 10063 after the partition heals is also "bad news". It indicates the Session
>>> (which is supposed to maintain "exactly once" RPC semantics) is broken. I'll admit I
>>> suspect a Linux client bug, but will be investigating further.
>>>
>>> So, hopefully TCP conversant folk can confirm whether the above is correct behaviour
>>> or the RST should be ack'd sooner?
>>>
>>> I could also see this becoming a "forever" TCP battle for other versions of Linux client.
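[For anyone dissecting a capture like this, the RST/SYN cycles are easy to
count with tcpdump read filters. A rough sketch only; the filter expressions
are stock tcpdump syntax, but the capture file name is illustrative:

#!/bin/sh
# Sketch: count the client's RSTs and SYNs in a local copy of the
# capture, then list the server's replies to see the repeated
# "same old ack". File name is a placeholder.
PCAP=linuxtofreenfs.pcap
tcpdump -nr "$PCAP" 'tcp[tcpflags] & tcp-rst != 0' | wc -l
tcpdump -nr "$PCAP" 'tcp[tcpflags] & tcp-syn != 0' | wc -l
tcpdump -nr "$PCAP" 'tcp src port 2049'
]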
>>>
>>> rick
>>>
>>>
>>> ________________________________________
>>> From: Scheffenegger, Richard
>>> Sent: Sunday, April 4, 2021 7:50 AM
>>> To: Rick Macklem; tuexen@freebsd.org
>>> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
>>> Subject: Re: NFS Mount Hangs
>>>
>>> For what it's worth, suse found two bugs in the linux nf_conntrack
>>> (stateful firewall) and the pfifo-fast scheduler, which could conspire
>>> to make tcp sessions hang forever.
>>>
>>> One is a missed update when the client is not using the noresvport
>>> mount option, which makes the firewall think RSTs are illegal (and
>>> drop them);
>>>
>>> The fast scheduler can run into an issue if only a single packet should
>>> be forwarded (note that this is not the default scheduler, but it is often
>>> recommended for perf, as it runs lockless and at lower cpu cost than
>>> pfq, the default). If no other/additional packet pushes out that last
>>> packet of a flow, it can become stuck forever...
>>>
>>> I can try getting the relevant bug info next week...
>>>
>>> ________________________________
>>> From: owner-freebsd-net@freebsd.org on behalf of Rick Macklem
>>> Sent: Friday, April 2, 2021 11:31:01 PM
>>> To: tuexen@freebsd.org
>>> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
>>> Subject: Re: NFS Mount Hangs
>>>
>>> tuexen@freebsd.org wrote:
>>>>> On 2. Apr 2021, at 02:07, Rick Macklem wrote:
>>>>>
>>>>> I hope you don't mind a top post...
>>>>> I've been testing network partitioning between the only Linux client
>>>>> I have (5.2 kernel) and a FreeBSD server with the xprtdied.patch
>>>>> (does soshutdown(..SHUT_WR) when it knows the socket is broken)
>>>>> applied to it.
>>>>>
>>>>> I'm not enough of a TCP guy to know if this is useful, but here's what
>>>>> I see...
>>>>>
>>>>> While partitioned:
>>>>> On the FreeBSD server end, the socket either goes to CLOSED during
>>>>> the network partition or stays ESTABLISHED.
>>>> If it goes to CLOSED you called shutdown(, SHUT_WR) and the peer also
>>>> sent a FIN, but you never called close() on the socket.
>>>> If the socket stays in ESTABLISHED, there is no communication ongoing,
>>>> I guess, and therefore the server does not even detect that the peer
>>>> is not reachable.
>>>>> On the Linux end, the socket seems to remain ESTABLISHED for a
>>>>> little while, and then disappears.
>>>> So how does Linux detect the peer is not reachable?
>>> Well, here's what I see in a packet capture in the Linux client once
>>> I partition it (just unplug the net cable):
>>> - lots of retransmits of the same segment (with ACK) for 54sec
>>> - then only ARP queries
>>>
>>> Once I plug the net cable back in:
>>> - ARP works
>>> - one more retransmit of the same segment
>>> - receives RST from FreeBSD
>>>   ** So, is this now a "new" TCP connection, despite
>>>      using the same port#?
>>>      --> It matters for NFS, since "new connection"
>>>          implies "must retry all outstanding RPCs".
>>> - sends SYN
>>> - receives SYN, ACK from FreeBSD
>>>   --> connection starts working again
>>> Always uses same port#.
>>>
>>> On the FreeBSD server end:
>>> - receives the last retransmit of the segment (with ACK)
>>> - sends RST
>>> - receives SYN
>>> - sends SYN, ACK
>>>
>>> I thought that there was no RST in the capture I looked at
>>> yesterday, so I'm not sure if FreeBSD always sends an RST,
>>> but the Linux client behaviour was the same. (Sent a SYN, etc.)
>>> The socket disappears from the Linux "netstat -a" and I
>>> suspect that happens after about 54sec, but I am not sure
>>> about the timing.
>>>
>>>>>
>>>>> After unpartitioning:
>>>>> On the FreeBSD server end, you get another socket showing up at
>>>>> the same port#
>>>>> Active Internet connections (including servers)
>>>>> Proto Recv-Q Send-Q Local Address        Foreign Address      (state)
>>>>> tcp4       0      0 nfsv4-new3.nfsd      nfsv4-linux.678      ESTABLISHED
>>>>> tcp4       0      0 nfsv4-new3.nfsd      nfsv4-linux.678      CLOSED
>>>>>
>>>>> The Linux client shows the same connection ESTABLISHED.
>>> But disappears from "netstat -a" for a while during the partitioning.
>>>
>>>>> (The mount sometimes reports an error. I haven't looked at packet
>>>>> traces to see if it retries RPCs or why the errors occur.)
>>> I have now done so, as above.
>>>
>>>>> --> However I never get hangs.
>>>>> Sometimes it goes to SYN_SENT for a while and the FreeBSD server
>>>>> shows FIN_WAIT_1, but then both ends go to ESTABLISHED and the
>>>>> mount starts working again.
>>>>>
>>>>> The most obvious thing is that the Linux client always keeps using
>>>>> the same port#. (The FreeBSD client will use a different port# when
>>>>> it does a TCP reconnect after no response from the NFS server for
>>>>> a little while.)
>>>>>
>>>>> What do those TCP conversant think?
>>>> I guess you are never calling close() on the socket for which
>>>> the connection state is CLOSED.
>>> Ok, that makes sense. For this case the Linux client has not done a
>>> BindConnectionToSession to re-assign the back channel.
>>> I'll have to bug them about this. However, I'll bet they'll answer
>>> that I have to tell them the back channel needs re-assignment
>>> or something like that.
>>>
>>> I am pretty certain they are broken, in that the client needs to
>>> retry all outstanding RPCs.
>>>
>>> For others, here's the long-winded version of this that I just
>>> put on the phabricator review:
>>> In the server side kernel RPC, the socket (struct socket *) is in a
>>> structure called SVCXPRT (normally pointed to by "xprt").
>>> These structures are ref counted and the soclose() is done
>>> when the ref. cnt goes to zero. My understanding is that
>>> "struct socket *" is free'd by soclose() so this cannot be done
>>> before the xprt ref. cnt goes to zero.
>>>
>>> For NFSv4.1/4.2 there is something called a back channel,
>>> which means that an "xprt" is used for server->client RPCs,
>>> although the TCP connection is established by the client
>>> to the server.
>>> --> This back channel holds a ref cnt on "xprt" until the
>>>     client re-assigns it to a different TCP connection
>>>     via an operation called BindConnectionToSession,
>>>     and the Linux client is not doing this soon enough,
>>>     it appears.
>>>
>>> So, the soclose() is delayed, which is why I think the
>>> TCP connection gets stuck in CLOSE_WAIT and that is
>>> why I've added the soshutdown(..SHUT_WR) calls,
>>> which can happen before the client gets around to
>>> re-assigning the back channel.
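[On the server, the delayed soclose() described above shows up as an nfsd
connection sitting in CLOSE_WAIT together with a non-zero back channel
bind count. A rough way to check for both; a sketch only, assuming nfsd is
on the standard port and that the grep patterns match your output format:

#!/bin/sh
# Sketch: look for NFS server sockets stuck in CLOSE_WAIT (the
# delayed soclose() symptom) and check whether back channels
# have been assigned.
netstat -an -p tcp | grep '\.2049' | grep CLOSE_WAIT
nfsstat -E -s | grep -A 1 BindConnToSes
]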
>>>
>>> Thanks for your help with this Michael, rick
>>>
>>> Best regards
>>> Michael
>>>>
>>>> rick
>>>> ps: I can capture packets while doing this, if anyone has a use
>>>> for them.
>>>>
>>>> ________________________________________
>>>> From: owner-freebsd-net@freebsd.org on behalf of Youssef GHORBAL
>>>> Sent: Saturday, March 27, 2021 6:57 PM
>>>> To: Jason Breitman
>>>> Cc: Rick Macklem; freebsd-net@freebsd.org
>>>> Subject: Re: NFS Mount Hangs
>>>>
>>>> On 27 Mar 2021, at 13:20, Jason Breitman wrote:
>>>>
>>>> The issue happened again so we can say that disabling TSO and LRO on the NIC did not resolve this issue.
>>>> # ifconfig lagg0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso
>>>> # ifconfig lagg0
>>>> lagg0: flags=8943 metric 0 mtu 1500
>>>>     options=8100b8
>>>>
>>>> We can also say that the sysctl settings did not resolve this issue.
>>>>
>>>> # sysctl net.inet.tcp.fast_finwait2_recycle=1
>>>> net.inet.tcp.fast_finwait2_recycle: 0 -> 1
>>>>
>>>> # sysctl net.inet.tcp.finwait2_timeout=1000
>>>> net.inet.tcp.finwait2_timeout: 60000 -> 1000
>>>>
>>>> I don't think those will do anything in your case since the FIN_WAIT2 are on the client side and those sysctls are for BSD.
>>>> By the way, it seems that Linux recycles TCP sessions in FIN_WAIT2 automatically after 60 seconds (sysctl net.ipv4.tcp_fin_timeout):
>>>>
>>>> tcp_fin_timeout (integer; default: 60; since Linux 2.2)
>>>>     This specifies how many seconds to wait for a final FIN
>>>>     packet before the socket is forcibly closed. This is
>>>>     strictly a violation of the TCP specification, but
>>>>     required to prevent denial-of-service attacks. In Linux
>>>>     2.2, the default value was 180.
>>>>
>>>> So I don't get why it gets stuck in the FIN_WAIT2 state anyway.
>>>>
>>>> You really need to have a packet capture during the outage (client and server side) so you'll get the over-the-wire chat and can start speculating from there.
>>>> No need to capture the beginning of the outage for now. All you have to do is run a tcpdump for 10 minutes or so when you notice a client stuck.
>>>>
>>>> * I have not rebooted the NFS Server nor have I restarted nfsd, but I do not believe that is required, as these settings are at the TCP level and I would expect new sessions to use the updated settings.
>>>>
>>>> The issue occurred after 5 days following a reboot of the client machines.
>>>> I ran the capture information again to make use of the situation.
>>>>
>>>> #!/bin/sh
>>>>
>>>> while true
>>>> do
>>>>     /bin/date >> /tmp/nfs-hang.log
>>>>     /bin/ps axHl | grep nfsd | grep -v grep >> /tmp/nfs-hang.log
>>>>     /usr/bin/procstat -kk 2947 >> /tmp/nfs-hang.log
>>>>     /usr/bin/procstat -kk 2944 >> /tmp/nfs-hang.log
>>>>     /bin/sleep 60
>>>> done
>>>>
>>>> On the NFS Server
>>>> Active Internet connections (including servers)
>>>> Proto Recv-Q Send-Q Local Address          Foreign Address        (state)
>>>> tcp4       0      0 NFS.Server.IP.X.2049   NFS.Client.IP.X.48286  CLOSE_WAIT
>>>>
>>>> On the NFS Client
>>>> tcp        0      0 NFS.Client.IP.X:48286  NFS.Server.IP.X:2049   FIN_WAIT2
>>>>
>>>> You had also asked for the output below.
>>>>
>>>> # nfsstat -E -s
>>>> BackChannelCtBindConnToSes
>>>>            0            0
>>>>
>>>> # sysctl vfs.nfsd.request_space_throttle_count
>>>> vfs.nfsd.request_space_throttle_count: 0
>>>>
>>>> I see that you are testing a patch and I look forward to seeing the results.
>>>>
>>>> Jason Breitman
>>>>
>>>> On Mar 21, 2021, at 6:21 PM, Rick Macklem wrote:
>>>>
>>>> Youssef GHORBAL wrote:
>>>>> Hi Jason,
>>>>>
>>>>>> On 17 Mar 2021, at 18:17, Jason Breitman wrote:
>>>>>>
>>>>>> Please review the details below and let me know if there is a setting that I should apply to my FreeBSD NFS Server or if there is a bug fix that I can apply to resolve my issue.
>>>>>> I shared this information with the linux-nfs mailing list and they believe the issue is on the server side.
>>>>>>
>>>>>> Issue
>>>>>> NFSv4 mounts periodically hang on the NFS Client.
>>>>>>
>>>>>> During this time, it is possible to manually mount from another NFS Server on the NFS Client having issues.
>>>>>> Also, other NFS Clients are successfully mounting from the NFS Server in question.
>>>>>> Rebooting the NFS Client appears to be the only solution.
>>>>>
>>>>> I had experienced a similar weird situation with periodically stuck Linux NFS clients mounting Isilon NFS servers (Isilon is FreeBSD based but they seem to have their own nfsd).
>>>> Yes, my understanding is that Isilon uses a proprietary user space nfsd and
>>>> not the kernel based RPC and nfsd in FreeBSD.
>>>>
>>>>> We've had better luck and we did manage to get packet captures on both sides during the issue. The gist of it goes as follows:
>>>>>
>>>>> - Data flows correctly between SERVER and the CLIENT
>>>>> - At some point SERVER starts decreasing its TCP Receive Window until it reaches 0
>>>>> - The client (eager to send data) can only ack data sent by SERVER.
>>>>> - When SERVER was done sending data, the client starts sending TCP Window Probes hoping that the TCP Window opens again so it can flush its buffers.
>>>>> - SERVER responds with a TCP Zero Window to those probes.
>>>> Having the window size drop to zero is not necessarily incorrect.
>>>> If the server is overloaded (has a backlog of NFS requests), it can stop doing
>>>> soreceive() on the socket (so the socket rcv buffer can fill up and the TCP window
>>>> closes). This results in "backpressure" to stop the NFS client from flooding the
>>>> NFS server with requests.
>>>> --> However, once the backlog is handled, the nfsd should start to soreceive()
>>>>     again and this should cause the window to open back up.
>>>> --> Maybe this is broken in the socket/TCP code. I quickly got lost in
>>>>     tcp_output() when it decides what to do about the rcvwin.
>>>>
>>>>> - After 6 minutes (the NFS server default Idle timeout) SERVER gracefully closes the TCP connection, sending a FIN Packet (and still a TCP Window of 0)
>>>> This probably does not happen for Jason's case, since the 6minute timeout
>>>> is disabled when the TCP connection is assigned as a backchannel (most likely
>>>> the case for NFSv4.1).
>>>>
>>>>> - CLIENT ACKs that FIN.
>>>>> - SERVER goes into FIN_WAIT_2 state
>>>>> - CLIENT closes its half of the socket and goes into LAST_ACK state.
>>>>> - FIN is never sent by the client since there is still data in its SendQ and the receiver's TCP Window is still 0.
>>>>>   At this stage the client starts sending TCP Window Probes again
>>>>>   and again, hoping that the server opens its TCP Window so it can
>>>>>   flush its buffers and terminate its side of the socket.
>>>>> - SERVER keeps responding with a TCP Zero Window to those probes.
>>>>> => The last two steps go on and on for hours/days, freezing the NFS mount bound to that TCP session.
>>>>>
>>>>> If we had a situation where CLIENT was responsible for closing the TCP Window (and initiating the TCP FIN first) and the server wanted to send data, we'd end up in the same state as you, I think.
>>>>>
>>>>> We never found the root cause of why the SERVER decided to close the TCP Window and no longer accept data; the fix on the Isilon part was to recycle the FIN_WAIT_2 sockets more aggressively (net.inet.tcp.fast_finwait2_recycle=1 & net.inet.tcp.finwait2_timeout=5000). Once the socket is recycled, at the next occurrence of a CLIENT TCP Window probe, SERVER sends a RST, triggering the teardown of the session on the client side, a new TCP handshake, etc., and traffic flows again (NFS starts responding).
>>>>>
>>>>> To avoid rebooting the client (and before the aggressive FIN_WAIT_2 was implemented on the Isilon side) we added a check script on the client that detects LAST_ACK sockets on the client and, through an iptables rule, enforces a TCP RST. Something like: -A OUTPUT -p tcp -d $nfs_server_addr --sport $local_port -j REJECT --reject-with tcp-reset (the script removes this iptables rule as soon as the LAST_ACK disappears).
>>>>>
>>>>> The bottom line would be to have a packet capture during the outage (client and/or server side); it will show you at least the shape of the TCP exchange when NFS is stuck.
>>>> Interesting story and good work w.r.t. sleuthing, Youssef, thanks.
>>>>
>>>> I looked at Jason's log and it shows everything is ok w.r.t the nfsd threads.
>>>> (They're just waiting for RPC requests.)
>>>> However, I do now think I know why the soclose() does not happen.
>>>> When the TCP connection is assigned as a backchannel, that takes a reference
>>>> cnt on the structure. This refcnt won't be released until the connection is
>>>> replaced by a BindConnectionToSession operation from the client. But that won't
>>>> happen until the client creates a new TCP connection.
>>>> --> No refcnt release --> no refcnt of 0 --> no soclose().
>>>>
>>>> I've created the attached patch (completely different from the previous one)
>>>> that adds soshutdown(SHUT_WR) calls in the three places where the TCP
>>>> connection is going away. This seems to get it past CLOSE_WAIT without a
>>>> soclose().
>>>> --> I know you are not comfortable with patching your server, but I do think
>>>>     this change will get the socket shutdown to complete.
>>>>
>>>> There are a couple more things you can check on the server...
>>>> # nfsstat -E -s
>>>> --> Look for the count under "BindConnToSes".
>>>> --> If non-zero, backchannels have been assigned.
>>>> # sysctl -a | fgrep request_space_throttle_count
>>>> --> If non-zero, the server has been overloaded at some point.
>>>>
>>>> I think the attached patch might work around the problem.
>>>> The code that should open up the receive window needs to be checked.
>>>> I am also looking at enabling the 6minute timeout when a backchannel is
>>>> assigned.
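[The client-side LAST_ACK watchdog Youssef describes above could look
roughly like the sketch below. The iptables rule is the one from his mail;
the ss/awk plumbing around it and the sleep intervals are guesses, and
nfs_server_addr is a placeholder:

#!/bin/sh
# Sketch: when a socket to the NFS server sticks in LAST_ACK,
# inject a local RST via iptables, then remove the rule again.
nfs_server_addr=NFS.Server.IP.X    # placeholder
while true; do
        local_port=$(ss -tn state last-ack "dst $nfs_server_addr" |
            awk 'NR > 1 { sub(/.*:/, "", $3); print $3; exit }')
        if [ -n "$local_port" ]; then
                iptables -A OUTPUT -p tcp -d "$nfs_server_addr" \
                    --sport "$local_port" -j REJECT --reject-with tcp-reset
                sleep 5
                iptables -D OUTPUT -p tcp -d "$nfs_server_addr" \
                    --sport "$local_port" -j REJECT --reject-with tcp-reset
        fi
        sleep 60
done
]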
>>>>
>>>> rick
>>>>
>>>> Youssef
>>>>
>>>> _______________________________________________
>>>> freebsd-net@freebsd.org mailing list
>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@freebsd.org Mon Apr  5 19:32:57 2021
From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 240969] netinet6: Neighbour reachability detection broken when using multiple FIB
Date: Mon, 05 Apr 2021 19:32:56 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240969

Alexander V. Chernikov changed:

           What     |Removed          |Added
----------------------------------------------------------------------------
           Assignee |net@FreeBSD.org  |melifaro@FreeBSD.org
           Status   |Open             |In Progress

--- Comment #2 from Alexander V. Chernikov ---
Raised https://reviews.freebsd.org/D29592 to address the issue.

--
You are receiving this mail because:
You are the assignee for the bug.
From owner-freebsd-net@freebsd.org Mon Apr  5 23:24:08 2021
From: Rick Macklem
To: "tuexen@freebsd.org"
Cc: "Scheffenegger, Richard", Youssef GHORBAL, "freebsd-net@freebsd.org"
Subject: Re: NFS Mount Hangs
Date: Mon, 5 Apr 2021 23:24:04 +0000

tuexen@freebsd.org wrote:
[stuff snipped]
>OK. What is the FreeBSD version you are using?
main Dec. 23, 2020.

>It seems that the TCP connection on the FreeBSD is still alive,
>Linux has decided to start a new TCP connection using the old
>port numbers. So it sends a SYN. The response is a challenge ACK
>and Linux responds with a RST. This looks good so far. However,
>FreeBSD should accept the RST and kill the TCP connection. The
>next SYN from the Linux side would establish a new TCP connection.
>
>So I'm wondering why the RST is not accepted. I made the timestamp
>checking stricter but introduced a bug where RST segments without
>timestamps were ignored. This was fixed.
>
>Introduced in main on 2020/11/09:
>    https://svnweb.freebsd.org/changeset/base/367530
>Introduced in stable/12 on 2020/11/30:
>    https://svnweb.freebsd.org/changeset/base/36818
>Fix in main on 2021/01/13:
>    https://cgit.FreeBSD.org/src/commit/?id=cc3c34859eab1b317d0f38731355b53f7d978c97
>Fix in stable/12 on 2021/01/24:
>    https://cgit.FreeBSD.org/src/commit/?id=d05d908d6d3c85479c84c707f931148439ae826b
>
>Are you using a version which is affected by this bug?
I was. Now I've applied the patch.
Bad news: it did not fix the problem.
It still gets into an endless "ignore RST" state and stays established when
the Send-Q is empty.

If the Send-Q is non-empty when I partition, it recovers fine,
sometimes without even needing to see an RST.

rick
ps: If you think there might be other recent changes that matter,
just say the word and I'll upgrade to bits du jour.

Best regards
Michael
>
> If I wait long enough before healing the partition, it will
> go to FIN_WAIT_1, and then if I plug it back in, it does not
> do battle (at least not for long).
>
> Btw, I have one running now that seems stuck really good.
> It has been 20minutes since I plugged the net cable back in.
> (Unfortunately, I didn't have tcpdump running until after
> I saw it was not progressing after healing.)
> --> There is one difference.
> There was a 6minute timeout enabled on the server krpc for "no activity",
> which is now disabled like it is for NFSv4.1 in freebsd-current.
> I had forgotten to re-disable it.
>
> [remainder of the quoted thread snipped; it repeats the messages above]
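[For anyone wanting to verify whether their tree already carries the
RST/timestamp fix Michael references above, the commit id from his mail can
be tested by ancestry. A sketch only, assuming a git checkout of src:

#!/bin/sh
# Sketch: does this src tree contain the main-branch fix for the
# "RST segments without timestamps ignored" bug? (commit id is
# taken from Michael's mail above)
cd /usr/src || exit 1
if git merge-base --is-ancestor cc3c34859eab1b317d0f38731355b53f7d978c97 HEAD; then
        echo "RST timestamp fix present"
else
        echo "RST timestamp fix missing"
fi
]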
I would like to take a look at it...=0A= >>=0A= >> Best regards=0A= >> Michael=0A= >>>=0A= >>> Here's a tcpdump snippet of the interesting part (see the *** comments)= :=0A= >>> 19:10:09.305775 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ap= ex-mesh: Flags [P.], seq 202585:202749, ack 212293, win 29128, options [nop= ,nop,TS val 2073636037 ecr 2671204825], length 164: NFS reply xid 613153685= reply ok 160 getattr NON 4 ids 0/33554432 sz 0=0A= >>> 19:10:09.305850 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ri= ck.nfsd: Flags [.], ack 202749, win 501, options [nop,nop,TS val 2671204825= ecr 2073636037], length 0=0A= >>> *** Network is now partitioned...=0A= >>>=0A= >>> 19:10:09.407840 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ri= ck.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,n= op,TS val 2671204927 ecr 2073636037], length 232: NFS request xid 629930901= 228 getattr fh 0,1/53=0A= >>> 19:10:09.615779 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ri= ck.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,n= op,TS val 2671205135 ecr 2073636037], length 232: NFS request xid 629930901= 228 getattr fh 0,1/53=0A= >>> 19:10:09.823780 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ri= ck.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,n= op,TS val 2671205343 ecr 2073636037], length 232: NFS request xid 629930901= 228 getattr fh 0,1/53=0A= >>> *** Lots of lines snipped.=0A= >>>=0A= >>>=0A= >>> 19:13:41.295783 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-li= nux.home.rick, length 28=0A= >>> 19:13:42.319767 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-li= nux.home.rick, length 28=0A= >>> 19:13:46.351966 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-li= nux.home.rick, length 28=0A= >>> 19:13:47.375790 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-li= nux.home.rick, length 28=0A= >>> 19:13:48.399786 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-li= nux.home.rick, length 28=0A= >>> *** Network is now unpartitioned...=0A= >>>=0A= >>> 19:13:48.399990 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72= (oui Unknown), length 46=0A= >>> 19:13:48.400002 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ri= ck.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS v= al 2671421871 ecr 0,nop,wscale 7], length 0=0A= >>> 19:13:48.400185 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ap= ex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 20738551= 37 ecr 2671204825], length 0=0A= >>> 19:13:48.400273 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ri= ck.nfsd: Flags [R], seq 964161458, win 0, length 0=0A= >>> 19:13:49.423833 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ri= ck.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS v= al 2671424943 ecr 0,nop,wscale 7], length 0=0A= >>> 19:13:49.424056 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ap= ex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 20738561= 61 ecr 2671204825], length 0=0A= >>> *** This "battle" goes on for 223sec...=0A= >>> I snipped out 13 cycles of this "Linux sends an RST, followed by SYN"= =0A= >>> "FreeBSD replies with same old ACK". In another test run I saw this=0A= >>> cycle continue non-stop for several minutes. 
This time, the Linux=0A= >>> client paused for a while (see ARPs below).=0A= >>>=0A= >>> 19:13:49.424101 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ri= ck.nfsd: Flags [R], seq 964161458, win 0, length 0=0A= >>> 19:13:53.455867 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ri= ck.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS v= al 2671428975 ecr 0,nop,wscale 7], length 0=0A= >>> 19:13:53.455991 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ap= ex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 20738601= 93 ecr 2671204825], length 0=0A= >>> *** Snipped a bunch of stuff out, mostly ARPs, plus one more RST.=0A= >>>=0A= >>> 19:16:57.775780 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-li= nux.home.rick, length 28=0A= >>> 19:16:57.775937 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72= (oui Unknown), length 46=0A= >>> 19:16:57.980240 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.= 1.254, length 46=0A= >>> 19:16:58.555663 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.= 1.254, length 46=0A= >>> 19:17:00.104701 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ap= ex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS= val 2074046846 ecr 2671204825], length 0=0A= >>> 19:17:15.664354 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ap= ex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS= val 2074062406 ecr 2671204825], length 0=0A= >>> 19:17:31.239246 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ap= ex-mesh: Flags [R.], seq 202750, ack 212293, win 0, options [nop,nop,TS val= 2074077981 ecr 2671204825], length 0=0A= >>> *** FreeBSD finally acknowledges the RST 38sec after Linux sent the las= t=0A= >>> of 13 (100+ for another test run).=0A= >>>=0A= >>> 19:17:51.535979 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ri= ck.nfsd: Flags [S], seq 4247692373, win 64240, options [mss 1460,sackOK,TS = val 2671667055 ecr 0,nop,wscale 7], length 0=0A= >>> 19:17:51.536130 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ap= ex-mesh: Flags [S.], seq 661237469, ack 4247692374, win 65535, options [mss= 1460,nop,wscale 6,sackOK,TS val 2074098278 ecr 2671667055], length 0=0A= >>> *** Now back in business...=0A= >>>=0A= >>> 19:17:51.536218 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ri= ck.nfsd: Flags [.], ack 1, win 502, options [nop,nop,TS val 2671667055 ecr = 2074098278], length 0=0A= >>> 19:17:51.536295 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ri= ck.nfsd: Flags [P.], seq 1:233, ack 1, win 502, options [nop,nop,TS val 267= 1667056 ecr 2074098278], length 232: NFS request xid 629930901 228 getattr = fh 0,1/53=0A= >>> 19:17:51.536346 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ri= ck.nfsd: Flags [P.], seq 233:505, ack 1, win 502, options [nop,nop,TS val 2= 671667056 ecr 2074098278], length 272: NFS request xid 697039765 132 getatt= r fh 0,1/53=0A= >>> 19:17:51.536515 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ap= ex-mesh: Flags [.], ack 505, win 29128, options [nop,nop,TS val 2074098279 = ecr 2671667056], length 0=0A= >>> 19:17:51.536553 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.ri= ck.nfsd: Flags [P.], seq 505:641, ack 1, win 502, options [nop,nop,TS val 2= 671667056 ecr 2074098279], length 136: NFS request xid 730594197 132 getatt= r fh 0,1/53=0A= >>> 19:17:51.536562 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.ap= ex-mesh: Flags [P.], seq 1:49, ack 505, win 29128, options [nop,nop,TS val = 
2074098279 ecr 2671667056], length 48: NFS reply xid 697039765 reply ok 44 = getattr ERROR: unk 10063=0A= >>>=0A= >>> This error 10063 after the partition heals is also "bad news". It indic= ates the Session=0A= >>> (which is supposed to maintain "exactly once" RPC semantics is broken).= I'll admit I=0A= >>> suspect a Linux client bug, but will be investigating further.=0A= >>>=0A= >>> So, hopefully TCP conversant folk can confirm if the above is correct b= ehaviour=0A= >>> or if the RST should be ack'd sooner?=0A= >>>=0A= >>> I could also see this becoming a "forever" TCP battle for other version= s of Linux client.=0A= >>>=0A= >>> rick=0A= >>>=0A= >>>=0A= >>> ________________________________________=0A= >>> From: Scheffenegger, Richard =0A= >>> Sent: Sunday, April 4, 2021 7:50 AM=0A= >>> To: Rick Macklem; tuexen@freebsd.org=0A= >>> Cc: Youssef GHORBAL; freebsd-net@freebsd.org=0A= >>> Subject: Re: NFS Mount Hangs=0A= >>>=0A= >>> CAUTION: This email originated from outside of the University of Guelph= . Do not click links or open attachments unless you recognize the sender an= d know the content is safe. If in doubt, forward suspicious emails to IThel= p@uoguelph.ca=0A= >>>=0A= >>>=0A= >>> For what it=91s worth, suse found two bugs in the linux nfconntrack (st= ateful firewall), and pfifo-fast scheduler, which could conspire to make tc= p sessions hang forever.=0A= >>>=0A= >>> One is a missed updaten when the c=F6ient is not using the noresvport m= oint option, which makes tje firewall think rsts are illegal (and drop them= );=0A= >>>=0A= >>> The fast scheduler can run into an issue if only a single packet should= be forwarded (note that this is not the default scheduler, but often recom= mended for perf, as it runs lockless and lower cpu cost that pfq (default).= If no other/additional packet pushes out that last packet of a flow, it ca= n become stuck forever...=0A= >>>=0A= >>> I can try getting the relevant bug info next week...=0A= >>>=0A= >>> ________________________________=0A= >>> Von: owner-freebsd-net@freebsd.org im A= uftrag von Rick Macklem =0A= >>> Gesendet: Friday, April 2, 2021 11:31:01 PM=0A= >>> An: tuexen@freebsd.org =0A= >>> Cc: Youssef GHORBAL ; freebsd-net@freebsd.o= rg =0A= >>> Betreff: Re: NFS Mount Hangs=0A= >>>=0A= >>> NetApp Security WARNING: This is an external email. Do not click links = or open attachments unless you recognize the sender and know the content is= safe.=0A= >>>=0A= >>>=0A= >>>=0A= >>>=0A= >>> tuexen@freebsd.org wrote:=0A= >>>>> On 2. 
Apr 2021, at 02:07, Rick Macklem wrote:= =0A= >>>>>=0A= >>>>> I hope you don't mind a top post...=0A= >>>>> I've been testing network partitioning between the only Linux client= =0A= >>>>> I have (5.2 kernel) and a FreeBSD server with the xprtdied.patch=0A= >>>>> (does soshutdown(..SHUT_WR) when it knows the socket is broken)=0A= >>>>> applied to it.=0A= >>>>>=0A= >>>>> I'm not enough of a TCP guy to know if this is useful, but here's wha= t=0A= >>>>> I see...=0A= >>>>>=0A= >>>>> While partitioned:=0A= >>>>> On the FreeBSD server end, the socket either goes to CLOSED during=0A= >>>>> the network partition or stays ESTABLISHED.=0A= >>>> If it goes to CLOSED you called shutdown(, SHUT_WR) and the peer also= =0A= >>>> sent a FIN, but you never called close() on the socket.=0A= >>>> If the socket stays in ESTABLISHED, there is no communication ongoing,= =0A= >>>> I guess, and therefore the server does not even detect that the peer= =0A= >>>> is not reachable.=0A= >>>>> On the Linux end, the socket seems to remain ESTABLISHED for a=0A= >>>>> little while, and then disappears.=0A= >>>> So how does Linux detect the peer is not reachable?=0A= >>> Well, here's what I see in a packet capture in the Linux client once=0A= >>> I partition it (just unplug the net cable):=0A= >>> - lots of retransmits of the same segment (with ACK) for 54sec=0A= >>> - then only ARP queries=0A= >>>=0A= >>> Once I plug the net cable back in:=0A= >>> - ARP works=0A= >>> - one more retransmit of the same segement=0A= >>> - receives RST from FreeBSD=0A= >>> ** So, is this now a "new" TCP connection, despite=0A= >>> using the same port#.=0A= >>> --> It matters for NFS, since "new connection"=0A= >>> implies "must retry all outstanding RPCs".=0A= >>> - sends SYN=0A= >>> - receives SYN, ACK from FreeBSD=0A= >>> --> connection starts working again=0A= >>> Always uses same port#.=0A= >>>=0A= >>> On the FreeBSD server end:=0A= >>> - receives the last retransmit of the segment (with ACK)=0A= >>> - sends RST=0A= >>> - receives SYN=0A= >>> - sends SYN, ACK=0A= >>>=0A= >>> I thought that there was no RST in the capture I looked at=0A= >>> yesterday, so I'm not sure if FreeBSD always sends an RST,=0A= >>> but the Linux client behaviour was the same. (Sent a SYN, etc).=0A= >>> The socket disappears from the Linux "netstat -a" and I=0A= >>> suspect that happens after about 54sec, but I am not sure=0A= >>> about the timing.=0A= >>>=0A= >>>>>=0A= >>>>> After unpartitioning:=0A= >>>>> On the FreeBSD server end, you get another socket showing up at=0A= >>>>> the same port#=0A= >>>>> Active Internet connections (including servers)=0A= >>>>> Proto Recv-Q Send-Q Local Address Foreign Address (st= ate)=0A= >>>>> tcp4 0 0 nfsv4-new3.nfsd nfsv4-linux.678 EST= ABLISHED=0A= >>>>> tcp4 0 0 nfsv4-new3.nfsd nfsv4-linux.678 CLO= SED=0A= >>>>>=0A= >>>>> The Linux client shows the same connection ESTABLISHED.=0A= >>> But disappears from "netstat -a" for a while during the partitioning.= =0A= >>>=0A= >>>>> (The mount sometimes reports an error. I haven't looked at packet=0A= >>>>> traces to see if it retries RPCs or why the errors occur.)=0A= >>> I have now done so, as above.=0A= >>>=0A= >>>>> --> However I never get hangs.=0A= >>>>> Sometimes it goes to SYN_SENT for a while and the FreeBSD server=0A= >>>>> shows FIN_WAIT_1, but then both ends go to ESTABLISHED and the=0A= >>>>> mount starts working again.=0A= >>>>>=0A= >>>>> The most obvious thing is that the Linux client always keeps using=0A= >>>>> the same port#. 
>>>>>
>>>>> What do those TCP conversant folk think?
>>>> I guess you are never calling close() on the socket for which
>>>> the connection state is CLOSED.
>>> Ok, that makes sense. For this case the Linux client has not done a
>>> BindConnectionToSession to re-assign the back channel.
>>> I'll have to bug them about this. However, I'll bet they'll answer
>>> that I have to tell them the back channel needs re-assignment
>>> or something like that.
>>>
>>> I am pretty certain they are broken, in that the client needs to
>>> retry all outstanding RPCs.
>>>
>>> For others, here's the long-winded version of this that I just
>>> put on the phabricator review:
>>> In the server side kernel RPC, the socket (struct socket *) is in a
>>> structure called SVCXPRT (normally pointed to by "xprt").
>>> These structures are ref counted and the soclose() is done
>>> when the ref. cnt goes to zero. My understanding is that
>>> "struct socket *" is free'd by soclose() so this cannot be done
>>> before the xprt ref. cnt goes to zero.
>>>
>>> For NFSv4.1/4.2 there is something called a back channel,
>>> which means that an "xprt" is used for server->client RPCs,
>>> although the TCP connection is established by the client
>>> to the server.
>>> --> This back channel holds a ref cnt on "xprt" until the
>>>     client re-assigns it to a different TCP connection
>>>     via an operation called BindConnectionToSession,
>>>     and the Linux client is not doing this soon enough,
>>>     it appears.
>>>
>>> So, the soclose() is delayed, which is why I think the
>>> TCP connection gets stuck in CLOSE_WAIT and that is
>>> why I've added the soshutdown(..SHUT_WR) calls,
>>> which can happen before the client gets around to
>>> re-assigning the back channel.
>>>
>>> Thanks for your help with this Michael, rick
>>>
>>> Best regards
>>> Michael
>>>>
>>>> rick
>>>> ps: I can capture packets while doing this, if anyone has a use
>>>> for them.
>>>>
>>>> ________________________________________
>>>> From: owner-freebsd-net@freebsd.org on behalf of Youssef GHORBAL
>>>> Sent: Saturday, March 27, 2021 6:57 PM
>>>> To: Jason Breitman
>>>> Cc: Rick Macklem; freebsd-net@freebsd.org
>>>> Subject: Re: NFS Mount Hangs
If in doubt, forward suspicious emails to IThe= lp@uoguelph.ca=0A= >>>>=0A= >>>>=0A= >>>>=0A= >>>>=0A= >>>> On 27 Mar 2021, at 13:20, Jason Breitman > wrote:=0A= >>>>=0A= >>>> The issue happened again so we can say that disabling TSO and LRO on t= he NIC did not resolve this issue.=0A= >>>> # ifconfig lagg0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwts= o=0A= >>>> # ifconfig lagg0=0A= >>>> lagg0: flags=3D8943 me= tric 0 mtu 1500=0A= >>>> options=3D8100b8=0A= >>>>=0A= >>>> We can also say that the sysctl settings did not resolve this issue.= =0A= >>>>=0A= >>>> # sysctl net.inet.tcp.fast_finwait2_recycle=3D1=0A= >>>> net.inet.tcp.fast_finwait2_recycle: 0 -> 1=0A= >>>>=0A= >>>> # sysctl net.inet.tcp.finwait2_timeout=3D1000=0A= >>>> net.inet.tcp.finwait2_timeout: 60000 -> 1000=0A= >>>>=0A= >>>> I don=92t think those will do anything in your case since the FIN_WAIT= 2 are on the client side and those sysctls are for BSD.=0A= >>>> By the way it seems that Linux recycles automatically TCP sessions in = FIN_WAIT2 after 60 seconds (sysctl net.ipv4.tcp_fin_timeout)=0A= >>>>=0A= >>>> tcp_fin_timeout (integer; default: 60; since Linux 2.2)=0A= >>>> This specifies how many seconds to wait for a final FIN=0A= >>>> packet before the socket is forcibly closed. This is=0A= >>>> strictly a violation of the TCP specification, but=0A= >>>> required to prevent denial-of-service attacks. In Linux=0A= >>>> 2.2, the default value was 180.=0A= >>>>=0A= >>>> So I don=92t get why it stucks in the FIN_WAIT2 state anyway.=0A= >>>>=0A= >>>> You really need to have a packet capture during the outage (client and= server side) so you=92ll get over the wire chat and start speculating from= there.=0A= >>>> No need to capture the beginning of the outage for now. All you have t= o do, is run a tcpdump for 10 minutes or so when you notice a client stuck.= =0A= >>>>=0A= >>>> * I have not rebooted the NFS Server nor have I restarted nfsd, but do= not believe that is required as these settings are at the TCP level and I = would expect new sessions to use the updated settings.=0A= >>>>=0A= >>>> The issue occurred after 5 days following a reboot of the client machi= nes.=0A= >>>> I ran the capture information again to make use of the situation.=0A= >>>>=0A= >>>> #!/bin/sh=0A= >>>>=0A= >>>> while true=0A= >>>> do=0A= >>>> /bin/date >> /tmp/nfs-hang.log=0A= >>>> /bin/ps axHl | grep nfsd | grep -v grep >> /tmp/nfs-hang.log=0A= >>>> /usr/bin/procstat -kk 2947 >> /tmp/nfs-hang.log=0A= >>>> /usr/bin/procstat -kk 2944 >> /tmp/nfs-hang.log=0A= >>>> /bin/sleep 60=0A= >>>> done=0A= >>>>=0A= >>>>=0A= >>>> On the NFS Server=0A= >>>> Active Internet connections (including servers)=0A= >>>> Proto Recv-Q Send-Q Local Address Foreign Address (sta= te)=0A= >>>> tcp4 0 0 NFS.Server.IP.X.2049 NFS.Client.IP.X.48286 = CLOSE_WAIT=0A= >>>>=0A= >>>> On the NFS Client=0A= >>>> tcp 0 0 NFS.Client.IP.X:48286 NFS.Server.IP.X:2049 = FIN_WAIT2=0A= >>>>=0A= >>>>=0A= >>>>=0A= >>>> You had also asked for the output below.=0A= >>>>=0A= >>>> # nfsstat -E -s=0A= >>>> BackChannelCtBindConnToSes=0A= >>>> 0 0=0A= >>>>=0A= >>>> # sysctl vfs.nfsd.request_space_throttle_count=0A= >>>> vfs.nfsd.request_space_throttle_count: 0=0A= >>>>=0A= >>>> I see that you are testing a patch and I look forward to seeing the re= sults.=0A= >>>>=0A= >>>>=0A= >>>> Jason Breitman=0A= >>>>=0A= >>>>=0A= >>>> On Mar 21, 2021, at 6:21 PM, Rick Macklem > wrote:=0A= >>>>=0A= >>>> Youssef GHORBAL > wrote:=0A= >>>>> Hi Jason,=0A= >>>>>=0A= >>>>>> On 17 Mar 2021, at 18:17, Jason Breitman > 
>>>>
>>>> * I have not rebooted the NFS Server nor have I restarted nfsd, but do
>>>> not believe that is required as these settings are at the TCP level and I
>>>> would expect new sessions to use the updated settings.
>>>>
>>>> The issue occurred after 5 days following a reboot of the client machines.
>>>> I ran the capture information again to make use of the situation.
>>>>
>>>> #!/bin/sh
>>>>
>>>> # Log the nfsd process state and kernel stacks once a minute.
>>>> while true
>>>> do
>>>> /bin/date >> /tmp/nfs-hang.log
>>>> /bin/ps axHl | grep nfsd | grep -v grep >> /tmp/nfs-hang.log
>>>> /usr/bin/procstat -kk 2947 >> /tmp/nfs-hang.log
>>>> /usr/bin/procstat -kk 2944 >> /tmp/nfs-hang.log
>>>> /bin/sleep 60
>>>> done
>>>>
>>>> On the NFS Server
>>>> Active Internet connections (including servers)
>>>> Proto Recv-Q Send-Q Local Address          Foreign Address        (state)
>>>> tcp4       0      0 NFS.Server.IP.X.2049   NFS.Client.IP.X.48286  CLOSE_WAIT
>>>>
>>>> On the NFS Client
>>>> tcp        0      0 NFS.Client.IP.X:48286  NFS.Server.IP.X:2049   FIN_WAIT2
>>>>
>>>> You had also asked for the output below.
>>>>
>>>> # nfsstat -E -s
>>>> BackChannelCt  BindConnToSes
>>>>             0              0
>>>>
>>>> # sysctl vfs.nfsd.request_space_throttle_count
>>>> vfs.nfsd.request_space_throttle_count: 0
>>>>
>>>> I see that you are testing a patch and I look forward to seeing the results.
>>>>
>>>> Jason Breitman
>>>>
>>>> On Mar 21, 2021, at 6:21 PM, Rick Macklem wrote:
>>>>
>>>> Youssef GHORBAL wrote:
>>>>> Hi Jason,
>>>>>
>>>>>> On 17 Mar 2021, at 18:17, Jason Breitman wrote:
>>>>>>
>>>>>> Please review the details below and let me know if there is a setting
>>>>>> that I should apply to my FreeBSD NFS Server or if there is a bug fix
>>>>>> that I can apply to resolve my issue.
>>>>>> I shared this information with the linux-nfs mailing list and they
>>>>>> believe the issue is on the server side.
>>>>>>
>>>>>> Issue
>>>>>> NFSv4 mounts periodically hang on the NFS Client.
>>>>>>
>>>>>> During this time, it is possible to manually mount from another NFS
>>>>>> Server on the NFS Client having issues.
>>>>>> Also, other NFS Clients are successfully mounting from the NFS Server
>>>>>> in question.
>>>>>> Rebooting the NFS Client appears to be the only solution.
>>>>>
>>>>> I had experienced a similar weird situation with periodically stuck
>>>>> Linux NFS clients mounting Isilon NFS servers (Isilon is FreeBSD based
>>>>> but they seem to have their own nfsd).
>>>> Yes, my understanding is that Isilon uses a proprietary user space nfsd
>>>> and not the kernel based RPC and nfsd in FreeBSD.
>>>>
>>>>> We've had better luck and we did manage to have packet captures on both
>>>>> sides during the issue. The gist of it goes as follows:
>>>>>
>>>>> - Data flows correctly between SERVER and the CLIENT
>>>>> - At some point SERVER starts decreasing its TCP Receive Window until it
>>>>>   reaches 0
>>>>> - The client (eager to send data) can only ack data sent by SERVER.
>>>>> - When SERVER was done sending data, the client starts sending TCP
>>>>>   Window Probes hoping that the TCP Window opens again so it can flush
>>>>>   its buffers.
>>>>> - SERVER responds with a TCP Zero Window to those probes.
>>>> Having the window size drop to zero is not necessarily incorrect.
>>>> If the server is overloaded (has a backlog of NFS requests), it can stop
>>>> doing soreceive() on the socket (so the socket rcv buffer can fill up and
>>>> the TCP window closes). This results in "backpressure" to stop the NFS
>>>> client from flooding the NFS server with requests.
>>>> --> However, once the backlog is handled, the nfsd should start to
>>>>     soreceive() again and this should cause the window to open back up.
>>>> --> Maybe this is broken in the socket/TCP code. I quickly got lost in
>>>>     tcp_output() when it decides what to do about the rcvwin.
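[One way to watch for the backpressure Rick describes, as a sketch: a Recv-Q
that keeps growing on the server's nfsd socket suggests soreceive() has
stopped draining it. The grep pattern assumes the standard NFS port and the
sampling interval is arbitrary:

    # Sample the nfsd socket once every 10 seconds and watch Recv-Q.
    while :; do
        date
        netstat -an | grep '\.2049 '
        sleep 10
    done
]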
>>>>
>>>>> - After 6 minutes (the NFS server default idle timeout) SERVER
>>>>>   gracefully closes the TCP connection, sending a FIN Packet (and still
>>>>>   a TCP Window of 0)
>>>> This probably does not happen for Jason's case, since the 6 minute
>>>> timeout is disabled when the TCP connection is assigned as a backchannel
>>>> (most likely the case for NFSv4.1).
>>>>
>>>>> - CLIENT ACKs that FIN.
>>>>> - SERVER goes into FIN_WAIT_2 state.
>>>>> - CLIENT closes its half of the socket and goes into LAST_ACK state.
>>>>> - FIN is never sent by the client since there is still data in its
>>>>>   SendQ and the receiver's TCP Window is still 0. At this stage the
>>>>>   client starts sending TCP Window Probes again and again, hoping that
>>>>>   the server opens its TCP Window so it can flush its buffers and
>>>>>   terminate its side of the socket.
>>>>> - SERVER keeps responding with a TCP Zero Window to those probes.
>>>>> => The last two steps go on and on for hours/days, freezing the NFS
>>>>>    mount bound to that TCP session.
>>>>>
>>>>> If we had a situation where CLIENT was responsible for closing the TCP
>>>>> Window (and initiating the TCP FIN first) and the server wanted to send
>>>>> data, we'd end up in the same state as you, I think.
>>>>>
>>>>> We've never had the root cause of why the SERVER decided to close the
>>>>> TCP Window and no longer accept data. The fix on the Isilon part was to
>>>>> recycle the FIN_WAIT_2 sockets more aggressively
>>>>> (net.inet.tcp.fast_finwait2_recycle=1 &
>>>>> net.inet.tcp.finwait2_timeout=5000). Once the socket is recycled, at the
>>>>> next occurrence of a CLIENT TCP Window probe, SERVER sends a RST,
>>>>> triggering the teardown of the session on the client side, a new TCP
>>>>> handshake, etc., and traffic flows again (NFS starts responding).
>>>>>
>>>>> To avoid rebooting the client (and before the aggressive FIN_WAIT_2
>>>>> recycling was implemented on the Isilon side) we've added a check script
>>>>> on the client that detects LAST_ACK sockets on the client and, through
>>>>> an iptables rule, enforces a TCP RST. Something like:
>>>>> -A OUTPUT -p tcp -d $nfs_server_addr --sport $local_port -j REJECT
>>>>> --reject-with tcp-reset (the script removes this iptables rule as soon
>>>>> as the LAST_ACK disappears).
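[Spelled out as a runnable fragment; the rule itself is the one Youssef
quotes, while the address and port values are placeholders taken from the
netstat output above:

    # Force a TCP RST for a client socket stuck in LAST_ACK.
    nfs_server_addr=NFS.Server.IP.X   # placeholder
    local_port=48286                  # placeholder, from netstat output
    iptables -A OUTPUT -p tcp -d "$nfs_server_addr" --sport "$local_port" \
        -j REJECT --reject-with tcp-reset
    # ...and remove the rule once the LAST_ACK socket is gone:
    iptables -D OUTPUT -p tcp -d "$nfs_server_addr" --sport "$local_port" \
        -j REJECT --reject-with tcp-reset
]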
This seems to get it past CLOSE_WAIT without= a=0A= >>>> soclose().=0A= >>>> --> I know you are not comfortable with patching your server, but I do= think=0A= >>>> this change will get the socket shutdown to complete.=0A= >>>>=0A= >>>> There are a couple more things you can check on the server...=0A= >>>> # nfsstat -E -s=0A= >>>> --> Look for the count under "BindConnToSes".=0A= >>>> --> If non-zero, backchannels have been assigned=0A= >>>> # sysctl -a | fgrep request_space_throttle_count=0A= >>>> --> If non-zero, the server has been overloaded at some point.=0A= >>>>=0A= >>>> I think the attached patch might work around the problem.=0A= >>>> The code that should open up the receive window needs to be checked.= =0A= >>>> I am also looking at enabling the 6minute timeout when a backchannel i= s=0A= >>>> assigned.=0A= >>>>=0A= >>>> rick=0A= >>>>=0A= >>>> Youssef=0A= >>>>=0A= >>>> _______________________________________________=0A= >>>> freebsd-net@freebsd.org mailing list= =0A= >>>> https://urldefense.com/v3/__https://lists.freebsd.org/mailman/listinfo= /freebsd-net__;!!JFdNOqOXpB6UZW0!_c2MFNbir59GXudWPVdE5bNBm-qqjXeBuJ2UEmFv5O= ZciLj4ObR_drJNv5yryaERfIbhKR2d$=0A= >>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org<= mailto:freebsd-net-unsubscribe@freebsd.org>"=0A= >>>> =0A= >>>>=0A= >>>> =0A= >>>>=0A= >>>> _______________________________________________=0A= >>>> freebsd-net@freebsd.org mailing list=0A= >>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= >>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"= =0A= >>>> _______________________________________________=0A= >>>> freebsd-net@freebsd.org mailing list=0A= >>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= >>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"= =0A= >>>=0A= >>> _______________________________________________=0A= >>> freebsd-net@freebsd.org mailing list=0A= >>> https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= >>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"= =0A= >>> _______________________________________________=0A= >>> freebsd-net@freebsd.org mailing list=0A= >>> https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= >>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"= =0A= >>=0A= >=0A= =0A= _______________________________________________=0A= freebsd-net@freebsd.org mailing list=0A= https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"=0A= From owner-freebsd-net@freebsd.org Tue Apr 6 06:21:09 2021 Return-Path: Delivered-To: freebsd-net@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 9EC425D4F00; Tue, 6 Apr 2021 06:21:09 +0000 (UTC) (envelope-from eugen@grosbein.net) Received: from hz.grosbein.net (hz.grosbein.net [IPv6:2a01:4f8:c2c:26d8::2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "hz.grosbein.net", Issuer "hz.grosbein.net" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 4FDy8P0x2Pz4tPp; Tue, 6 Apr 2021 06:21:08 +0000 (UTC) (envelope-from eugen@grosbein.net) Received: from eg.sd.rdtc.ru (eg.sd.rdtc.ru [IPv6:2a03:3100:c:13:0:0:0:5]) by hz.grosbein.net (8.15.2/8.15.2) with ESMTPS id 1366KtwU020663 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 6 Apr 2021 06:20:57 GMT (envelope-from eugen@grosbein.net) 
From owner-freebsd-net@freebsd.org Tue Apr 6 06:21:09 2021
From: Eugene Grosbein <eugen@grosbein.net>
To: Rozhuk Ivan
Cc: freebsd-current@freebsd.org, freebsd-net
Subject: Re: TCP Connection hang - MSS again
Date: Tue, 6 Apr 2021 13:20:47 +0700

05.04.2021 19:44, Rozhuk Ivan wrote:

>>> As I understand, in some cases the remote host does not reply with an MSS
>>> option, and the host behind the router continues to use mss 8960, which is
>>> dropped by the router.
>> If the peer does not provide an MSS option, your local FreeBSD based
>> host should use an MSS of net.inet.tcp.mssdflt bytes. The default is
>> 536. So I don't think this should be a problem.
>
> That's it!
> Thanks, it was ~64k in my config.

This is also a per-host setting, you know :-)

It is generally a bad idea to use an MTU over 1500 on an interface facing a
public network without -mtu 1500. You see, TCP MSS affects only TCP, and
there is also UDP, which happily produces oversized datagrams for DNS or RTP
or NFS, or for tunneling like L2TP or OpenVPN etc. that relies on IP
fragmentation.

I still recommend using -mtu 1500 in addition to mssdflt in your case.
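[A sketch of that combination on the FreeBSD side; the interface name igb0 is
a placeholder, and 536 is simply the default mentioned above:

    # Cap the default TCP MSS used when a peer sends no MSS option...
    sysctl net.inet.tcp.mssdflt=536
    # ...and keep the public-facing interface at a standard 1500-byte MTU
    # so UDP datagrams (DNS, RTP, NFS, tunnels) are not oversized either.
    ifconfig igb0 mtu 1500
]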
From owner-freebsd-net@freebsd.org Tue Apr 6 06:50:52 2021
From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 254478] Panic when using ipfw and divert sockets
Date: Tue, 06 Apr 2021 06:50:52 +0000
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254478

--- Comment #6 from commit-hook@FreeBSD.org ---
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=6b8c65318e81a451b33ed57b84a5495284dcb20f

commit 6b8c65318e81a451b33ed57b84a5495284dcb20f
Author:     Andrey V. Elsukov
AuthorDate: 2021-03-30 09:31:09 +0000
Commit:     Andrey V. Elsukov
CommitDate: 2021-04-06 06:47:54 +0000

    ipdivert: check that PCB is still valid after taking INPCB_RLOCK.

    We are inspecting PCBs of divert sockets under a NET_EPOCH section,
    but the PCB could already be detached, so we should check the
    INP_FREED flag once we have taken INP_RLOCK.

    PR: 254478
    Differential Revision: https://reviews.freebsd.org/D29420

    (cherry picked from commit c80a4b76ceacc5aab322e7ac1407eea8c90cb3b1)

 sys/netinet/ip_divert.c | 4 ++++
 1 file changed, 4 insertions(+)
From owner-freebsd-net@freebsd.org Tue Apr 6 06:51:53 2021
From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 254478] Panic when using ipfw and divert sockets
Date: Tue, 06 Apr 2021 06:51:53 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254478

--- Comment #7 from commit-hook@FreeBSD.org ---
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=38c299fe856216d6ab38eb5e04d9ee4f8c22995d

commit 38c299fe856216d6ab38eb5e04d9ee4f8c22995d
Author:     Andrey V. Elsukov
AuthorDate: 2021-03-30 09:31:09 +0000
Commit:     Andrey V. Elsukov
CommitDate: 2021-04-06 06:50:55 +0000

    ipdivert: check that PCB is still valid after taking INPCB_RLOCK.

    We are inspecting PCBs of divert sockets under a NET_EPOCH section,
    but the PCB could already be detached, so we should check the
    INP_FREED flag once we have taken INP_RLOCK.

    PR: 254478
    Differential Revision: https://reviews.freebsd.org/D29420

    (cherry picked from commit c80a4b76ceacc5aab322e7ac1407eea8c90cb3b1)

 sys/netinet/ip_divert.c | 4 ++++
 1 file changed, 4 insertions(+)
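[For anyone tracking whether their source tree already carries this fix, a
quick check along these lines should work; the tree path /usr/src is an
assumption:

    # List recent commits touching ip_divert.c on the stable/12 branch;
    # commit 38c299fe856 should appear once the fix has been merged.
    git -C /usr/src log --oneline stable/12 -- sys/netinet/ip_divert.c | head
]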
From owner-freebsd-net@freebsd.org Tue Apr 6 07:12:52 2021
From: tuexen@freebsd.org
To: Rick Macklem
Cc: "Scheffenegger, Richard", Youssef GHORBAL, freebsd-net@freebsd.org
Subject: Re: NFS Mount Hangs
Date: Tue, 6 Apr 2021 09:12:40 +0200

> On 6. Apr 2021, at 01:24, Rick Macklem wrote:
>
> tuexen@freebsd.org wrote:
> [stuff snipped]
>> OK. What is the FreeBSD version you are using?
> main Dec. 23, 2020.
>
>> It seems that the TCP connection on the FreeBSD side is still alive;
>> Linux has decided to start a new TCP connection using the old
>> port numbers. So it sends a SYN. The response is a challenge ACK
>> and Linux responds with a RST. This looks good so far. However,
>> FreeBSD should accept the RST and kill the TCP connection. The
>> next SYN from the Linux side would establish a new TCP connection.
>>
>> So I'm wondering why the RST is not accepted. I made the timestamp
>> checking stricter but introduced a bug where RST segments without
>> timestamps were ignored. This was fixed.
>>
>> Introduced in main on 2020/11/09:
>> https://svnweb.freebsd.org/changeset/base/367530
>> Introduced in stable/12 on 2020/11/30:
>> https://svnweb.freebsd.org/changeset/base/36818
>> Fix in main on 2021/01/13:
>> https://cgit.FreeBSD.org/src/commit/?id=cc3c34859eab1b317d0f38731355b53f7d978c97
>> Fix in stable/12 on 2021/01/24:
>> https://cgit.FreeBSD.org/src/commit/?id=d05d908d6d3c85479c84c707f931148439ae826b
>>
>> Are you using a version which is affected by this bug?
> I was. Now I've applied the patch.
> Bad news: it did not fix the problem.
> It still gets into an endless "ignore RST" and stays established when
> the Send-Q is empty.
OK. Let us focus on this case. Could you:
1. sudo sysctl net.inet.tcp.log_debug=1
2. repeat the situation where RSTs are ignored.
3. check if there is some output on the console (/var/log/messages).
4. Either provide the output or let me know that there is none.

Best regards
Michael
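[A compact version of those steps; /var/log/messages is the usual syslog
default, but whether the TCP debug output lands there depends on the local
syslog.conf, so treat this as a sketch:

    # Enable verbose TCP state logging, reproduce the ignored-RST case,
    # then look for the kernel's TCP debug lines and switch logging off.
    sudo sysctl net.inet.tcp.log_debug=1
    # ... partition and heal the connection here ...
    grep -i tcp /var/log/messages
    sudo sysctl net.inet.tcp.log_debug=0
]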
>
> If the Send-Q is non-empty when I partition, it recovers fine,
> sometimes not even needing to see an RST.
>
> rick
> ps: If you think there might be other recent changes that matter,
> just say the word and I'll upgrade to bits du jour.
>
> rick
>
> Best regards
> Michael
>>
>> If I wait long enough before healing the partition, it will
>> go to FIN_WAIT_1, and then if I plug it back in, it does not
>> do battle (at least not for long).
>>
>> Btw, I have one running now that seems stuck really good.
>> It has been 20 minutes since I plugged the net cable back in.
>> (Unfortunately, I didn't have tcpdump running until after
>> I saw it was not progressing after healing.)
>> --> There is one difference. There was a 6 minute timeout
>>     enabled on the server krpc for "no activity", which is
>>     now disabled like it is for NFSv4.1 in freebsd-current.
>>     I had forgotten to re-disable it.
>>     So, when it does battle, it might have been the 6 minute
>>     timeout, which would then do the soshutdown(..SHUT_WR)
>>     which kept it from getting "stuck" forever.
>> --> This time I had to reboot the FreeBSD NFS server to
>>     get the Linux client unstuck, so this one looked a lot
>>     like what has been reported.
>> The pcap for this one, started after the network was plugged
>> back in and I noticed it was stuck for quite a while, is here:
>> fetch https://people.freebsd.org/~rmacklem/stuck.pcap
>>
>> In it, there is just a bunch of RSTs followed by SYNs sent
>> from client->FreeBSD, and FreeBSD just keeps sending
>> acks for the old segment back.
>> --> It looks like FreeBSD did the "RST, ACK" after the
>>     krpc did a soshutdown(..SHUT_WR) on the socket,
>>     for the one you've been looking at.
>> I'll test some more...
>>
>>> I would like to understand why the reestablishment of the connection
>>> did not work...
>> It is looking like it takes either a non-empty send-q or a
>> soshutdown(..SHUT_WR) to get the FreeBSD socket
>> out of established, where it just ignores the RSTs and
>> SYN packets.
>>
>> Thanks for looking at it, rick
>>
>> Best regards
>> Michael
>>>
>>> Have fun with it, rick
>>>
>>> ________________________________________
>>> From: tuexen@freebsd.org
>>> Sent: Sunday, April 4, 2021 12:41 PM
>>> To: Rick Macklem
>>> Cc: Scheffenegger, Richard; Youssef GHORBAL; freebsd-net@freebsd.org
>>> Subject: Re: NFS Mount Hangs
>>>
>>>> On 4. Apr 2021, at 17:27, Rick Macklem wrote:
>>>>
>>>> Well, I'm going to cheat and top post, since this is related info and
>>>> not really part of the discussion...
>>>>
>>>> I've been testing network partitioning between a Linux client (5.2
>>>> kernel) and a FreeBSD-current NFS server. I have not gotten a solid
>>>> hang, but I have had the Linux client doing "battle" with the FreeBSD
>>>> server for several minutes after un-partitioning the connection.
>>>>
>>>> The battle basically consists of the Linux client sending an RST,
>>>> followed by a SYN.
>>>> The FreeBSD server ignores the RST and just replies with the same old ack.
>>>> --> This varies from "just a SYN" that succeeds to 100+ cycles of the
>>>>     above over several minutes.
>>>>
>>>> I had thought that an RST was a "pretty heavy hammer", but FreeBSD seems
>>>> pretty good at ignoring it.
>>>>
>>>> A full packet capture of one of these is in
>>>> /home/rmacklem/linuxtofreenfs.pcap in case anyone wants to look at it.
>>> On freefall? I would like to take a look at it...
>>>
>>> Best regards
>>> Michael
>>>>
>>>> Here's a tcpdump snippet of the interesting part (see the *** comments):
>>>> 19:10:09.305775 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 202585:202749, ack 212293, win 29128, options [nop,nop,TS val 2073636037 ecr 2671204825], length 164: NFS reply xid 613153685 reply ok 160 getattr NON 4 ids 0/33554432 sz 0
>>>> 19:10:09.305850 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 202749, win 501, options [nop,nop,TS val 2671204825 ecr 2073636037], length 0
>>>> *** Network is now partitioned...
>>>>
>>>> 19:10:09.407840 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671204927 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>> 19:10:09.615779 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205135 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>> 19:10:09.823780 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205343 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>> *** Lots of lines snipped.
>>>>
>>>> 19:13:41.295783 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:13:42.319767 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:13:46.351966 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:13:47.375790 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:13:48.399786 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> *** Network is now unpartitioned...
>>>>
>>>> 19:13:48.399990 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
>>>> 19:13:48.400002 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671421871 ecr 0,nop,wscale 7], length 0
>>>> 19:13:48.400185 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073855137 ecr 2671204825], length 0
>>>> 19:13:48.400273 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
>>>> 19:13:49.423833 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671424943 ecr 0,nop,wscale 7], length 0
>>>> 19:13:49.424056 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073856161 ecr 2671204825], length 0
>>>> *** This "battle" goes on for 223sec...
>>>>     I snipped out 13 cycles of this "Linux sends an RST, followed by SYN",
>>>>     "FreeBSD replies with same old ACK". In another test run I saw this
>>>>     cycle continue non-stop for several minutes. This time, the Linux
>>>>     client paused for a while (see ARPs below).
>>>>
>>>> 19:13:49.424101 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
>>>> 19:13:53.455867 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671428975 ecr 0,nop,wscale 7], length 0
>>>> 19:13:53.455991 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073860193 ecr 2671204825], length 0
>>>> *** Snipped a bunch of stuff out, mostly ARPs, plus one more RST.
>>>>
>>>> 19:16:57.775780 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:16:57.775937 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
>>>> 19:16:57.980240 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
>>>> 19:16:58.555663 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
>>>> 19:17:00.104701 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074046846 ecr 2671204825], length 0
>>>> 19:17:15.664354 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074062406 ecr 2671204825], length 0
>>>> 19:17:31.239246 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [R.], seq 202750, ack 212293, win 0, options [nop,nop,TS val 2074077981 ecr 2671204825], length 0
>>>> *** FreeBSD finally acknowledges the RST 38sec after Linux sent the last
>>>>     of 13 (100+ for another test run).
>>>>
>>>> 19:17:51.535979 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 4247692373, win 64240, options [mss 1460,sackOK,TS val 2671667055 ecr 0,nop,wscale 7], length 0
>>>> 19:17:51.536130 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [S.], seq 661237469, ack 4247692374, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2074098278 ecr 2671667055], length 0
>>>> *** Now back in business...
>>>>
>>>> 19:17:51.536218 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 1, win 502, options [nop,nop,TS val 2671667055 ecr 2074098278], length 0
>>>> 19:17:51.536295 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 1:233, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>> 19:17:51.536346 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 233:505, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 272: NFS request xid 697039765 132 getattr fh 0,1/53
>>>> 19:17:51.536515 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 0
>>>> 19:17:51.536553 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 505:641, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098279], length 136: NFS request xid 730594197 132 getattr fh 0,1/53
>>>> 19:17:51.536562 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 1:49, ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 48: NFS reply xid 697039765 reply ok 44 getattr ERROR: unk 10063
Do not click = links or open attachments unless you recognize the sender and know the = content is safe. >>>>=20 >>>>=20 >>>>=20 >>>>=20 >>>> tuexen@freebsd.org wrote: >>>>>> On 2. Apr 2021, at 02:07, Rick Macklem = wrote: >>>>>>=20 >>>>>> I hope you don't mind a top post... >>>>>> I've been testing network partitioning between the only Linux = client >>>>>> I have (5.2 kernel) and a FreeBSD server with the xprtdied.patch >>>>>> (does soshutdown(..SHUT_WR) when it knows the socket is broken) >>>>>> applied to it. >>>>>>=20 >>>>>> I'm not enough of a TCP guy to know if this is useful, but here's = what >>>>>> I see... >>>>>>=20 >>>>>> While partitioned: >>>>>> On the FreeBSD server end, the socket either goes to CLOSED = during >>>>>> the network partition or stays ESTABLISHED. >>>>> If it goes to CLOSED you called shutdown(, SHUT_WR) and the peer = also >>>>> sent a FIN, but you never called close() on the socket. >>>>> If the socket stays in ESTABLISHED, there is no communication = ongoing, >>>>> I guess, and therefore the server does not even detect that the = peer >>>>> is not reachable. >>>>>> On the Linux end, the socket seems to remain ESTABLISHED for a >>>>>> little while, and then disappears. >>>>> So how does Linux detect the peer is not reachable? >>>> Well, here's what I see in a packet capture in the Linux client = once >>>> I partition it (just unplug the net cable): >>>> - lots of retransmits of the same segment (with ACK) for 54sec >>>> - then only ARP queries >>>>=20 >>>> Once I plug the net cable back in: >>>> - ARP works >>>> - one more retransmit of the same segement >>>> - receives RST from FreeBSD >>>> ** So, is this now a "new" TCP connection, despite >>>> using the same port#. >>>> --> It matters for NFS, since "new connection" >>>> implies "must retry all outstanding RPCs". >>>> - sends SYN >>>> - receives SYN, ACK from FreeBSD >>>> --> connection starts working again >>>> Always uses same port#. >>>>=20 >>>> On the FreeBSD server end: >>>> - receives the last retransmit of the segment (with ACK) >>>> - sends RST >>>> - receives SYN >>>> - sends SYN, ACK >>>>=20 >>>> I thought that there was no RST in the capture I looked at >>>> yesterday, so I'm not sure if FreeBSD always sends an RST, >>>> but the Linux client behaviour was the same. (Sent a SYN, etc). >>>> The socket disappears from the Linux "netstat -a" and I >>>> suspect that happens after about 54sec, but I am not sure >>>> about the timing. >>>>=20 >>>>>>=20 >>>>>> After unpartitioning: >>>>>> On the FreeBSD server end, you get another socket showing up at >>>>>> the same port# >>>>>> Active Internet connections (including servers) >>>>>> Proto Recv-Q Send-Q Local Address Foreign Address = (state) >>>>>> tcp4 0 0 nfsv4-new3.nfsd nfsv4-linux.678 = ESTABLISHED >>>>>> tcp4 0 0 nfsv4-new3.nfsd nfsv4-linux.678 = CLOSED >>>>>>=20 >>>>>> The Linux client shows the same connection ESTABLISHED. >>>> But disappears from "netstat -a" for a while during the = partitioning. >>>>=20 >>>>>> (The mount sometimes reports an error. I haven't looked at packet >>>>>> traces to see if it retries RPCs or why the errors occur.) >>>> I have now done so, as above. >>>>=20 >>>>>> --> However I never get hangs. >>>>>> Sometimes it goes to SYN_SENT for a while and the FreeBSD server >>>>>> shows FIN_WAIT_1, but then both ends go to ESTABLISHED and the >>>>>> mount starts working again. >>>>>>=20 >>>>>> The most obvious thing is that the Linux client always keeps = using >>>>>> the same port#. 
(The FreeBSD client will use a different port# = when >>>>>> it does a TCP reconnect after no response from the NFS server for >>>>>> a little while.) >>>>>>=20 >>>>>> What do those TCP conversant think? >>>>> I guess you are you are never calling close() on the socket, for = with >>>>> the connection state is CLOSED. >>>> Ok, that makes sense. For this case the Linux client has not done a >>>> BindConnectionToSession to re-assign the back channel. >>>> I'll have to bug them about this. However, I'll bet they'll answer >>>> that I have to tell them the back channel needs re-assignment >>>> or something like that. >>>>=20 >>>> I am pretty certain they are broken, in that the client needs to >>>> retry all outstanding RPCs. >>>>=20 >>>> For others, here's the long winded version of this that I just >>>> put on the phabricator review: >>>> In the server side kernel RPC, the socket (struct socket *) is in a >>>> structure called SVCXPRT (normally pointed to by "xprt"). >>>> These structures a ref counted and the soclose() is done >>>> when the ref. cnt goes to zero. My understanding is that >>>> "struct socket *" is free'd by soclose() so this cannot be done >>>> before the xprt ref. cnt goes to zero. >>>>=20 >>>> For NFSv4.1/4.2 there is something called a back channel >>>> which means that a "xprt" is used for server->client RPCs, >>>> although the TCP connection is established by the client >>>> to the server. >>>> --> This back channel holds a ref cnt on "xprt" until the >>>>=20 >>>> client re-assigns it to a different TCP connection >>>> via an operation called BindConnectionToSession >>>> and the Linux client is not doing this soon enough, >>>> it appears. >>>>=20 >>>> So, the soclose() is delayed, which is why I think the >>>> TCP connection gets stuck in CLOSE_WAIT and that is >>>> why I've added the soshutdown(..SHUT_WR) calls, >>>> which can happen before the client gets around to >>>> re-assigning the back channel. >>>>=20 >>>> Thanks for your help with this Michael, rick >>>>=20 >>>> Best regards >>>> Michael >>>>>=20 >>>>> rick >>>>> ps: I can capture packets while doing this, if anyone has a use >>>>> for them. >>>>>=20 >>>>>=20 >>>>>=20 >>>>>=20 >>>>>=20 >>>>>=20 >>>>> ________________________________________ >>>>> From: owner-freebsd-net@freebsd.org = on behalf of Youssef GHORBAL = >>>>> Sent: Saturday, March 27, 2021 6:57 PM >>>>> To: Jason Breitman >>>>> Cc: Rick Macklem; freebsd-net@freebsd.org >>>>> Subject: Re: NFS Mount Hangs >>>>>=20 >>>>> CAUTION: This email originated from outside of the University of = Guelph. Do not click links or open attachments unless you recognize the = sender and know the content is safe. If in doubt, forward suspicious = emails to IThelp@uoguelph.ca >>>>>=20 >>>>>=20 >>>>>=20 >>>>>=20 >>>>> On 27 Mar 2021, at 13:20, Jason Breitman = > = wrote: >>>>>=20 >>>>> The issue happened again so we can say that disabling TSO and LRO = on the NIC did not resolve this issue. >>>>> # ifconfig lagg0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso = -vlanhwtso >>>>> # ifconfig lagg0 >>>>> lagg0: flags=3D8943 = metric 0 mtu 1500 >>>>> = options=3D8100b8 >>>>>=20 >>>>> We can also say that the sysctl settings did not resolve this = issue. 
>>>>>=20 >>>>> # sysctl net.inet.tcp.fast_finwait2_recycle=3D1 >>>>> net.inet.tcp.fast_finwait2_recycle: 0 -> 1 >>>>>=20 >>>>> # sysctl net.inet.tcp.finwait2_timeout=3D1000 >>>>> net.inet.tcp.finwait2_timeout: 60000 -> 1000 >>>>>=20 >>>>> I don=E2=80=99t think those will do anything in your case since = the FIN_WAIT2 are on the client side and those sysctls are for BSD. >>>>> By the way it seems that Linux recycles automatically TCP sessions = in FIN_WAIT2 after 60 seconds (sysctl net.ipv4.tcp_fin_timeout) >>>>>=20 >>>>> tcp_fin_timeout (integer; default: 60; since Linux 2.2) >>>>> This specifies how many seconds to wait for a final FIN >>>>> packet before the socket is forcibly closed. This is >>>>> strictly a violation of the TCP specification, but >>>>> required to prevent denial-of-service attacks. In Linux >>>>> 2.2, the default value was 180. >>>>>=20 >>>>> So I don=E2=80=99t get why it stucks in the FIN_WAIT2 state = anyway. >>>>>=20 >>>>> You really need to have a packet capture during the outage (client = and server side) so you=E2=80=99ll get over the wire chat and start = speculating from there. >>>>> No need to capture the beginning of the outage for now. All you = have to do, is run a tcpdump for 10 minutes or so when you notice a = client stuck. >>>>>=20 >>>>> * I have not rebooted the NFS Server nor have I restarted nfsd, = but do not believe that is required as these settings are at the TCP = level and I would expect new sessions to use the updated settings. >>>>>=20 >>>>> The issue occurred after 5 days following a reboot of the client = machines. >>>>> I ran the capture information again to make use of the situation. >>>>>=20 >>>>> #!/bin/sh >>>>>=20 >>>>> while true >>>>> do >>>>> /bin/date >> /tmp/nfs-hang.log >>>>> /bin/ps axHl | grep nfsd | grep -v grep >> /tmp/nfs-hang.log >>>>> /usr/bin/procstat -kk 2947 >> /tmp/nfs-hang.log >>>>> /usr/bin/procstat -kk 2944 >> /tmp/nfs-hang.log >>>>> /bin/sleep 60 >>>>> done >>>>>=20 >>>>>=20 >>>>> On the NFS Server >>>>> Active Internet connections (including servers) >>>>> Proto Recv-Q Send-Q Local Address Foreign Address = (state) >>>>> tcp4 0 0 NFS.Server.IP.X.2049 = NFS.Client.IP.X.48286 CLOSE_WAIT >>>>>=20 >>>>> On the NFS Client >>>>> tcp 0 0 NFS.Client.IP.X:48286 = NFS.Server.IP.X:2049 FIN_WAIT2 >>>>>=20 >>>>>=20 >>>>>=20 >>>>> You had also asked for the output below. >>>>>=20 >>>>> # nfsstat -E -s >>>>> BackChannelCtBindConnToSes >>>>> 0 0 >>>>>=20 >>>>> # sysctl vfs.nfsd.request_space_throttle_count >>>>> vfs.nfsd.request_space_throttle_count: 0 >>>>>=20 >>>>> I see that you are testing a patch and I look forward to seeing = the results. >>>>>=20 >>>>>=20 >>>>> Jason Breitman >>>>>=20 >>>>>=20 >>>>> On Mar 21, 2021, at 6:21 PM, Rick Macklem = > wrote: >>>>>=20 >>>>> Youssef GHORBAL = > wrote: >>>>>> Hi Jason, >>>>>>=20 >>>>>>> On 17 Mar 2021, at 18:17, Jason Breitman = > = wrote: >>>>>>>=20 >>>>>>> Please review the details below and let me know if there is a = setting that I should apply to my FreeBSD NFS Server or if there is a = bug fix that I can apply to resolve my issue. >>>>>>> I shared this information with the linux-nfs mailing list and = they believe the issue is on the server side. >>>>>>>=20 >>>>>>> Issue >>>>>>> NFSv4 mounts periodically hang on the NFS Client. >>>>>>>=20 >>>>>>> During this time, it is possible to manually mount from another = NFS Server on the NFS Client having issues. >>>>>>> Also, other NFS Clients are successfully mounting from the NFS = Server in question. 
>>>>>>> Rebooting the NFS Client appears to be the only solution. >>>>>>=20 >>>>>> I had experienced a similar weird situation with periodically = stuck Linux NFS clients >mounting Isilon NFS servers (Isilon is FreeBSD = based but they seem to have there >own nfsd) >>>>> Yes, my understanding is that Isilon uses a proprietary user space = nfsd and >>>>> not the kernel based RPC and nfsd in FreeBSD. >>>>>=20 >>>>>> We=E2=80=99ve had better luck and we did manage to have packet = captures on both sides >during the issue. The gist of it goes like = follows: >>>>>>=20 >>>>>> - Data flows correctly between SERVER and the CLIENT >>>>>> - At some point SERVER starts decreasing it's TCP Receive Window = until it reachs 0 >>>>>> - The client (eager to send data) can only ack data sent by = SERVER. >>>>>> - When SERVER was done sending data, the client starts sending = TCP Window >Probes hoping that the TCP Window opens again so he can = flush its buffers. >>>>>> - SERVER responds with a TCP Zero Window to those probes. >>>>> Having the window size drop to zero is not necessarily incorrect. >>>>> If the server is overloaded (has a backlog of NFS requests), it = can stop doing >>>>> soreceive() on the socket (so the socket rcv buffer can fill up = and the TCP window >>>>> closes). This results in "backpressure" to stop the NFS client = from flooding the >>>>> NFS server with requests. >>>>> --> However, once the backlog is handled, the nfsd should start to = soreceive() >>>>> again and this shouls cause the window to open back up. >>>>> --> Maybe this is broken in the socket/TCP code. I quickly got = lost in >>>>> tcp_output() when it decides what to do about the rcvwin. >>>>>=20 >>>>>> - After 6 minutes (the NFS server default Idle timeout) SERVER = racefully closes the >TCP connection sending a FIN Packet (and still a = TCP Window 0) >>>>> This probably does not happen for Jason's case, since the 6minute = timeout >>>>> is disabled when the TCP connection is assigned as a backchannel = (most likely >>>>> the case for NFSv4.1). >>>>>=20 >>>>>> - CLIENT ACK that FIN. >>>>>> - SERVER goes in FIN_WAIT_2 state >>>>>> - CLIENT closes its half part part of the socket and goes in = LAST_ACK state. >>>>>> - FIN is never sent by the client since there still data in its = SendQ and receiver TCP >Window is still 0. At this stage the client = starts sending TCP Window Probes again >and again hoping that the server = opens its TCP Window so it can flush it's buffers >and terminate its = side of the socket. >>>>>> - SERVER keeps responding with a TCP Zero Window to those probes. >>>>>> =3D> The last two steps goes on and on for hours/days freezing = the NFS mount bound >to that TCP session. >>>>>>=20 >>>>>> If we had a situation where CLIENT was responsible for closing = the TCP Window (and >initiating the TCP FIN first) and server wanting to = send data we=E2=80=99ll end up in the same >state as you I think. >>>>>>=20 >>>>>> We=E2=80=99ve never had the root cause of why the SERVER decided = to close the TCP >Window and no more acccept data, the fix on the Isilon = part was to recycle more >aggressively the FIN_WAIT_2 sockets = (net.inet.tcp.fast_finwait2_recycle=3D1 & = >net.inet.tcp.finwait2_timeout=3D5000). 
>>>>>> Once the socket is recycled, at the next occurrence of a CLIENT TCP window probe SERVER sends a RST, triggering the teardown of the session on the client side, a new TCP handshake, etc., and traffic flows again (NFS starts responding).
>>>>>>
>>>>>> To avoid rebooting the client (and before the aggressive FIN_WAIT_2 recycling was implemented on the Isilon side) we added a check script on the client that detects LAST_ACK sockets and, through an iptables rule, enforces a TCP RST. Something like: -A OUTPUT -p tcp -d $nfs_server_addr --sport $local_port -j REJECT --reject-with tcp-reset (the script removes this iptables rule as soon as the LAST_ACK disappears).
>>>>>>
>>>>>> The bottom line would be to have a packet capture during the outage (client and/or server side); it will show you at least the shape of the TCP exchange when NFS is stuck.
>>>>> Interesting story and good work w.r.t. sleuthing, Youssef, thanks.
>>>>>
>>>>> I looked at Jason's log and it shows everything is ok w.r.t. the nfsd threads.
>>>>> (They're just waiting for RPC requests.)
>>>>> However, I do now think I know why the soclose() does not happen.
>>>>> When the TCP connection is assigned as a backchannel, that takes a reference
>>>>> cnt on the structure. This refcnt won't be released until the connection is
>>>>> replaced by a BindConnectionToSession operation from the client. But that won't
>>>>> happen until the client creates a new TCP connection.
>>>>> --> No refcnt release --> no refcnt of 0 --> no soclose().
>>>>>
>>>>> I've created the attached patch (completely different from the previous one)
>>>>> that adds soshutdown(SHUT_WR) calls in the three places where the TCP
>>>>> connection is going away. This seems to get it past CLOSE_WAIT without a
>>>>> soclose().
>>>>> --> I know you are not comfortable with patching your server, but I do think
>>>>> this change will get the socket shutdown to complete.
>>>>>
>>>>> There are a couple more things you can check on the server...
>>>>> # nfsstat -E -s
>>>>> --> Look for the count under "BindConnToSes".
>>>>> --> If non-zero, backchannels have been assigned.
>>>>> # sysctl -a | fgrep request_space_throttle_count
>>>>> --> If non-zero, the server has been overloaded at some point.
>>>>>
>>>>> I think the attached patch might work around the problem.
>>>>> The code that should open up the receive window needs to be checked.
>>>>> I am also looking at enabling the 6 minute timeout when a backchannel is
>>>>> assigned.
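[The two server-side checks Rick lists can be run in one small script; a sketch, assuming a stock FreeBSD NFS server and the counter names shown above.]

#!/bin/sh
# Non-zero BindConnToSes means backchannels have been assigned.
nfsstat -E -s | grep -A 1 'BindConnToSes'
# Non-zero means the server has been overloaded at some point.
sysctl vfs.nfsd.request_space_throttle_count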
>>>>>
>>>>> rick
>>>>>
>>>>> Youssef
>>>>>
>>>>> _______________________________________________
>>>>> freebsd-net@freebsd.org mailing list
>>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@freebsd.org Tue Apr 6 12:54:38 2021
From: "Rodney W. Grimes"
Message-Id: <202104061254.136CsRZB005421@gndrsh.dnsmgr.net>
Subject: Re: TCP Connection hang - MSS again
To: Eugene Grosbein
Cc: Rozhuk Ivan, freebsd-current@freebsd.org, freebsd-net
Date: Tue, 6 Apr 2021 05:54:27 -0700 (PDT)
Grimes" Message-Id: <202104061254.136CsRZB005421@gndrsh.dnsmgr.net> Subject: Re: TCP Connection hang - MSS again In-Reply-To: To: Eugene Grosbein Date: Tue, 6 Apr 2021 05:54:27 -0700 (PDT) CC: Rozhuk Ivan , freebsd-current@freebsd.org, freebsd-net X-Mailer: ELM [version 2.4ME+ PL121h (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII X-Rspamd-Queue-Id: 4FF6tP1Z6kz4nxg X-Spamd-Bar: / Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=none (mx1.freebsd.org: domain of freebsd-rwg@gndrsh.dnsmgr.net has no SPF policy when checking 69.59.192.140) smtp.mailfrom=freebsd-rwg@gndrsh.dnsmgr.net X-Spamd-Result: default: False [-0.10 / 15.00]; RCVD_TLS_LAST(0.00)[]; ARC_NA(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; TO_DN_SOME(0.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; TAGGED_RCPT(0.00)[]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[dnsmgr.net]; AUTH_NA(1.00)[]; NEURAL_SPAM_SHORT(1.00)[1.000]; SPAMHAUS_ZRD(0.00)[69.59.192.140:from:127.0.2.255]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RBL_DBL_DONT_QUERY_IPS(0.00)[69.59.192.140:from]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; R_SPF_NA(0.00)[no SPF record]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; RCVD_COUNT_TWO(0.00)[2]; ASN(0.00)[asn:13868, ipnet:69.59.192.0/19, country:US]; MAILMAN_DEST(0.00)[freebsd-current,freebsd-net]; FREEMAIL_CC(0.00)[gmail.com,freebsd.org] X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 06 Apr 2021 12:54:38 -0000 > 05.04.2021 19:44, Rozhuk Ivan wrote: > > >>> As I understand, in some cases remote host does not reply with MSS > >>> option, and host behind router continue use mss 8960, that dropped > >>> by router. > >> If the peer does not provide an MSS option, your local FreeBSD based > >> host should use an MSS of net.inet.tcp.mssdflt bytes. The default is > >> 536. So I don't think this should be a problem. > > > > Thats it! > > Thanks, it was ~64k in mine config. > > This is also per-host setting, you know :-) > > It is generally bad idea using MTU over 1500 for an interface facing public network > without -mtu 1500. You see, because TCP MSS affects only TCP and there is also UDP > that happily produces oversized datagramms for DNS or RTP or NFS or tunneling like L2TP or OpenVPN etc. > relying on IP fragmentation. > > I still recommend using -mtu 1500 in addition to mssdflt in your case. I do not recommend such a setting. That would defeat any jumbo frame usage locally! The gateway/router that is forwarding packets to the internet connection needs its upstream interface mtu set properly, and configured to properly return icmp need fragement messages on the interfaces towards the internal network. This leaking of jumbo frames to the Internet is almost always caused by blockage of icmp packets internal to a network, and doing that forces one to run on an mtu that is acceptable to the global Internet, a far from optimal situation. 
-- 
Rod Grimes                                                 rgrimes@freebsd.org

From owner-freebsd-net@freebsd.org Tue Apr 6 15:31:21 2021
From: Eugene Grosbein
Message-ID: <8d211e78-bccc-47a0-ab91-bfbf5d22911c@grosbein.net>
In-Reply-To: <202104061254.136CsRZB005421@gndrsh.dnsmgr.net>
Subject: Re: TCP Connection hang - MSS again
To: "Rodney W. Grimes"
Cc: freebsd-net, freebsd-current@freebsd.org
Date: Tue, 6 Apr 2021 22:01:24 +0700
06.04.2021 19:54, Rodney W. Grimes wrote:
>> 05.04.2021 19:44, Rozhuk Ivan wrote:
>>
>>>>> As I understand, in some cases the remote host does not reply with an MSS
>>>>> option, and the host behind the router continues to use mss 8960, which is
>>>>> dropped by the router.
>>>> If the peer does not provide an MSS option, your local FreeBSD based
>>>> host should use an MSS of net.inet.tcp.mssdflt bytes. The default is
>>>> 536. So I don't think this should be a problem.
>>>
>>> That's it!
>>> Thanks, it was ~64k in my config.
>>
>> This is also a per-host setting, you know :-)
>>
>> It is generally a bad idea to use an MTU over 1500 on an interface facing a
>> public network without -mtu 1500. You see, TCP MSS affects only TCP, and there
>> is also UDP, which happily produces oversized datagrams for DNS or RTP or NFS,
>> or for tunneling like L2TP or OpenVPN etc. that rely on IP fragmentation.
>>
>> I still recommend using -mtu 1500 in addition to mssdflt in your case.
>
> I do not recommend such a setting. That would defeat any jumbo frame usage
> locally!

Why? The default route should not be used for local delivery.

> The gateway/router that is forwarding packets to the internet connection
> needs its upstream interface mtu set properly, and configured to properly
> return icmp need fragment messages on the interfaces towards the
> internal network.

This results in extra delays and retransmission during outgoing data transfer, not good.
That mechanism is much more fragile than the default route's mtu attribute.
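[Eugene's alternative is to clamp the MTU on the default route rather than on the interface. A sketch on FreeBSD: the -mtu route attribute is the knob he refers to, while the jumbo interface MTU and interface name are assumptions.]

#!/bin/sh
ifconfig em0 mtu 9000            # assumed jumbo-capable interface; local traffic keeps jumbo frames
route change default -mtu 1500   # clamp only traffic leaving via the default route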
Grimes" , freebsd-net , freebsd-current@freebsd.org X-Mailer: ELM [version 2.4ME+ PL121h (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII X-Rspamd-Queue-Id: 4FFDNj6Wjpz3rMZ X-Spamd-Bar: + Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=none (mx1.freebsd.org: domain of freebsd-rwg@gndrsh.dnsmgr.net has no SPF policy when checking 69.59.192.140) smtp.mailfrom=freebsd-rwg@gndrsh.dnsmgr.net X-Spamd-Result: default: False [1.90 / 15.00]; MID_RHS_MATCH_FROM(0.00)[]; ARC_NA(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; TO_DN_SOME(0.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[dnsmgr.net]; RBL_DBL_DONT_QUERY_IPS(0.00)[69.59.192.140:from]; AUTH_NA(1.00)[]; NEURAL_SPAM_MEDIUM(1.00)[1.000]; SPAMHAUS_ZRD(0.00)[69.59.192.140:from:127.0.2.255]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_SHORT(1.00)[1.000]; R_SPF_NA(0.00)[no SPF record]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; R_DKIM_NA(0.00)[]; ASN(0.00)[asn:13868, ipnet:69.59.192.0/19, country:US]; MIME_TRACE(0.00)[0:+]; MAILMAN_DEST(0.00)[freebsd-net,freebsd-current]; RCVD_COUNT_TWO(0.00)[2] X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 06 Apr 2021 17:02:47 -0000 > 06.04.2021 19:54, Rodney W. Grimes wrote: > >> 05.04.2021 19:44, Rozhuk Ivan wrote: > >> > >>>>> As I understand, in some cases remote host does not reply with MSS > >>>>> option, and host behind router continue use mss 8960, that dropped > >>>>> by router. > >>>> If the peer does not provide an MSS option, your local FreeBSD based > >>>> host should use an MSS of net.inet.tcp.mssdflt bytes. The default is > >>>> 536. So I don't think this should be a problem. > >>> > >>> Thats it! > >>> Thanks, it was ~64k in mine config. > >> > >> This is also per-host setting, you know :-) > >> > >> It is generally bad idea using MTU over 1500 for an interface facing public network > >> without -mtu 1500. You see, because TCP MSS affects only TCP and there is also UDP > >> that happily produces oversized datagramms for DNS or RTP or NFS or tunneling like L2TP or OpenVPN etc. > >> relying on IP fragmentation. > >> > >> I still recommend using -mtu 1500 in addition to mssdflt in your case. > > > > I do not recommend such a setting. That would defeat any jumbo frame usage > > locally! > > Why? Default route should not be used for local delivery. Your right, but we are both making assumptions, I assumed that most likely the only route on the system is the default route, and your assuming that they are running with something more than a default route. > > The gateway/router that is forwarding packets to the internet connection > > needs its upstream interface mtu set properly, and configured to properly > > return icmp need fragement messages on the interfaces towards the > > internal network. > > This results in extra delays and retransmission during outgoing data transfer, not good. > The mechanics is much more fragile than default route's mtu attribute. The delay should be pretty slight, the router is going to return an icmp message, and if configured to do so frag the packets and forward them on, no retransmission would occur as the DF flag is not normally set unless explicitly requested. It still makes no since to me to increase the interface MTU and then crank it back down by using a route MTU. 
You might as well just leave
the MTU alone and not have 2 configuration items more or less doing
nothing.

-- 
Rod Grimes                                                 rgrimes@freebsd.org

From owner-freebsd-net@freebsd.org Tue Apr 6 17:24:41 2021
From: Michael Tuexen
In-Reply-To: <202104061702.136H2hZh006398@gndrsh.dnsmgr.net>
Subject: Re: TCP Connection hang - MSS again
To: "Rodney W. Grimes"
Cc: Eugene Grosbein, freebsd-net, freebsd-current@freebsd.org
Date: Tue, 6 Apr 2021 19:24:35 +0200

> On 6. Apr 2021, at 19:02, Rodney W. Grimes wrote:
>
>> 06.04.2021 19:54, Rodney W. Grimes wrote:
>>>> 05.04.2021 19:44, Rozhuk Ivan wrote:
>>>>
>>>>>>> As I understand, in some cases the remote host does not reply with an MSS
>>>>>>> option, and the host behind the router continues to use mss 8960, which is
>>>>>>> dropped by the router.
>>>>>> If the peer does not provide an MSS option, your local FreeBSD based
>>>>>> host should use an MSS of net.inet.tcp.mssdflt bytes. The default is
>>>>>> 536. So I don't think this should be a problem.
>>>>>
>>>>> That's it!
>>>>> Thanks, it was ~64k in my config.
>>>>
>>>> This is also a per-host setting, you know :-)
>>>>
>>>> It is generally a bad idea to use an MTU over 1500 on an interface facing a
>>>> public network without -mtu 1500. You see, TCP MSS affects only TCP, and there
>>>> is also UDP, which happily produces oversized datagrams for DNS or RTP or NFS,
>>>> or for tunneling like L2TP or OpenVPN etc. that rely on IP fragmentation.
>>>>
>>>> I still recommend using -mtu 1500 in addition to mssdflt in your case.
>>>
>>> I do not recommend such a setting. That would defeat any jumbo frame usage
>>> locally!
>>
>> Why? The default route should not be used for local delivery.
>
> You're right, but we are both making assumptions: I assumed that most
> likely the only route on the system is the default route, and you're
> assuming that they are running with something more than a default
> route.
>
>>> The gateway/router that is forwarding packets to the internet connection
>>> needs its upstream interface mtu set properly, and configured to properly
>>> return icmp need fragment messages on the interfaces towards the
>>> internal network.
>>
>> This results in extra delays and retransmission during outgoing data transfer, not good.
>> That mechanism is much more fragile than the default route's mtu attribute.
>
> The delay should be pretty slight: the router is going to return an
> icmp message and, if configured to do so, frag the packets and
> forward them on. No retransmission would occur, as the DF flag
> is not normally set unless explicitly requested.

1. Isn't a router either fragmenting a packet and forwarding the
   fragments, or sending back an ICMP packet and dropping the packet?
2. Isn't FreeBSD's TCP implementation setting the DF bit if
   net.inet.tcp.path_mtu_discovery is set to 1, which is the default?
   So it would take one RTT to the router for TCP to react and reduce the MSS.

Best regards
Michael

> It still makes no sense to me to increase the interface MTU and then
> crank it back down by using a route MTU. You might as well just leave
> the MTU alone and not have 2 configuration items more or less doing
> nothing.
>
> -- 
> Rod Grimes                                                 rgrimes@freebsd.org
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
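[Michael's second point above is easy to check on a FreeBSD host; both sysctl names appear earlier in this thread. A sketch:]

#!/bin/sh
# MSS used when the peer sends no MSS option (default 536).
sysctl net.inet.tcp.mssdflt
# When 1 (the default), TCP sets the DF bit and relies on Path MTU discovery.
sysctl net.inet.tcp.path_mtu_discovery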
From owner-freebsd-net@freebsd.org Thu Apr 8 22:47:43 2021
From: Peter Eriksson
Subject: Re: NFS Mount Hangs
Date: Fri, 9 Apr 2021 00:47:32 +0200
To: Rick Macklem
Cc: "tuexen@freebsd.org", "Scheffenegger, Richard", Youssef GHORBAL, "freebsd-net@freebsd.org"
Youssef GHORBAL , "freebsd-net@freebsd.org" To: Rick Macklem References: <3750001D-3F1C-4D9A-A9D9-98BCA6CA65A4@tildenparkcapital.com> <33693DE3-7FF8-4FAB-9A75-75576B88A566@tildenparkcapital.com> <8E745920-1092-4312-B251-B49D11FE8028@pasteur.fr> <765CE1CD-6AAB-4BEF-97C6-C2A1F0FF4AC5@freebsd.org> <2B189169-C0C9-4DE6-A01A-BE916F10BABA@freebsd.org> X-Mailer: Apple Mail (2.3654.60.0.2.21) X-Virus-Scanned: ClamAV using ClamSMTP X-Rspamd-Queue-Id: 4FGbxp2jDRz4pxB X-Spamd-Bar: --- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=pass (policy=none) header.from=liu.se; spf=pass (mx1.freebsd.org: domain of pen@lysator.liu.se designates 130.236.254.3 as permitted sender) smtp.mailfrom=pen@lysator.liu.se X-Spamd-Result: default: False [-3.50 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; MV_CASE(0.50)[]; R_SPF_ALLOW(-0.20)[+a:mail.lysator.liu.se]; MIME_GOOD(-0.10)[multipart/alternative,text/plain]; ARC_NA(0.00)[]; RCPT_COUNT_FIVE(0.00)[5]; NEURAL_HAM_LONG(-1.00)[-1.000]; RCVD_COUNT_THREE(0.00)[4]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_MED(-0.20)[130.236.254.3:from]; DMARC_POLICY_ALLOW(-0.50)[liu.se,none]; NEURAL_HAM_SHORT(-1.00)[-1.000]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+,1:+,2:~]; ASN(0.00)[asn:2843, ipnet:130.236.0.0/16, country:SE]; RCVD_TLS_LAST(0.00)[]; MAILMAN_DEST(0.00)[freebsd-net] Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.34 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Apr 2021 22:47:43 -0000 Hmmm.. We might have run into the same situation here between some Linux = (CentOS 7.9.2009) clients and our FreeBSD 12.2 nfs servers this evening = when a network switch in between the nfs server and the clients had to = be rebooted causing a network partitioning for a number of minutes.=20 Not every NFS mount froze on the Linux clients but a number of them did. = New connections/new NFS mounts worked fine, but the frozen ones stayed = frozen. They unstuck themself after some time (more than 6 minutes - = more like and hour or two). Unfortunately not logs (no netstat output, not tcpdump) are available = from the clients at this time. We are setting up some monitoring scripts = so if this happens again then I hope we=E2=80=99ll be able to capture = some things=E2=80=A6 I tried to check the NFS server side logs one some of our NFS servers = from around the time of the partition but I=E2=80=99ve yet to find = something of interest unfortunately... Not really helpful but that it self-healed after a (long) while is = interesting I think=E2=80=A6 - Peter > On 6 Apr 2021, at 01:24, Rick Macklem wrote: >=20 > tuexen@freebsd.org wrote: > [stuff snipped] >> OK. What is the FreeBSD version you are using? > main Dec. 23, 2020. >=20 >>=20 >> It seems that the TCP connection on the FreeBSD is still alive, >> Linux has decided to start a new TCP connection using the old >> port numbers. So it sends a SYN. The response is a challenge ACK >> and Linux responds with a RST. This looks good so far. However, >> FreeBSD should accept the RST and kill the TCP connection. The >> next SYN from the Linux side would establish a new TCP connection. >>=20 >> So I'm wondering why the RST is not accepted. 
> On 6 Apr 2021, at 01:24, Rick Macklem wrote:
>
> tuexen@freebsd.org wrote:
> [stuff snipped]
>> OK. What is the FreeBSD version you are using?
> main Dec. 23, 2020.
>
>> It seems that the TCP connection on the FreeBSD side is still alive;
>> Linux has decided to start a new TCP connection using the old
>> port numbers. So it sends a SYN. The response is a challenge ACK
>> and Linux responds with a RST. This looks good so far. However,
>> FreeBSD should accept the RST and kill the TCP connection. The
>> next SYN from the Linux side would establish a new TCP connection.
>>
>> So I'm wondering why the RST is not accepted. I made the timestamp
>> checking stricter, but introduced a bug where RST segments without
>> timestamps were ignored. This was fixed.
>>
>> Introduced in main on 2020/11/09:
>> https://svnweb.freebsd.org/changeset/base/367530
>> Introduced in stable/12 on 2020/11/30:
>> https://svnweb.freebsd.org/changeset/base/36818
>> Fix in main on 2021/01/13:
>> https://cgit.FreeBSD.org/src/commit/?id=cc3c34859eab1b317d0f38731355b53f7d978c97
>> Fix in stable/12 on 2021/01/24:
>> https://cgit.FreeBSD.org/src/commit/?id=d05d908d6d3c85479c84c707f931148439ae826b
>>
>> Are you using a version which is affected by this bug?
> I was. Now I've applied the patch.
> Bad news: it did not fix the problem.
> It still gets into an endless "ignore RST" and stays established when
> the Send-Q is empty.
>
> If the Send-Q is non-empty when I partition, it recovers fine,
> sometimes not even needing to see an RST.
>
> rick
> ps: If you think there might be other recent changes that matter,
> just say the word and I'll upgrade to bits du jour.
>
> rick
>
> Best regards
> Michael
>
>> If I wait long enough before healing the partition, it will
>> go to FIN_WAIT_1, and then if I plug it back in, it does not
>> do battle (at least not for long).
>>
>> Btw, I have one running now that seems stuck really good.
>> It has been 20 minutes since I plugged the net cable back in.
>> (Unfortunately, I didn't have tcpdump running until after
>> I saw it was not progressing after healing.)
>> --> There is one difference. There was a 6 minute timeout
>> enabled on the server krpc for "no activity", which is
>> now disabled like it is for NFSv4.1 in freebsd-current.
>> I had forgotten to re-disable it.
>> So, when it does battle, it might have been the 6 minute
>> timeout, which would then do the soshutdown(..SHUT_WR)
>> which kept it from getting "stuck" forever.
>> --> This time I had to reboot the FreeBSD NFS server to
>> get the Linux client unstuck, so this one looked a lot
>> like what has been reported.
>> The pcap for this one, started after the network was plugged
>> back in and I noticed it was stuck for quite a while, is here:
>> fetch https://people.freebsd.org/~rmacklem/stuck.pcap
>>
>> In it, there is just a bunch of RSTs followed by SYNs sent
>> from client->FreeBSD, and FreeBSD just keeps sending
>> acks for the old segment back.
>> --> It looks like FreeBSD did the "RST, ACK" after the
>> krpc did a soshutdown(..SHUT_WR) on the socket,
>> for the one you've been looking at.
>> I'll test some more...
>>
>>> I would like to understand why the reestablishment of the connection
>>> did not work...
>> It is looking like it takes either a non-empty send-q or a
>> soshutdown(..SHUT_WR) to get the FreeBSD socket
>> out of ESTABLISHED, where it just ignores the RSTs and
>> SYN packets.
>>
>> Thanks for looking at it, rick
>>
>> Best regards
>> Michael
>>
>>> Have fun with it, rick
>>>
>>> ________________________________________
>>> From: tuexen@freebsd.org
>>> Sent: Sunday, April 4, 2021 12:41 PM
>>> To: Rick Macklem
>>> Cc: Scheffenegger, Richard; Youssef GHORBAL; freebsd-net@freebsd.org
>>> Subject: Re: NFS Mount Hangs
>>>
>>> CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp@uoguelph.ca
>>>
>>>> On 4. Apr 2021, at 17:27, Rick Macklem wrote:
>>>>
>>>> Well, I'm going to cheat and top post, since this is related info and
>>>> not really part of the discussion...
>>>>
>>>> I've been testing network partitioning between a Linux client (5.2 kernel)
>>>> and a FreeBSD-current NFS server. I have not gotten a solid hang, but
>>>> I have had the Linux client doing "battle" with the FreeBSD server for
>>>> several minutes after un-partitioning the connection.
>>>>
>>>> The battle basically consists of the Linux client sending an RST, followed
>>>> by a SYN.
>>>> The FreeBSD server ignores the RST and just replies with the same old ack.
>>>> --> This varies from "just a SYN" that succeeds to 100+ cycles of the above
>>>> over several minutes.
>>>>
>>>> I had thought that an RST was a "pretty heavy hammer", but FreeBSD seems
>>>> pretty good at ignoring it.
>>>>
>>>> A full packet capture of one of these is in /home/rmacklem/linuxtofreenfs.pcap
>>>> in case anyone wants to look at it.
>>> On freefall? I would like to take a look at it...
>>>
>>> Best regards
>>> Michael
>>>>
>>>> Here's a tcpdump snippet of the interesting part (see the *** comments):
>>>> 19:10:09.305775 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 202585:202749, ack 212293, win 29128, options [nop,nop,TS val 2073636037 ecr 2671204825], length 164: NFS reply xid 613153685 reply ok 160 getattr NON 4 ids 0/33554432 sz 0
>>>> 19:10:09.305850 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 202749, win 501, options [nop,nop,TS val 2671204825 ecr 2073636037], length 0
>>>> *** Network is now partitioned...
>>>>
>>>> 19:10:09.407840 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671204927 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>> 19:10:09.615779 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205135 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>> 19:10:09.823780 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205343 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>> *** Lots of lines snipped.
>>>>
>>>> 19:13:41.295783 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:13:42.319767 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:13:46.351966 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:13:47.375790 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:13:48.399786 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> *** Network is now unpartitioned...
>>>> 19:13:48.399990 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
>>>> 19:13:48.400002 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671421871 ecr 0,nop,wscale 7], length 0
>>>> 19:13:48.400185 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073855137 ecr 2671204825], length 0
>>>> 19:13:48.400273 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
>>>> 19:13:49.423833 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671424943 ecr 0,nop,wscale 7], length 0
>>>> 19:13:49.424056 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073856161 ecr 2671204825], length 0
>>>> *** This "battle" goes on for 223 sec...
>>>> I snipped out 13 cycles of this "Linux sends an RST, followed by SYN",
>>>> "FreeBSD replies with the same old ACK". In another test run I saw this
>>>> cycle continue non-stop for several minutes. This time, the Linux
>>>> client paused for a while (see ARPs below).
>>>>
>>>> 19:13:49.424101 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
>>>> 19:13:53.455867 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671428975 ecr 0,nop,wscale 7], length 0
>>>> 19:13:53.455991 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073860193 ecr 2671204825], length 0
>>>> *** Snipped a bunch of stuff out, mostly ARPs, plus one more RST.
>>>>
>>>> 19:16:57.775780 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:16:57.775937 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
>>>> 19:16:57.980240 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
>>>> 19:16:58.555663 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
>>>> 19:17:00.104701 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074046846 ecr 2671204825], length 0
>>>> 19:17:15.664354 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074062406 ecr 2671204825], length 0
>>>> 19:17:31.239246 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [R.], seq 202750, ack 212293, win 0, options [nop,nop,TS val 2074077981 ecr 2671204825], length 0
>>>> *** FreeBSD finally acknowledges the RST 38 sec after Linux sent the last
>>>> of 13 (100+ for another test run).
>>>>
>>>> 19:17:51.535979 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 4247692373, win 64240, options [mss 1460,sackOK,TS val 2671667055 ecr 0,nop,wscale 7], length 0
>>>> 19:17:51.536130 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [S.], seq 661237469, ack 4247692374, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2074098278 ecr 2671667055], length 0
>>>> *** Now back in business...
>>>> 19:17:51.536218 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 1, win 502, options [nop,nop,TS val 2671667055 ecr 2074098278], length 0
>>>> 19:17:51.536295 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 1:233, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>> 19:17:51.536346 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 233:505, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 272: NFS request xid 697039765 132 getattr fh 0,1/53
>>>> 19:17:51.536515 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 0
>>>> 19:17:51.536553 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 505:641, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098279], length 136: NFS request xid 730594197 132 getattr fh 0,1/53
>>>> 19:17:51.536562 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 1:49, ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 48: NFS reply xid 697039765 reply ok 44 getattr ERROR: unk 10063
>>>>
>>>> This error 10063 after the partition heals is also "bad news". It indicates the Session
>>>> (which is supposed to maintain "exactly once" RPC semantics) is broken. I'll admit I
>>>> suspect a Linux client bug, but will be investigating further.
>>>>
>>>> So, hopefully TCP conversant folk can confirm whether the above is correct behaviour,
>>>> or if the RST should be ack'd sooner?
>>>>
>>>> I could also see this becoming a "forever" TCP battle for other versions of the Linux client.
>>>>
>>>> rick
>>>>
>>>> ________________________________________
>>>> From: Scheffenegger, Richard
>>>> Sent: Sunday, April 4, 2021 7:50 AM
>>>> To: Rick Macklem; tuexen@freebsd.org
>>>> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
>>>> Subject: Re: NFS Mount Hangs
>>>>
>>>> CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp@uoguelph.ca
>>>>
>>>> For what it's worth, SUSE found two bugs in the Linux nf_conntrack (stateful firewall) and the pfifo-fast scheduler, which could conspire to make TCP sessions hang forever.
>>>>
>>>> One is a missed update when the client is not using the noresvport mount option, which makes the firewall think RSTs are illegal (and drop them).
>>>>
>>>> The fast scheduler can run into an issue if only a single packet should be forwarded (note that this is not the default scheduler, but often recommended for perf, as it runs lockless and at lower cpu cost than pfq (default)). If no other/additional packet pushes out that last packet of a flow, it can become stuck forever...
>>>>
>>>> I can try getting the relevant bug info next week...
>>>>
>>>> ________________________________
>>>> From: owner-freebsd-net@freebsd.org on behalf of Rick Macklem
>>>> Sent: Friday, April 2, 2021 11:31:01 PM
>>>> To: tuexen@freebsd.org
>>>> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
>>>> Subject: Re: NFS Mount Hangs
>>>>
>>>> NetApp Security WARNING: This is an external email.
>>>> Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>>>
>>>> tuexen@freebsd.org wrote:
>>>>>> On 2. Apr 2021, at 02:07, Rick Macklem wrote:
>>>>>>
>>>>>> I hope you don't mind a top post...
>>>>>> I've been testing network partitioning between the only Linux client
>>>>>> I have (5.2 kernel) and a FreeBSD server with the xprtdied.patch
>>>>>> (which does soshutdown(..SHUT_WR) when it knows the socket is broken)
>>>>>> applied to it.
>>>>>>
>>>>>> I'm not enough of a TCP guy to know if this is useful, but here's what
>>>>>> I see...
>>>>>>
>>>>>> While partitioned:
>>>>>> On the FreeBSD server end, the socket either goes to CLOSED during
>>>>>> the network partition or stays ESTABLISHED.
>>>>> If it goes to CLOSED you called shutdown(, SHUT_WR) and the peer also
>>>>> sent a FIN, but you never called close() on the socket.
>>>>> If the socket stays in ESTABLISHED, there is no communication ongoing,
>>>>> I guess, and therefore the server does not even detect that the peer
>>>>> is not reachable.
>>>>>> On the Linux end, the socket seems to remain ESTABLISHED for a
>>>>>> little while, and then disappears.
>>>>> So how does Linux detect that the peer is not reachable?
>>>> Well, here's what I see in a packet capture on the Linux client once
>>>> I partition it (just unplug the net cable):
>>>> - lots of retransmits of the same segment (with ACK) for 54 sec
>>>> - then only ARP queries
>>>>
>>>> Once I plug the net cable back in:
>>>> - ARP works
>>>> - one more retransmit of the same segment
>>>> - receives RST from FreeBSD
>>>> ** So, is this now a "new" TCP connection, despite
>>>> using the same port#?
>>>> --> It matters for NFS, since "new connection"
>>>> implies "must retry all outstanding RPCs".
>>>> - sends SYN
>>>> - receives SYN, ACK from FreeBSD
>>>> --> connection starts working again
>>>> Always uses the same port#.
>>>>
>>>> On the FreeBSD server end:
>>>> - receives the last retransmit of the segment (with ACK)
>>>> - sends RST
>>>> - receives SYN
>>>> - sends SYN, ACK
>>>>
>>>> I thought that there was no RST in the capture I looked at
>>>> yesterday, so I'm not sure if FreeBSD always sends an RST,
>>>> but the Linux client behaviour was the same. (Sent a SYN, etc.)
>>>> The socket disappears from the Linux "netstat -a" and I
>>>> suspect that happens after about 54 sec, but I am not sure
>>>> about the timing.
>>>>
>>>>>>
>>>>>> After unpartitioning:
>>>>>> On the FreeBSD server end, you get another socket showing up at
>>>>>> the same port#:
>>>>>> Active Internet connections (including servers)
>>>>>> Proto Recv-Q Send-Q Local Address    Foreign Address  (state)
>>>>>> tcp4       0      0 nfsv4-new3.nfsd  nfsv4-linux.678  ESTABLISHED
>>>>>> tcp4       0      0 nfsv4-new3.nfsd  nfsv4-linux.678  CLOSED
>>>>>>
>>>>>> The Linux client shows the same connection ESTABLISHED.
>>>> But it disappears from "netstat -a" for a while during the partitioning.
>>>>
>>>>>> (The mount sometimes reports an error. I haven't looked at packet
>>>>>> traces to see if it retries RPCs or why the errors occur.)
>>>> I have now done so, as above.
>>>>
>>>>>> --> However I never get hangs.
>>>>>> Sometimes it goes to SYN_SENT for a while and the FreeBSD server
>>>>>> shows FIN_WAIT_1, but then both ends go to ESTABLISHED and the
>>>>>> mount starts working again.
>>>>>>
>>>>>> The most obvious thing is that the Linux client always keeps using
>>>>>> the same port#. (The FreeBSD client will use a different port# when
>>>>>> it does a TCP reconnect after no response from the NFS server for
>>>>>> a little while.)
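[A quick way to watch for the two-sockets-on-one-port situation shown above on the FreeBSD server; a sketch that just summarizes port 2049 connections by TCP state from numeric netstat output.]

#!/bin/sh
netstat -an | awk '$4 ~ /\.2049$/ { print $6 }' | sort | uniq -c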
>>>>>>
>>>>>> What do those TCP conversant folk think?
>>>>> I guess you are never calling close() on the socket for which
>>>>> the connection state is CLOSED.
>>>> Ok, that makes sense. For this case the Linux client has not done a
>>>> BindConnectionToSession to re-assign the back channel.
>>>> I'll have to bug them about this. However, I'll bet they'll answer
>>>> that I have to tell them the back channel needs re-assignment,
>>>> or something like that.
>>>>
>>>> I am pretty certain they are broken, in that the client needs to
>>>> retry all outstanding RPCs.
>>>>
>>>> For others, here's the long-winded version of this that I just
>>>> put on the phabricator review:
>>>> In the server side kernel RPC, the socket (struct socket *) is in a
>>>> structure called SVCXPRT (normally pointed to by "xprt").
>>>> These structures are ref counted and the soclose() is done
>>>> when the ref cnt goes to zero. My understanding is that
>>>> "struct socket *" is free'd by soclose(), so this cannot be done
>>>> before the xprt ref cnt goes to zero.
>>>>
>>>> For NFSv4.1/4.2 there is something called a back channel,
>>>> which means that an "xprt" is used for server->client RPCs,
>>>> although the TCP connection is established by the client
>>>> to the server.
>>>> --> This back channel holds a ref cnt on "xprt" until the
>>>> client re-assigns it to a different TCP connection
>>>> via an operation called BindConnectionToSession,
>>>> and the Linux client is not doing this soon enough,
>>>> it appears.
>>>>
>>>> So, the soclose() is delayed, which is why I think the
>>>> TCP connection gets stuck in CLOSE_WAIT, and that is
>>>> why I've added the soshutdown(..SHUT_WR) calls,
>>>> which can happen before the client gets around to
>>>> re-assigning the back channel.
>>>>
>>>> Thanks for your help with this Michael, rick
>>>>
>>>> Best regards
>>>> Michael
>>>>>
>>>>> rick
>>>>> ps: I can capture packets while doing this, if anyone has a use
>>>>> for them.
>>>>>
>>>>> ________________________________________
>>>>> From: owner-freebsd-net@freebsd.org on behalf of Youssef GHORBAL
>>>>> Sent: Saturday, March 27, 2021 6:57 PM
>>>>> To: Jason Breitman
>>>>> Cc: Rick Macklem; freebsd-net@freebsd.org
>>>>> Subject: Re: NFS Mount Hangs
>>>>>
>>>>> CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp@uoguelph.ca
>>>>>
>>>>> On 27 Mar 2021, at 13:20, Jason Breitman wrote:
>>>>>
>>>>> The issue happened again, so we can say that disabling TSO and LRO on the NIC did not resolve this issue.
>>>>> # ifconfig lagg0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso
>>>>> # ifconfig lagg0
>>>>> lagg0: flags=8943 metric 0 mtu 1500
>>>>> options=8100b8
>>>>>
>>>>> We can also say that the sysctl settings did not resolve this issue.
>>>>>=20 >>>>> # sysctl net.inet.tcp.fast_finwait2_recycle=3D1 >>>>> net.inet.tcp.fast_finwait2_recycle: 0 -> 1 >>>>>=20 >>>>> # sysctl net.inet.tcp.finwait2_timeout=3D1000 >>>>> net.inet.tcp.finwait2_timeout: 60000 -> 1000 >>>>>=20 >>>>> I don=E2=80=99t think those will do anything in your case since = the FIN_WAIT2 are on the client side and those sysctls are for BSD. >>>>> By the way it seems that Linux recycles automatically TCP sessions = in FIN_WAIT2 after 60 seconds (sysctl net.ipv4.tcp_fin_timeout) >>>>>=20 >>>>> tcp_fin_timeout (integer; default: 60; since Linux 2.2) >>>>> This specifies how many seconds to wait for a final FIN >>>>> packet before the socket is forcibly closed. This is >>>>> strictly a violation of the TCP specification, but >>>>> required to prevent denial-of-service attacks. In Linux >>>>> 2.2, the default value was 180. >>>>>=20 >>>>> So I don=E2=80=99t get why it stucks in the FIN_WAIT2 state = anyway. >>>>>=20 >>>>> You really need to have a packet capture during the outage (client = and server side) so you=E2=80=99ll get over the wire chat and start = speculating from there. >>>>> No need to capture the beginning of the outage for now. All you = have to do, is run a tcpdump for 10 minutes or so when you notice a = client stuck. >>>>>=20 >>>>> * I have not rebooted the NFS Server nor have I restarted nfsd, = but do not believe that is required as these settings are at the TCP = level and I would expect new sessions to use the updated settings. >>>>>=20 >>>>> The issue occurred after 5 days following a reboot of the client = machines. >>>>> I ran the capture information again to make use of the situation. >>>>>=20 >>>>> #!/bin/sh >>>>>=20 >>>>> while true >>>>> do >>>>> /bin/date >> /tmp/nfs-hang.log >>>>> /bin/ps axHl | grep nfsd | grep -v grep >> /tmp/nfs-hang.log >>>>> /usr/bin/procstat -kk 2947 >> /tmp/nfs-hang.log >>>>> /usr/bin/procstat -kk 2944 >> /tmp/nfs-hang.log >>>>> /bin/sleep 60 >>>>> done >>>>>=20 >>>>>=20 >>>>> On the NFS Server >>>>> Active Internet connections (including servers) >>>>> Proto Recv-Q Send-Q Local Address Foreign Address = (state) >>>>> tcp4 0 0 NFS.Server.IP.X.2049 = NFS.Client.IP.X.48286 CLOSE_WAIT >>>>>=20 >>>>> On the NFS Client >>>>> tcp 0 0 NFS.Client.IP.X:48286 = NFS.Server.IP.X:2049 FIN_WAIT2 >>>>>=20 >>>>>=20 >>>>>=20 >>>>> You had also asked for the output below. >>>>>=20 >>>>> # nfsstat -E -s >>>>> BackChannelCtBindConnToSes >>>>> 0 0 >>>>>=20 >>>>> # sysctl vfs.nfsd.request_space_throttle_count >>>>> vfs.nfsd.request_space_throttle_count: 0 >>>>>=20 >>>>> I see that you are testing a patch and I look forward to seeing = the results. >>>>>=20 >>>>>=20 >>>>> Jason Breitman >>>>>=20 >>>>>=20 >>>>> On Mar 21, 2021, at 6:21 PM, Rick Macklem = > wrote: >>>>>=20 >>>>> Youssef GHORBAL = > wrote: >>>>>> Hi Jason, >>>>>>=20 >>>>>>> On 17 Mar 2021, at 18:17, Jason Breitman = > = wrote: >>>>>>>=20 >>>>>>> Please review the details below and let me know if there is a = setting that I should apply to my FreeBSD NFS Server or if there is a = bug fix that I can apply to resolve my issue. >>>>>>> I shared this information with the linux-nfs mailing list and = they believe the issue is on the server side. >>>>>>>=20 >>>>>>> Issue >>>>>>> NFSv4 mounts periodically hang on the NFS Client. >>>>>>>=20 >>>>>>> During this time, it is possible to manually mount from another = NFS Server on the NFS Client having issues. >>>>>>> Also, other NFS Clients are successfully mounting from the NFS = Server in question. 
>>>>>>> Rebooting the NFS Client appears to be the only solution.
>>>>>>
>>>>>> I had experienced a similar weird situation with periodically stuck Linux NFS clients
>>>>>> mounting Isilon NFS servers (Isilon is FreeBSD based but they seem to have their
>>>>>> own nfsd).
>>>>> Yes, my understanding is that Isilon uses a proprietary user space nfsd and
>>>>> not the kernel based RPC and nfsd in FreeBSD.
>>>>>
>>>>>> We've had better luck and we did manage to have packet captures on both sides
>>>>>> during the issue. The gist of it goes as follows:
>>>>>>
>>>>>> - Data flows correctly between SERVER and the CLIENT.
>>>>>> - At some point SERVER starts decreasing its TCP Receive Window until it reaches 0.
>>>>>> - The client (eager to send data) can only ack data sent by SERVER.
>>>>>> - When SERVER was done sending data, the client starts sending TCP Window
>>>>>>   Probes hoping that the TCP Window opens again so it can flush its buffers.
>>>>>> - SERVER responds with a TCP Zero Window to those probes.
>>>>> Having the window size drop to zero is not necessarily incorrect.
>>>>> If the server is overloaded (has a backlog of NFS requests), it can stop doing
>>>>> soreceive() on the socket (so the socket rcv buffer can fill up and the TCP window
>>>>> closes). This results in "backpressure" to stop the NFS client from flooding the
>>>>> NFS server with requests.
>>>>> --> However, once the backlog is handled, the nfsd should start to soreceive()
>>>>>     again and this should cause the window to open back up.
>>>>> --> Maybe this is broken in the socket/TCP code. I quickly got lost in
>>>>>     tcp_output() when it decides what to do about the rcvwin.
>>>>>
>>>>>> - After 6 minutes (the NFS server default idle timeout) SERVER gracefully closes the
>>>>>>   TCP connection, sending a FIN packet (and still a TCP Window of 0).
>>>>> This probably does not happen for Jason's case, since the 6-minute timeout
>>>>> is disabled when the TCP connection is assigned as a backchannel (most likely
>>>>> the case for NFSv4.1).
>>>>>
>>>>>> - CLIENT ACKs that FIN.
>>>>>> - SERVER goes into the FIN_WAIT_2 state.
>>>>>> - CLIENT closes its half of the socket and goes into the LAST_ACK state.
>>>>>> - A FIN is never sent by the client since there is still data in its SendQ and the receiver's
>>>>>>   TCP Window is still 0. At this stage the client starts sending TCP Window Probes again
>>>>>>   and again, hoping that the server opens its TCP Window so it can flush its buffers
>>>>>>   and terminate its side of the socket.
>>>>>> - SERVER keeps responding with a TCP Zero Window to those probes.
>>>>>> => The last two steps go on and on for hours/days, freezing the NFS mount bound
>>>>>>    to that TCP session.
>>>>>>
>>>>>> If we had a situation where CLIENT was responsible for closing the TCP Window (and
>>>>>> initiating the TCP FIN first) and the server wanting to send data, we'll end up in the
>>>>>> same state as you, I think.
>>>>>>
>>>>>> We've never had the root cause of why the SERVER decided to close the TCP
>>>>>> Window and no longer accept data; the fix on the Isilon part was to recycle more
>>>>>> aggressively the FIN_WAIT_2 sockets (net.inet.tcp.fast_finwait2_recycle=1 &
>>>>>> net.inet.tcp.finwait2_timeout=5000).
>>>>>> Once the socket was recycled, at the next occurrence of a CLIENT TCP Window probe,
>>>>>> SERVER sends a RST, triggering the teardown of the session on the client side, a new
>>>>>> TCP handshake, etc., and traffic flows again (NFS starts responding).
>>>>>>
>>>>>> To avoid rebooting the client (and before the aggressive FIN_WAIT_2 recycling was
>>>>>> implemented on the Isilon side) we've added a check script on the client that detects
>>>>>> LAST_ACK sockets on the client and through an iptables rule enforces a TCP RST.
>>>>>> Something like: -A OUTPUT -p tcp -d $nfs_server_addr --sport $local_port -j REJECT
>>>>>> --reject-with tcp-reset (the script removes this iptables rule as soon as the LAST_ACK
>>>>>> disappears).
>>>>>>
>>>>>> The bottom line would be to have a packet capture during the outage (client and/or
>>>>>> server side); it will show you at least the shape of the TCP exchange when NFS is
>>>>>> stuck.
>>>>> Interesting story and good work w.r.t. sleuthing, Youssef, thanks.
>>>>>
>>>>> I looked at Jason's log and it shows everything is ok w.r.t. the nfsd threads.
>>>>> (They're just waiting for RPC requests.)
>>>>> However, I do now think I know why the soclose() does not happen.
>>>>> When the TCP connection is assigned as a backchannel, that takes a reference
>>>>> cnt on the structure. This refcnt won't be released until the connection is
>>>>> replaced by a BindConnectionToSession operation from the client. But that won't
>>>>> happen until the client creates a new TCP connection.
>>>>> --> No refcnt release --> no refcnt of 0 --> no soclose().
>>>>>
>>>>> I've created the attached patch (completely different from the previous one)
>>>>> that adds soshutdown(SHUT_WR) calls in the three places where the TCP
>>>>> connection is going away. This seems to get it past CLOSE_WAIT without a
>>>>> soclose().
>>>>> --> I know you are not comfortable with patching your server, but I do think
>>>>>     this change will get the socket shutdown to complete.
>>>>>
>>>>> There are a couple more things you can check on the server...
>>>>> # nfsstat -E -s
>>>>> --> Look for the count under "BindConnToSes".
>>>>> --> If non-zero, backchannels have been assigned.
>>>>> # sysctl -a | fgrep request_space_throttle_count
>>>>> --> If non-zero, the server has been overloaded at some point.
>>>>>
>>>>> I think the attached patch might work around the problem.
>>>>> The code that should open up the receive window needs to be checked.
>>>>> I am also looking at enabling the 6-minute timeout when a backchannel is
>>>>> assigned.
>>>>>
>>>>> rick
>>>>>
>>>>> Youssef
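To make the mechanism being described concrete, here is a minimal sketch, assuming hypothetical names (example_xprt, example_xprt_gone); the real change is the patch attached to the thread, and the actual SVCXPRT handling lives in the FreeBSD krpc sources (a real module would also need sys/param.h, sys/socket.h, sys/refcount.h, etc.):

    /*
     * Illustrative sketch only, not the actual patch.  The kernel RPC
     * cannot soclose() the socket while the backchannel still holds a
     * reference on the xprt, but it can shut down the send side
     * immediately, so our FIN goes out and the connection can progress
     * beyond CLOSE_WAIT.
     */
    struct example_xprt {                   /* stand-in for SVCXPRT */
            struct socket  *xp_socket;      /* freed by soclose() */
            volatile u_int  xp_refs;        /* nfsd + backchannel refs */
    };

    static void
    example_xprt_gone(struct example_xprt *xprt)    /* hypothetical */
    {
            /* Send our FIN now, even though soclose() must wait. */
            soshutdown(xprt->xp_socket, SHUT_WR);

            /*
             * Drop this reference.  The backchannel's reference is only
             * released later, when the client re-binds the backchannel
             * (BindConnectionToSession) to a new connection.  Only then
             * does the refcount hit zero and soclose() run.
             */
            if (refcount_release(&xprt->xp_refs))
                    soclose(xprt->xp_socket);
    }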
>>>>> _______________________________________________
>>>>> freebsd-net@freebsd.org mailing list
>>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@freebsd.org Thu Apr 8 23:37:12 2021
From: Rick Macklem
To: Peter Eriksson
Cc: tuexen@freebsd.org; Scheffenegger, Richard; Youssef GHORBAL; freebsd-net@freebsd.org
Subject: Re: NFS Mount Hangs
Date: Thu, 8 Apr 2021 23:37:09 +0000

Peter Eriksson wrote:
>Hmmm..
>
>We might have run into the same situation here between some Linux (CentOS
>7.9.2009) clients and our FreeBSD 12.2 nfs servers this evening, when a network
>switch in between the nfs server and the clients had to be rebooted, causing a
>network partitioning for a number of minutes.
>
>Not every NFS mount froze on the Linux clients but a number of them did. New
>connections/new NFS mounts worked fine, but the frozen ones stayed frozen.
>They unstuck themselves after some time (more than 6 minutes - more like an
>hour or two).
The 6-minute timeout is disabled for 4.1/4.2 mounts.
I think I did that over concerns w.r.t. maintaining the back channel.
I am planning on enabling it soon.
The 2nd/3rd attachments in PR#254590 do this (same patch, but for FreeBSD13
vs FreeBSD12).
--> I also have patches in PR#254816 to fix recovery issues that can occur
    after the partition heals.

I plan on making a post to freebsd-stable@ soon that summarizes the
patches. Doing the network partitioning testing has resulted in a bunch
of them, although most only matter if you have delegations enabled.

>Unfortunately no logs (no netstat output, no tcpdump) are available from the
>clients at this time. We are setting up some monitoring scripts, so if this
>happens again then I hope we'll be able to capture some things...
>
>I tried to check the NFS server side logs on some of our NFS servers from
>around the time of the partition, but I've yet to find something of interest,
>unfortunately...
There is nothing sent to the console/syslog when a client creates a new TCP
connection. Every normal mount does them and the reconnects do not look
any different to the kernel RPC.

If it happens again, it would be nice to at least monitor "netstat -a" on the
servers, to see what state the connections are in.

>Not really helpful, but that it self-healed after a (long) while is
>interesting, I think...
What's that saying, "patience is a virtue". I have no idea what could take
hours to get resolved?

rick

- Peter

> On 6 Apr 2021, at 01:24, Rick Macklem wrote:
>
> tuexen@freebsd.org wrote:
> [stuff snipped]
>> OK. What is the FreeBSD version you are using?
> main Dec. 23, 2020.
>
>> It seems that the TCP connection on the FreeBSD side is still alive;
>> Linux has decided to start a new TCP connection using the old
>> port numbers. So it sends a SYN. The response is a challenge ACK
>> and Linux responds with a RST. This looks good so far. However,
>> FreeBSD should accept the RST and kill the TCP connection. The
>> next SYN from the Linux side would establish a new TCP connection.
>>
>> So I'm wondering why the RST is not accepted. I made the timestamp
>> checking stricter but introduced a bug where RST segments without
>> timestamps were ignored. This was fixed.
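For readers following along, a minimal sketch of the class of check being discussed; the struct and function names are hypothetical, and this is not the actual tcp_input()/tcp_dooptions() logic, which also involves sequence number checks:

    #include <stdint.h>

    /*
     * Illustrative sketch (not FreeBSD's TCP stack): stricter timestamp
     * validation of RST segments must still accept a RST that carries
     * no timestamp option at all, or valid resets get silently dropped,
     * which was the bug described above.
     */
    struct seg_ts {
            int      present;    /* did the segment carry a TS option? */
            uint32_t val;        /* its TS value, if present */
    };

    static int
    rst_timestamp_ok(const struct seg_ts *ts, uint32_t ts_recent)
    {
            if (!ts->present)
                    return (1);  /* the fix: no timestamp != stale timestamp */
            /* Modular comparison, since TCP timestamps wrap (cf. TSTMP_GEQ). */
            return ((int32_t)(ts->val - ts_recent) >= 0);
    }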
>>
>> Introduced in main on 2020/11/09:
>> https://svnweb.freebsd.org/changeset/base/367530
>> Introduced in stable/12 on 2020/11/30:
>> https://svnweb.freebsd.org/changeset/base/36818
>> Fix in main on 2021/01/13:
>> https://cgit.FreeBSD.org/src/commit/?id=cc3c34859eab1b317d0f38731355b53f7d978c97
>> Fix in stable/12 on 2021/01/24:
>> https://cgit.FreeBSD.org/src/commit/?id=d05d908d6d3c85479c84c707f931148439ae826b
>>
>> Are you using a version which is affected by this bug?
> I was. Now I've applied the patch.
> Bad news: it did not fix the problem.
> It still gets into an endless "ignore RST" and stays ESTABLISHED when
> the Send-Q is empty.
>
> If the Send-Q is non-empty when I partition, it recovers fine,
> sometimes not even needing to see an RST.
>
> rick
> ps: If you think there might be other recent changes that matter,
> just say the word and I'll upgrade to bits du jour.
>
> rick
>
> Best regards
> Michael
>>
>> If I wait long enough before healing the partition, it will
>> go to FIN_WAIT_1, and then if I plug it back in, it does not
>> do battle (at least not for long).
>>
>> Btw, I have one running now that seems stuck really good.
>> It has been 20 minutes since I plugged the net cable back in.
>> (Unfortunately, I didn't have tcpdump running until after
>> I saw it was not progressing after healing.)
>> --> There is one difference. There was a 6-minute timeout
>>     enabled on the server krpc for "no activity", which is
>>     now disabled like it is for NFSv4.1 in freebsd-current.
>>     I had forgotten to re-disable it.
>>     So, when it does battle, it might have been the 6-minute
>>     timeout, which would then do the soshutdown(..SHUT_WR)
>>     which kept it from getting "stuck" forever.
>> --> This time I had to reboot the FreeBSD NFS server to
>>     get the Linux client unstuck, so this one looked a lot
>>     like what has been reported.
>> The pcap for this one, started after the network was plugged
>> back in and I noticed it was stuck for quite a while, is here:
>> fetch https://people.freebsd.org/~rmacklem/stuck.pcap
>>
>> In it, there is just a bunch of RSTs followed by SYNs sent
>> from client->FreeBSD, and FreeBSD just keeps sending
>> acks for the old segment back.
>> --> It looks like FreeBSD did the "RST, ACK" after the
>>     krpc did a soshutdown(..SHUT_WR) on the socket,
>>     for the one you've been looking at.
>> I'll test some more...
>>
>>> I would like to understand why the reestablishment of the connection
>>> did not work...
>> It is looking like it takes either a non-empty send-q or a
>> soshutdown(..SHUT_WR) to get the FreeBSD socket
>> out of ESTABLISHED, where it just ignores the RSTs and
>> SYN packets.
>>
>> Thanks for looking at it, rick
>>
>> Best regards
>> Michael
>>>
>>> Have fun with it, rick
>>>
>>> ________________________________________
>>> From: tuexen@freebsd.org
>>> Sent: Sunday, April 4, 2021 12:41 PM
>>> To: Rick Macklem
>>> Cc: Scheffenegger, Richard; Youssef GHORBAL; freebsd-net@freebsd.org
>>> Subject: Re: NFS Mount Hangs
>>>
>>> CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp@uoguelph.ca
>>>
>>>> On 4. Apr 2021, at 17:27, Rick Macklem wrote:
>>>>
>>>> Well, I'm going to cheat and top post, since this is related info and
>>>> not really part of the discussion...
>>>>
>>>> I've been testing network partitioning between a Linux client (5.2 kernel)
>>>> and a FreeBSD-current NFS server. I have not gotten a solid hang, but
>>>> I have had the Linux client doing "battle" with the FreeBSD server for
>>>> several minutes after un-partitioning the connection.
>>>>
>>>> The battle basically consists of the Linux client sending an RST, followed
>>>> by a SYN.
>>>> The FreeBSD server ignores the RST and just replies with the same old ack.
>>>> --> This varies from "just a SYN" that succeeds to 100+ cycles of the above
>>>>     over several minutes.
>>>>
>>>> I had thought that an RST was a "pretty heavy hammer", but FreeBSD seems
>>>> pretty good at ignoring it.
>>>>
>>>> A full packet capture of one of these is in /home/rmacklem/linuxtofreenfs.pcap
>>>> in case anyone wants to look at it.
>>> On freefall? I would like to take a look at it...
>>>
>>> Best regards
>>> Michael
>>>>
>>>> Here's a tcpdump snippet of the interesting part (see the *** comments):
>>>> 19:10:09.305775 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 202585:202749, ack 212293, win 29128, options [nop,nop,TS val 2073636037 ecr 2671204825], length 164: NFS reply xid 613153685 reply ok 160 getattr NON 4 ids 0/33554432 sz 0
>>>> 19:10:09.305850 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 202749, win 501, options [nop,nop,TS val 2671204825 ecr 2073636037], length 0
>>>> *** Network is now partitioned...
>>>>
>>>> 19:10:09.407840 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671204927 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>> 19:10:09.615779 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205135 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>> 19:10:09.823780 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205343 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>> *** Lots of lines snipped.
>>>>
>>>> 19:13:41.295783 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:13:42.319767 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:13:46.351966 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:13:47.375790 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:13:48.399786 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> *** Network is now unpartitioned...
>>>>
>>>> 19:13:48.399990 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
>>>> 19:13:48.400002 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671421871 ecr 0,nop,wscale 7], length 0
>>>> 19:13:48.400185 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073855137 ecr 2671204825], length 0
>>>> 19:13:48.400273 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
>>>> 19:13:49.423833 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671424943 ecr 0,nop,wscale 7], length 0
>>>> 19:13:49.424056 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073856161 ecr 2671204825], length 0
>>>> *** This "battle" goes on for 223sec...
>>>>     I snipped out 13 cycles of this "Linux sends an RST, followed by SYN",
>>>>     "FreeBSD replies with same old ACK". In another test run I saw this
>>>>     cycle continue non-stop for several minutes. This time, the Linux
>>>>     client paused for a while (see ARPs below).
>>>>
>>>> 19:13:49.424101 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
>>>> 19:13:53.455867 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671428975 ecr 0,nop,wscale 7], length 0
>>>> 19:13:53.455991 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073860193 ecr 2671204825], length 0
>>>> *** Snipped a bunch of stuff out, mostly ARPs, plus one more RST.
>>>>
>>>> 19:16:57.775780 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:16:57.775937 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
>>>> 19:16:57.980240 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
>>>> 19:16:58.555663 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
>>>> 19:17:00.104701 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074046846 ecr 2671204825], length 0
>>>> 19:17:15.664354 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074062406 ecr 2671204825], length 0
>>>> 19:17:31.239246 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [R.], seq 202750, ack 212293, win 0, options [nop,nop,TS val 2074077981 ecr 2671204825], length 0
>>>> *** FreeBSD finally acknowledges the RST 38sec after Linux sent the last
>>>>     of 13 (100+ for another test run).
>>>>
>>>> 19:17:51.535979 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 4247692373, win 64240, options [mss 1460,sackOK,TS val 2671667055 ecr 0,nop,wscale 7], length 0
>>>> 19:17:51.536130 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [S.], seq 661237469, ack 4247692374, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2074098278 ecr 2671667055], length 0
>>>> *** Now back in business...
>>>>
>>>> 19:17:51.536218 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 1, win 502, options [nop,nop,TS val 2671667055 ecr 2074098278], length 0
>>>> 19:17:51.536295 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 1:233, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>> 19:17:51.536346 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 233:505, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 272: NFS request xid 697039765 132 getattr fh 0,1/53
>>>> 19:17:51.536515 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 0
>>>> 19:17:51.536553 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 505:641, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098279], length 136: NFS request xid 730594197 132 getattr fh 0,1/53
>>>> 19:17:51.536562 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 1:49, ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 48: NFS reply xid 697039765 reply ok 44 getattr ERROR: unk 10063
>>>>
>>>> This error 10063 after the partition heals is also "bad news". It indicates the Session
>>>> (which is supposed to maintain "exactly once" RPC semantics) is broken. I'll admit I
>>>> suspect a Linux client bug, but will be investigating further.
>>>>
>>>> So, hopefully TCP conversant folk can confirm if the above is correct behaviour,
>>>> or if the RST should be ack'd sooner?
>>>>
>>>> I could also see this becoming a "forever" TCP battle for other versions of Linux client.
>>>>
>>>> rick
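To illustrate the "exactly once" session semantics mentioned above, a hedged sketch of how an NFSv4.1 session slot is commonly described as working (RFC 5661); all names here are hypothetical, not nfsd source, and 10063 is simply the numeric error value seen in the capture, which appears to be the sequence-misordered error from RFC 5661's error table:

    #include <stdint.h>

    /*
     * Illustrative sketch of NFSv4.1 session slot sequencing.  Each slot
     * caches the seqid (and reply) of the last request executed on it; a
     * request is either the next new one, a retry of the cached one, or
     * a sequencing error that breaks the session guarantees.
     */
    #define EX_SEQ_MISORDERED 10063         /* value seen in the capture */

    struct ex_slot {
            uint32_t    seqid;              /* seqid of last request run */
            const void *cached_reply;       /* replayed on retries */
    };

    static int
    ex_slot_sequence(struct ex_slot *slot, uint32_t seqid,
        const void **replyp)
    {
            if (seqid == slot->seqid + 1) {         /* new request */
                    slot->seqid = seqid;
                    return (0);
            }
            if (seqid == slot->seqid) {             /* retry: replay reply */
                    *replyp = slot->cached_reply;
                    return (0);
            }
            return (EX_SEQ_MISORDERED);             /* sequencing broken */
    }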
>>>>
>>>> ________________________________________
>>>> From: Scheffenegger, Richard
>>>> Sent: Sunday, April 4, 2021 7:50 AM
>>>> To: Rick Macklem; tuexen@freebsd.org
>>>> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
>>>> Subject: Re: NFS Mount Hangs
>>>>
>>>> CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp@uoguelph.ca
>>>>
>>>> For what it's worth, SUSE found two bugs in the Linux nf_conntrack (stateful
>>>> firewall) and pfifo-fast scheduler which could conspire to make TCP sessions
>>>> hang forever.
>>>>
>>>> One is a missed update when the client is not using the noresvport mount
>>>> option, which makes the firewall think RSTs are illegal (and drop them).
>>>>
>>>> The fast scheduler can run into an issue if only a single packet should be
>>>> forwarded (note that this is not the default scheduler, but often recommended
>>>> for perf, as it runs lockless and at lower CPU cost than pfq, the default).
>>>> If no other/additional packet pushes out that last packet of a flow, it can
>>>> become stuck forever...
>>>>
>>>> I can try getting the relevant bug info next week...
>>>>
>>>> ________________________________
>>>> From: owner-freebsd-net@freebsd.org on behalf of Rick Macklem
>>>> Sent: Friday, April 2, 2021 11:31:01 PM
>>>> To: tuexen@freebsd.org
>>>> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
>>>> Subject: Re: NFS Mount Hangs
>>>>
>>>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
I haven't looked at packet=0A= >>>>>> traces to see if it retries RPCs or why the errors occur.)=0A= >>>> I have now done so, as above.=0A= >>>>=0A= >>>>>> --> However I never get hangs.=0A= >>>>>> Sometimes it goes to SYN_SENT for a while and the FreeBSD server=0A= >>>>>> shows FIN_WAIT_1, but then both ends go to ESTABLISHED and the=0A= >>>>>> mount starts working again.=0A= >>>>>>=0A= >>>>>> The most obvious thing is that the Linux client always keeps using= =0A= >>>>>> the same port#. (The FreeBSD client will use a different port# when= =0A= >>>>>> it does a TCP reconnect after no response from the NFS server for=0A= >>>>>> a little while.)=0A= >>>>>>=0A= >>>>>> What do those TCP conversant think?=0A= >>>>> I guess you are you are never calling close() on the socket, for with= =0A= >>>>> the connection state is CLOSED.=0A= >>>> Ok, that makes sense. For this case the Linux client has not done a=0A= >>>> BindConnectionToSession to re-assign the back channel.=0A= >>>> I'll have to bug them about this. However, I'll bet they'll answer=0A= >>>> that I have to tell them the back channel needs re-assignment=0A= >>>> or something like that.=0A= >>>>=0A= >>>> I am pretty certain they are broken, in that the client needs to=0A= >>>> retry all outstanding RPCs.=0A= >>>>=0A= >>>> For others, here's the long winded version of this that I just=0A= >>>> put on the phabricator review:=0A= >>>> In the server side kernel RPC, the socket (struct socket *) is in a=0A= >>>> structure called SVCXPRT (normally pointed to by "xprt").=0A= >>>> These structures a ref counted and the soclose() is done=0A= >>>> when the ref. cnt goes to zero. My understanding is that=0A= >>>> "struct socket *" is free'd by soclose() so this cannot be done=0A= >>>> before the xprt ref. cnt goes to zero.=0A= >>>>=0A= >>>> For NFSv4.1/4.2 there is something called a back channel=0A= >>>> which means that a "xprt" is used for server->client RPCs,=0A= >>>> although the TCP connection is established by the client=0A= >>>> to the server.=0A= >>>> --> This back channel holds a ref cnt on "xprt" until the=0A= >>>>=0A= >>>> client re-assigns it to a different TCP connection=0A= >>>> via an operation called BindConnectionToSession=0A= >>>> and the Linux client is not doing this soon enough,=0A= >>>> it appears.=0A= >>>>=0A= >>>> So, the soclose() is delayed, which is why I think the=0A= >>>> TCP connection gets stuck in CLOSE_WAIT and that is=0A= >>>> why I've added the soshutdown(..SHUT_WR) calls,=0A= >>>> which can happen before the client gets around to=0A= >>>> re-assigning the back channel.=0A= >>>>=0A= >>>> Thanks for your help with this Michael, rick=0A= >>>>=0A= >>>> Best regards=0A= >>>> Michael=0A= >>>>>=0A= >>>>> rick=0A= >>>>> ps: I can capture packets while doing this, if anyone has a use=0A= >>>>> for them.=0A= >>>>>=0A= >>>>>=0A= >>>>>=0A= >>>>>=0A= >>>>>=0A= >>>>>=0A= >>>>> ________________________________________=0A= >>>>> From: owner-freebsd-net@freebsd.org o= n behalf of Youssef GHORBAL =0A= >>>>> Sent: Saturday, March 27, 2021 6:57 PM=0A= >>>>> To: Jason Breitman=0A= >>>>> Cc: Rick Macklem; freebsd-net@freebsd.org=0A= >>>>> Subject: Re: NFS Mount Hangs=0A= >>>>>=0A= >>>>> CAUTION: This email originated from outside of the University of Guel= ph. Do not click links or open attachments unless you recognize the sender = and know the content is safe. 
If in doubt, forward suspicious emails to ITh= elp@uoguelph.ca=0A= >>>>>=0A= >>>>>=0A= >>>>>=0A= >>>>>=0A= >>>>> On 27 Mar 2021, at 13:20, Jason Breitman > wrote:=0A= >>>>>=0A= >>>>> The issue happened again so we can say that disabling TSO and LRO on = the NIC did not resolve this issue.=0A= >>>>> # ifconfig lagg0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwt= so=0A= >>>>> # ifconfig lagg0=0A= >>>>> lagg0: flags=3D8943 m= etric 0 mtu 1500=0A= >>>>> options=3D8100b8=0A= >>>>>=0A= >>>>> We can also say that the sysctl settings did not resolve this issue.= =0A= >>>>>=0A= >>>>> # sysctl net.inet.tcp.fast_finwait2_recycle=3D1=0A= >>>>> net.inet.tcp.fast_finwait2_recycle: 0 -> 1=0A= >>>>>=0A= >>>>> # sysctl net.inet.tcp.finwait2_timeout=3D1000=0A= >>>>> net.inet.tcp.finwait2_timeout: 60000 -> 1000=0A= >>>>>=0A= >>>>> I don=92t think those will do anything in your case since the FIN_WAI= T2 are on the client side and those sysctls are for BSD.=0A= >>>>> By the way it seems that Linux recycles automatically TCP sessions in= FIN_WAIT2 after 60 seconds (sysctl net.ipv4.tcp_fin_timeout)=0A= >>>>>=0A= >>>>> tcp_fin_timeout (integer; default: 60; since Linux 2.2)=0A= >>>>> This specifies how many seconds to wait for a final FIN=0A= >>>>> packet before the socket is forcibly closed. This is=0A= >>>>> strictly a violation of the TCP specification, but=0A= >>>>> required to prevent denial-of-service attacks. In Linux=0A= >>>>> 2.2, the default value was 180.=0A= >>>>>=0A= >>>>> So I don=92t get why it stucks in the FIN_WAIT2 state anyway.=0A= >>>>>=0A= >>>>> You really need to have a packet capture during the outage (client an= d server side) so you=92ll get over the wire chat and start speculating fro= m there.=0A= >>>>> No need to capture the beginning of the outage for now. 
All you have = to do, is run a tcpdump for 10 minutes or so when you notice a client stuck= .=0A= >>>>>=0A= >>>>> * I have not rebooted the NFS Server nor have I restarted nfsd, but d= o not believe that is required as these settings are at the TCP level and I= would expect new sessions to use the updated settings.=0A= >>>>>=0A= >>>>> The issue occurred after 5 days following a reboot of the client mach= ines.=0A= >>>>> I ran the capture information again to make use of the situation.=0A= >>>>>=0A= >>>>> #!/bin/sh=0A= >>>>>=0A= >>>>> while true=0A= >>>>> do=0A= >>>>> /bin/date >> /tmp/nfs-hang.log=0A= >>>>> /bin/ps axHl | grep nfsd | grep -v grep >> /tmp/nfs-hang.log=0A= >>>>> /usr/bin/procstat -kk 2947 >> /tmp/nfs-hang.log=0A= >>>>> /usr/bin/procstat -kk 2944 >> /tmp/nfs-hang.log=0A= >>>>> /bin/sleep 60=0A= >>>>> done=0A= >>>>>=0A= >>>>>=0A= >>>>> On the NFS Server=0A= >>>>> Active Internet connections (including servers)=0A= >>>>> Proto Recv-Q Send-Q Local Address Foreign Address (st= ate)=0A= >>>>> tcp4 0 0 NFS.Server.IP.X.2049 NFS.Client.IP.X.48286 = CLOSE_WAIT=0A= >>>>>=0A= >>>>> On the NFS Client=0A= >>>>> tcp 0 0 NFS.Client.IP.X:48286 NFS.Server.IP.X:2049 = FIN_WAIT2=0A= >>>>>=0A= >>>>>=0A= >>>>>=0A= >>>>> You had also asked for the output below.=0A= >>>>>=0A= >>>>> # nfsstat -E -s=0A= >>>>> BackChannelCtBindConnToSes=0A= >>>>> 0 0=0A= >>>>>=0A= >>>>> # sysctl vfs.nfsd.request_space_throttle_count=0A= >>>>> vfs.nfsd.request_space_throttle_count: 0=0A= >>>>>=0A= >>>>> I see that you are testing a patch and I look forward to seeing the r= esults.=0A= >>>>>=0A= >>>>>=0A= >>>>> Jason Breitman=0A= >>>>>=0A= >>>>>=0A= >>>>> On Mar 21, 2021, at 6:21 PM, Rick Macklem > wrote:=0A= >>>>>=0A= >>>>> Youssef GHORBAL > wrote:=0A= >>>>>> Hi Jason,=0A= >>>>>>=0A= >>>>>>> On 17 Mar 2021, at 18:17, Jason Breitman > wrote:=0A= >>>>>>>=0A= >>>>>>> Please review the details below and let me know if there is a setti= ng that I should apply to my FreeBSD NFS Server or if there is a bug fix th= at I can apply to resolve my issue.=0A= >>>>>>> I shared this information with the linux-nfs mailing list and they = believe the issue is on the server side.=0A= >>>>>>>=0A= >>>>>>> Issue=0A= >>>>>>> NFSv4 mounts periodically hang on the NFS Client.=0A= >>>>>>>=0A= >>>>>>> During this time, it is possible to manually mount from another NFS= Server on the NFS Client having issues.=0A= >>>>>>> Also, other NFS Clients are successfully mounting from the NFS Serv= er in question.=0A= >>>>>>> Rebooting the NFS Client appears to be the only solution.=0A= >>>>>>=0A= >>>>>> I had experienced a similar weird situation with periodically stuck = Linux NFS clients >mounting Isilon NFS servers (Isilon is FreeBSD based but= they seem to have there >own nfsd)=0A= >>>>> Yes, my understanding is that Isilon uses a proprietary user space nf= sd and=0A= >>>>> not the kernel based RPC and nfsd in FreeBSD.=0A= >>>>>=0A= >>>>>> We=92ve had better luck and we did manage to have packet captures on= both sides >during the issue. 
The gist of it goes like follows:=0A= >>>>>>=0A= >>>>>> - Data flows correctly between SERVER and the CLIENT=0A= >>>>>> - At some point SERVER starts decreasing it's TCP Receive Window unt= il it reachs 0=0A= >>>>>> - The client (eager to send data) can only ack data sent by SERVER.= =0A= >>>>>> - When SERVER was done sending data, the client starts sending TCP W= indow >Probes hoping that the TCP Window opens again so he can flush its bu= ffers.=0A= >>>>>> - SERVER responds with a TCP Zero Window to those probes.=0A= >>>>> Having the window size drop to zero is not necessarily incorrect.=0A= >>>>> If the server is overloaded (has a backlog of NFS requests), it can s= top doing=0A= >>>>> soreceive() on the socket (so the socket rcv buffer can fill up and t= he TCP window=0A= >>>>> closes). This results in "backpressure" to stop the NFS client from f= looding the=0A= >>>>> NFS server with requests.=0A= >>>>> --> However, once the backlog is handled, the nfsd should start to so= receive()=0A= >>>>> again and this shouls cause the window to open back up.=0A= >>>>> --> Maybe this is broken in the socket/TCP code. I quickly got lost i= n=0A= >>>>> tcp_output() when it decides what to do about the rcvwin.=0A= >>>>>=0A= >>>>>> - After 6 minutes (the NFS server default Idle timeout) SERVER racef= ully closes the >TCP connection sending a FIN Packet (and still a TCP Windo= w 0)=0A= >>>>> This probably does not happen for Jason's case, since the 6minute tim= eout=0A= >>>>> is disabled when the TCP connection is assigned as a backchannel (mos= t likely=0A= >>>>> the case for NFSv4.1).=0A= >>>>>=0A= >>>>>> - CLIENT ACK that FIN.=0A= >>>>>> - SERVER goes in FIN_WAIT_2 state=0A= >>>>>> - CLIENT closes its half part part of the socket and goes in LAST_AC= K state.=0A= >>>>>> - FIN is never sent by the client since there still data in its Send= Q and receiver TCP >Window is still 0. At this stage the client starts send= ing TCP Window Probes again >and again hoping that the server opens its TCP= Window so it can flush it's buffers >and terminate its side of the socket.= =0A= >>>>>> - SERVER keeps responding with a TCP Zero Window to those probes.=0A= >>>>>> =3D> The last two steps goes on and on for hours/days freezing the N= FS mount bound >to that TCP session.=0A= >>>>>>=0A= >>>>>> If we had a situation where CLIENT was responsible for closing the T= CP Window (and >initiating the TCP FIN first) and server wanting to send da= ta we=92ll end up in the same >state as you I think.=0A= >>>>>>=0A= >>>>>> We=92ve never had the root cause of why the SERVER decided to close = the TCP >Window and no more acccept data, the fix on the Isilon part was to= recycle more >aggressively the FIN_WAIT_2 sockets (net.inet.tcp.fast_finwa= it2_recycle=3D1 & >net.inet.tcp.finwait2_timeout=3D5000). 
Once the socket r= ecycled and at the next >occurence of CLIENT TCP Window probe, SERVER sends= a RST, triggering the >teardown of the session on the client side, a new T= CP handchake, etc and traffic >flows again (NFS starts responding)=0A= >>>>>>=0A= >>>>>> To avoid rebooting the client (and before the aggressive FIN_WAIT_2 = was >implemented on the Isilon side) we=92ve added a check script on the cl= ient that detects >LAST_ACK sockets on the client and through iptables rule= enforces a TCP RST, >Something like: -A OUTPUT -p tcp -d $nfs_server_addr = --sport $local_port -j REJECT >--reject-with tcp-reset (the script removes = this iptables rule as soon as the LAST_ACK >disappears)=0A= >>>>>>=0A= >>>>>> The bottom line would be to have a packet capture during the outage = (client and/or >server side), it will show you at least the shape of the TC= P exchange when NFS is >stuck.=0A= >>>>> Interesting story and good work w.r.t. sluething, Youssef, thanks.=0A= >>>>>=0A= >>>>> I looked at Jason's log and it shows everything is ok w.r.t the nfsd = threads.=0A= >>>>> (They're just waiting for RPC requests.)=0A= >>>>> However, I do now think I know why the soclose() does not happen.=0A= >>>>> When the TCP connection is assigned as a backchannel, that takes a re= ference=0A= >>>>> cnt on the structure. This refcnt won't be released until the connect= ion is=0A= >>>>> replaced by a BindConnectiotoSession operation from the client. But t= hat won't=0A= >>>>> happen until the client creates a new TCP connection.=0A= >>>>> --> No refcnt release-->no refcnt of 0-->no soclose().=0A= >>>>>=0A= >>>>> I've created the attached patch (completely different from the previo= us one)=0A= >>>>> that adds soshutdown(SHUT_WR) calls in the three places where the TCP= =0A= >>>>> connection is going away. 
This seems to get it past CLOSE_WAIT withou= t a=0A= >>>>> soclose().=0A= >>>>> --> I know you are not comfortable with patching your server, but I d= o think=0A= >>>>> this change will get the socket shutdown to complete.=0A= >>>>>=0A= >>>>> There are a couple more things you can check on the server...=0A= >>>>> # nfsstat -E -s=0A= >>>>> --> Look for the count under "BindConnToSes".=0A= >>>>> --> If non-zero, backchannels have been assigned=0A= >>>>> # sysctl -a | fgrep request_space_throttle_count=0A= >>>>> --> If non-zero, the server has been overloaded at some point.=0A= >>>>>=0A= >>>>> I think the attached patch might work around the problem.=0A= >>>>> The code that should open up the receive window needs to be checked.= =0A= >>>>> I am also looking at enabling the 6minute timeout when a backchannel = is=0A= >>>>> assigned.=0A= >>>>>=0A= >>>>> rick=0A= >>>>>=0A= >>>>> Youssef=0A= >>>>>=0A= >>>>> _______________________________________________=0A= >>>>> freebsd-net@freebsd.org mailing list= =0A= >>>>> https://urldefense.com/v3/__https://lists.freebsd.org/mailman/listinf= o/freebsd-net__;!!JFdNOqOXpB6UZW0!_c2MFNbir59GXudWPVdE5bNBm-qqjXeBuJ2UEmFv5= OZciLj4ObR_drJNv5yryaERfIbhKR2d$=0A= >>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org= "=0A= >>>>> =0A= >>>>>=0A= >>>>> =0A= >>>>>=0A= >>>>> _______________________________________________=0A= >>>>> freebsd-net@freebsd.org mailing list=0A= >>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= >>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org= "=0A= >>>>> _______________________________________________=0A= >>>>> freebsd-net@freebsd.org mailing list=0A= >>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= >>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org= "=0A= >>>>=0A= >>>> _______________________________________________=0A= >>>> freebsd-net@freebsd.org mailing list=0A= >>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= >>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"= =0A= >>>> _______________________________________________=0A= >>>> freebsd-net@freebsd.org mailing list=0A= >>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= >>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"= =0A= >>>=0A= >>=0A= >=0A= > _______________________________________________=0A= > freebsd-net@freebsd.org mailing list=0A= > https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"=0A= > _______________________________________________=0A= > freebsd-net@freebsd.org mailing list=0A= > https://lists.freebsd.org/mailman/listinfo/freebsd-net =0A= > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org "=0A= =0A= _______________________________________________=0A= freebsd-net@freebsd.org mailing list=0A= https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"=0A= From owner-freebsd-net@freebsd.org Fri Apr 9 08:26:30 2021 Return-Path: Delivered-To: freebsd-net@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id D8F135CFA9E for ; Fri, 9 Apr 2021 08:26:30 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from mailman.nyi.freebsd.org (mailman.nyi.freebsd.org [IPv6:2610:1c1:1:606c::50:13]) by mx1.freebsd.org (Postfix) with ESMTP 
From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 254333] [tcp] sysctl net.inet.tcp.hostcache.list hangs
Date: Fri, 09 Apr 2021 08:26:29 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254333

--- Comment #26 from Maxim Shalomikhin ---

We have one more machine with hanging sysctl.

# sysctl net.inet.tcp.hostcache
net.inet.tcp.hostcache.purgenow: 0
net.inet.tcp.hostcache.purge: 0
net.inet.tcp.hostcache.prune: 300
net.inet.tcp.hostcache.expire: 3600
net.inet.tcp.hostcache.count: 4294961495
net.inet.tcp.hostcache.bucketlimit: 30
net.inet.tcp.hostcache.hashsize: 65536
net.inet.tcp.hostcache.cachelimit: 1966080
net.inet.tcp.hostcache.enable: 1

# netstat -sptcp
...
195494221 hostcache entries added
88163 bucket overflow
...

loader.conf:
accf_data_load="YES"
accf_dns_load="YES"
accf_http_load="YES"
net.inet.tcp.tcbhashsize=131072
net.inet.tcp.syncache.hashsize=65536
net.inet.tcp.syncache.cachelimit=1966080
net.inet.tcp.hostcache.hashsize=65536
net.inet.tcp.hostcache.cachelimit=1966080

sysctl.conf:
net.inet.icmp.drop_redirect=1
net.inet.icmp.icmplim=2000
net.inet.icmp.icmplim_output=1
net.inet.ip.fw.dyn_buckets=2048
net.inet.ip.fw.dyn_max=100000
net.inet.tcp.blackhole=2
net.inet.tcp.drop_synfin=1
net.inet.tcp.fast_finwait2_recycle=1
net.inet.tcp.msl=10000
net.inet.udp.blackhole=1

The issue reproduces every 3-4 months on each of 60 servers in different
locations. All servers are Dell/IBM with different hw specs, but all with ECC
RAM, so I don't think this is a HW issue. Please let me know any other
information I can collect.

--
You are receiving this mail because:
You are on the CC list for the bug.

From owner-freebsd-net@freebsd.org Fri Apr 9 08:56:39 2021
--- Comment #27 from Richard Scheffenegger ---
The "hang" is due to a waiting malloc requesting much more memory than any machine will currently have.

The root cause for that, however, is a race condition (likely to hit busy servers), fixed with D29522 (will apply cleanly after D29510).

If you are able to patch your systems, that will address this issue. (The full set of these related hostcache improvements will be backported in about a week from now to stable/13, stable/12 and stable/11.)
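The arithmetic behind that waiting malloc: a list-style sysctl handler sizes its output buffer from the entry count, so an underflowed counter read as unsigned asks for roughly half a terabyte, and a sleeping allocation of that size never returns. A rough illustration (the 128-byte per-entry line size is an assumption, not the real kernel constant):

    #include <stdio.h>
    #include <stdint.h>

    #define LINELEN 128  /* assumed bytes per formatted hostcache entry */

    int main(void) {
        int32_t count = -5801;               /* underflowed counter */
        uint64_t entries = (uint32_t)count;  /* unsigned view: 4294961495 */
        uint64_t bytes = entries * LINELEN;
        printf("buffer request: %llu bytes (~%llu GiB)\n",
               (unsigned long long)bytes,
               (unsigned long long)(bytes >> 30));
        /* ~511 GiB: a malloc that sleeps waiting for this much free
         * memory is the observed sysctl hang */
        return 0;
    }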
--
You are receiving this mail because:
You are on the CC list for the bug.

From owner-freebsd-net@freebsd.org Sat Apr 10 00:44:09 2021
From: Rick Macklem <rmacklem@uoguelph.ca>
To: "tuexen@freebsd.org"
Cc: "Scheffenegger, Richard", Youssef GHORBAL, "freebsd-net@freebsd.org"
Subject: Re: NFS Mount Hangs
Date: Sat, 10 Apr 2021 00:44:06 +0000
tuexen@freebsd.org wrote:
>> On 6. Apr 2021, at 01:24, Rick Macklem wrote:
>>
>> tuexen@freebsd.org wrote:
>> [stuff snipped]
>>> OK. What is the FreeBSD version you are using?
>> main Dec. 23, 2020.
>>
>>> It seems that the TCP connection on the FreeBSD is still alive,
>>> Linux has decided to start a new TCP connection using the old
>>> port numbers. So it sends a SYN. The response is a challenge ACK
>>> and Linux responds with a RST. This looks good so far. However,
>>> FreeBSD should accept the RST and kill the TCP connection. The
>>> next SYN from the Linux side would establish a new TCP connection.
>>>
>>> So I'm wondering why the RST is not accepted. I made the timestamp
>>> checking stricter but introduced a bug where RST segments without
>>> timestamps were ignored. This was fixed.
>>>
>>> Introduced in main on 2020/11/09:
>>> https://svnweb.freebsd.org/changeset/base/367530
>>> Introduced in stable/12 on 2020/11/30:
>>> https://svnweb.freebsd.org/changeset/base/36818
>>> Fix in main on 2021/01/13:
>>> https://cgit.FreeBSD.org/src/commit/?id=cc3c34859eab1b317d0f38731355b53f7d978c97
>>> Fix in stable/12 on 2021/01/24:
>>> https://cgit.FreeBSD.org/src/commit/?id=d05d908d6d3c85479c84c707f931148439ae826b
>>>
>>> Are you using a version which is affected by this bug?
>> I was. Now I've applied the patch.
>> Bad News. It did not fix the problem.
>> It still gets into an endless "ignore RST" and stays established when
>> the Send-Q is empty.
> OK. Let us focus on this case.
>
> Could you:
> 1. sudo sysctl net.inet.tcp.log_debug=1
> 2. repeat the situation where RSTs are ignored.
> 3. check if there is some output on the console (/var/log/messages).
> 4. Either provide the output or let me know that there is none.
Well, I have some good news and some bad news (the bad is mostly for Richard).
The only message logged is:
tcpflags 0x4; tcp_do_segment: Timestamp missing, segment processed normally

But...the RST battle no longer occurs. Just one RST that works and then
the SYN gets SYN,ACK'd by the FreeBSD end and off it goes...

So, what is different?

r367492 is reverted from the FreeBSD server.
I did the revert because I think it might be what otis@'s hang is being
caused by. (In his case, the Recv-Q grows on the socket for the
stuck Linux client, while others work.)

Why does reverting fix this?
My only guess is that the krpc gets the upcall right away and sees
an EPIPE when it does soreceive(), which results in soshutdown(SHUT_WR).
I know from a printf that this happened, but whether it caused the
RST battle to not happen, I don't know.
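For readers not following the krpc path being guessed at here: the idea is that the socket upcall fires promptly, the receive attempt surfaces the error, and the resulting write-side shutdown is what lets TCP leave ESTABLISHED. A userland analogue of that reaction (illustrative only; in the kernel this would be soreceive()/soshutdown() called from the krpc socket upcall):

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <errno.h>

    /* Sketch: on any hard receive error (or EOF), stop sending and
     * let TCP emit our FIN, so the connection can leave ESTABLISHED. */
    static void
    rpc_upcall_sketch(int fd)
    {
        char buf[4096];
        ssize_t n = recv(fd, buf, sizeof(buf), MSG_DONTWAIT);
        if (n == 0 || (n < 0 && errno != EAGAIN && errno != EINTR))
            shutdown(fd, SHUT_WR);
    }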
I can put r367492 back in and do more testing if you'd like, but
I think it probably needs to be reverted?

This does not explain the original hung Linux client problem,
but does shed light on the RST war I could create by doing a
network partitioning.

rick

Best regards
Michael
>
> If the Send-Q is non-empty when I partition, it recovers fine,
> sometimes not even needing to see an RST.
>
> rick
> ps: If you think there might be other recent changes that matter,
> just say the word and I'll upgrade to bits de jur.
>
> rick
>
> Best regards
> Michael
>>
>> If I wait long enough before healing the partition, it will
>> go to FIN_WAIT_1, and then if I plug it back in, it does not
>> do battle (at least not for long).
>>
>> Btw, I have one running now that seems stuck really good.
>> It has been 20 minutes since I plugged the net cable back in.
>> (Unfortunately, I didn't have tcpdump running until after
>> I saw it was not progressing after healing.)
>> --> There is one difference. There was a 6-minute timeout
>> enabled on the server krpc for "no activity", which is
>> now disabled like it is for NFSv4.1 in freebsd-current.
>> I had forgotten to re-disable it.
>> So, when it does battle, it might have been the 6-minute
>> timeout, which would then do the soshutdown(..SHUT_WR)
>> which kept it from getting "stuck" forever.
>> --> This time I had to reboot the FreeBSD NFS server to
>> get the Linux client unstuck, so this one looked a lot
>> like what has been reported.
>> The pcap for this one, started after the network was plugged
>> back in and I noticed it was stuck for quite a while, is here:
>> fetch https://people.freebsd.org/~rmacklem/stuck.pcap
>>
>> In it, there is just a bunch of RSTs followed by SYNs sent
>> from client->FreeBSD, and FreeBSD just keeps sending
>> acks for the old segment back.
>> --> It looks like FreeBSD did the "RST, ACK" after the
>> krpc did a soshutdown(..SHUT_WR) on the socket,
>> for the one you've been looking at.
>> I'll test some more...
>>
>>> I would like to understand why the reestablishment of the connection
>>> did not work...
>> It is looking like it takes either a non-empty send-q or a
>> soshutdown(..SHUT_WR) to get the FreeBSD socket
>> out of established, where it just ignores the RSTs and
>> SYN packets.
>>
>> Thanks for looking at it, rick
>>
>> Best regards
>> Michael
>>>
>>> Have fun with it, rick
>>>
>>> ________________________________________
>>> From: tuexen@freebsd.org
>>> Sent: Sunday, April 4, 2021 12:41 PM
>>> To: Rick Macklem
>>> Cc: Scheffenegger, Richard; Youssef GHORBAL; freebsd-net@freebsd.org
>>> Subject: Re: NFS Mount Hangs
>>>
>>> CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp@uoguelph.ca
>>>
>>>> On 4. Apr 2021, at 17:27, Rick Macklem wrote:
>>>>
>>>> Well, I'm going to cheat and top post, since this is related info and
>>>> not really part of the discussion...
>>>>
>>>> I've been testing network partitioning between a Linux client (5.2 kernel)
>>>> and a FreeBSD-current NFS server. I have not gotten a solid hang, but
>>>> I have had the Linux client doing "battle" with the FreeBSD server for
>>>> several minutes after un-partitioning the connection.
>>>>
>>>> The battle basically consists of the Linux client sending an RST, followed
>>>> by a SYN.
>>>> The FreeBSD server ignores the RST and just replies with the same old ack.
>>>> --> This varies from "just a SYN" that succeeds to 100+ cycles of the above
>>>> over several minutes.
>>>>
>>>> I had thought that an RST was a "pretty heavy hammer", but FreeBSD seems
>>>> pretty good at ignoring it.
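What should happen, per the challenge-ACK description earlier in the thread, is: a SYN on an ESTABLISHED connection gets a challenge ACK, and the RST the peer sends in reply is accepted when its sequence number matches exactly, killing the old connection. A toy model of that expected exchange (all values illustrative; the battle described above means the final "accept" step is the part not happening):

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        uint32_t rcv_nxt = 212293;   /* old connection's expected seq */

        /* 1. rebooted peer sends SYN: reply is a challenge ACK,
         *    and the connection stays ESTABLISHED */
        printf("SYN -> challenge ACK (ack %u)\n", rcv_nxt);

        /* 2. peer answers the challenge with an RST whose seq matches
         *    rcv_nxt exactly: the old connection should be dropped */
        uint32_t rst_seq = rcv_nxt;
        if (rst_seq == rcv_nxt)
            printf("RST seq %u == rcv_nxt -> drop connection\n", rst_seq);
        return 0;
    }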
>>>>
>>>> A full packet capture of one of these is in /home/rmacklem/linuxtofreenfs.pcap
>>>> in case anyone wants to look at it.
>>> On freefall? I would like to take a look at it...
>>>
>>> Best regards
>>> Michael
>>>>
>>>> Here's a tcpdump snippet of the interesting part (see the *** comments):
>>>> 19:10:09.305775 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 202585:202749, ack 212293, win 29128, options [nop,nop,TS val 2073636037 ecr 2671204825], length 164: NFS reply xid 613153685 reply ok 160 getattr NON 4 ids 0/33554432 sz 0
>>>> 19:10:09.305850 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 202749, win 501, options [nop,nop,TS val 2671204825 ecr 2073636037], length 0
>>>> *** Network is now partitioned...
>>>>
>>>> 19:10:09.407840 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671204927 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>> 19:10:09.615779 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205135 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>> 19:10:09.823780 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205343 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>> *** Lots of lines snipped.
>>>>
>>>> 19:13:41.295783 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:13:42.319767 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:13:46.351966 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:13:47.375790 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:13:48.399786 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> *** Network is now unpartitioned...
>>>>
>>>> 19:13:48.399990 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
>>>> 19:13:48.400002 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671421871 ecr 0,nop,wscale 7], length 0
>>>> 19:13:48.400185 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073855137 ecr 2671204825], length 0
>>>> 19:13:48.400273 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
>>>> 19:13:49.423833 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671424943 ecr 0,nop,wscale 7], length 0
>>>> 19:13:49.424056 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073856161 ecr 2671204825], length 0
>>>> *** This "battle" goes on for 223sec...
>>>> I snipped out 13 cycles of this "Linux sends an RST, followed by SYN"
>>>> "FreeBSD replies with same old ACK". In another test run I saw this
>>>> cycle continue non-stop for several minutes. This time, the Linux
>>>> client paused for a while (see ARPs below).
>>>>
>>>> 19:13:49.424101 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
>>>> 19:13:53.455867 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671428975 ecr 0,nop,wscale 7], length 0
>>>> 19:13:53.455991 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073860193 ecr 2671204825], length 0
>>>> *** Snipped a bunch of stuff out, mostly ARPs, plus one more RST.
>>>>
>>>> 19:16:57.775780 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>> 19:16:57.775937 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
>>>> 19:16:57.980240 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
>>>> 19:16:58.555663 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
>>>> 19:17:00.104701 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074046846 ecr 2671204825], length 0
>>>> 19:17:15.664354 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074062406 ecr 2671204825], length 0
>>>> 19:17:31.239246 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [R.], seq 202750, ack 212293, win 0, options [nop,nop,TS val 2074077981 ecr 2671204825], length 0
>>>> *** FreeBSD finally acknowledges the RST 38sec after Linux sent the last
>>>> of 13 (100+ for another test run).
>>>>
>>>> 19:17:51.535979 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 4247692373, win 64240, options [mss 1460,sackOK,TS val 2671667055 ecr 0,nop,wscale 7], length 0
>>>> 19:17:51.536130 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [S.], seq 661237469, ack 4247692374, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2074098278 ecr 2671667055], length 0
>>>> *** Now back in business...
>>>>
>>>> 19:17:51.536218 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 1, win 502, options [nop,nop,TS val 2671667055 ecr 2074098278], length 0
>>>> 19:17:51.536295 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 1:233, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>> 19:17:51.536346 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 233:505, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 272: NFS request xid 697039765 132 getattr fh 0,1/53
>>>> 19:17:51.536515 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 0
>>>> 19:17:51.536553 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 505:641, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098279], length 136: NFS request xid 730594197 132 getattr fh 0,1/53
>>>> 19:17:51.536562 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 1:49, ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 48: NFS reply xid 697039765 reply ok 44 getattr ERROR: unk 10063
>>>>
>>>> This error 10063 after the partition heals is also "bad news". It indicates the Session
>>>> (which is supposed to maintain "exactly once" RPC semantics) is broken. I'll admit I
>>>> suspect a Linux client bug, but will be investigating further.
>>>>
>>>> So, hopefully TCP conversant folk can confirm if the above is correct behaviour
>>>> or if the RST should be ack'd sooner?
>>>>
>>>> I could also see this becoming a "forever" TCP battle for other versions of Linux client.
>>>>
>>>> rick
>>>>
>>>> ________________________________________
>>>> From: Scheffenegger, Richard
>>>> Sent: Sunday, April 4, 2021 7:50 AM
>>>> To: Rick Macklem; tuexen@freebsd.org
>>>> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
>>>> Subject: Re: NFS Mount Hangs
>>>>
>>>> CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp@uoguelph.ca
>>>>
>>>> For what it's worth, suse found two bugs in the linux nfconntrack (stateful firewall), and pfifo-fast scheduler, which could conspire to make tcp sessions hang forever.
>>>>
>>>> One is a missed update when the client is not using the noresvport mount option, which makes the firewall think rsts are illegal (and drop them);
>>>>
>>>> The fast scheduler can run into an issue if only a single packet should be forwarded (note that this is not the default scheduler, but often recommended for perf, as it runs lockless and at lower cpu cost than pfq (default)). If no other/additional packet pushes out that last packet of a flow, it can become stuck forever...
>>>>
>>>> I can try getting the relevant bug info next week...
>>>>
>>>> ________________________________
>>>> From: owner-freebsd-net@freebsd.org on behalf of Rick Macklem
>>>> Sent: Friday, April 2, 2021 11:31:01 PM
>>>> To: tuexen@freebsd.org
>>>> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
>>>> Subject: Re: NFS Mount Hangs
>>>>
>>>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>>>
>>>> tuexen@freebsd.org wrote:
>>>>>> On 2. Apr 2021, at 02:07, Rick Macklem wrote:
>>>>>>
>>>>>> I hope you don't mind a top post...
>>>>>> I've been testing network partitioning between the only Linux client
>>>>>> I have (5.2 kernel) and a FreeBSD server with the xprtdied.patch
>>>>>> (does soshutdown(..SHUT_WR) when it knows the socket is broken)
>>>>>> applied to it.
>>>>>>
>>>>>> I'm not enough of a TCP guy to know if this is useful, but here's what
>>>>>> I see...
>>>>>>
>>>>>> While partitioned:
>>>>>> On the FreeBSD server end, the socket either goes to CLOSED during
>>>>>> the network partition or stays ESTABLISHED.
>>>>> If it goes to CLOSED you called shutdown(, SHUT_WR) and the peer also
>>>>> sent a FIN, but you never called close() on the socket.
>>>>> If the socket stays in ESTABLISHED, there is no communication ongoing,
>>>>> I guess, and therefore the server does not even detect that the peer
>>>>> is not reachable.
>>>>>> On the Linux end, the socket seems to remain ESTABLISHED for a
>>>>>> little while, and then disappears.
>>>>> So how does Linux detect the peer is not reachable?
>>>> Well, here's what I see in a packet capture in the Linux client once
>>>> I partition it (just unplug the net cable):
>>>> - lots of retransmits of the same segment (with ACK) for 54sec
>>>> - then only ARP queries
>>>>
>>>> Once I plug the net cable back in:
>>>> - ARP works
>>>> - one more retransmit of the same segment
>>>> - receives RST from FreeBSD
>>>> ** So, is this now a "new" TCP connection, despite
>>>> using the same port#?
>>>> --> It matters for NFS, since "new connection"
>>>> implies "must retry all outstanding RPCs".
>>>> - sends SYN
>>>> - receives SYN, ACK from FreeBSD
>>>> --> connection starts working again
>>>> Always uses same port#.
>>>>
>>>> On the FreeBSD server end:
>>>> - receives the last retransmit of the segment (with ACK)
>>>> - sends RST
>>>> - receives SYN
>>>> - sends SYN, ACK
>>>>
>>>> I thought that there was no RST in the capture I looked at
>>>> yesterday, so I'm not sure if FreeBSD always sends an RST,
>>>> but the Linux client behaviour was the same. (Sent a SYN, etc.)
>>>> The socket disappears from the Linux "netstat -a" and I
>>>> suspect that happens after about 54sec, but I am not sure
>>>> about the timing.
>>>>
>>>>>> After unpartitioning:
>>>>>> On the FreeBSD server end, you get another socket showing up at
>>>>>> the same port#
>>>>>> Active Internet connections (including servers)
>>>>>> Proto Recv-Q Send-Q Local Address    Foreign Address  (state)
>>>>>> tcp4  0      0      nfsv4-new3.nfsd  nfsv4-linux.678  ESTABLISHED
>>>>>> tcp4  0      0      nfsv4-new3.nfsd  nfsv4-linux.678  CLOSED
>>>>>>
>>>>>> The Linux client shows the same connection ESTABLISHED.
>>>> But disappears from "netstat -a" for a while during the partitioning.
>>>>
>>>>>> (The mount sometimes reports an error. I haven't looked at packet
>>>>>> traces to see if it retries RPCs or why the errors occur.)
>>>> I have now done so, as above.
>>>>
>>>>>> --> However I never get hangs.
>>>>>> Sometimes it goes to SYN_SENT for a while and the FreeBSD server
>>>>>> shows FIN_WAIT_1, but then both ends go to ESTABLISHED and the
>>>>>> mount starts working again.
>>>>>>
>>>>>> The most obvious thing is that the Linux client always keeps using
>>>>>> the same port#. (The FreeBSD client will use a different port# when
>>>>>> it does a TCP reconnect after no response from the NFS server for
>>>>>> a little while.)
>>>>>>
>>>>>> What do those TCP conversants think?
>>>>> I guess you are never calling close() on the socket for which
>>>>> the connection state is CLOSED.
>>>> Ok, that makes sense. For this case the Linux client has not done a
>>>> BindConnectionToSession to re-assign the back channel.
>>>> I'll have to bug them about this. However, I'll bet they'll answer
>>>> that I have to tell them the back channel needs re-assignment
>>>> or something like that.
>>>>
>>>> I am pretty certain they are broken, in that the client needs to
>>>> retry all outstanding RPCs.
>>>>
>>>> For others, here's the long winded version of this that I just
>>>> put on the phabricator review:
>>>> In the server side kernel RPC, the socket (struct socket *) is in a
>>>> structure called SVCXPRT (normally pointed to by "xprt").
>>>> These structures are ref counted and the soclose() is done
>>>> when the ref. cnt goes to zero. My understanding is that
>>>> "struct socket *" is free'd by soclose() so this cannot be done
>>>> before the xprt ref. cnt goes to zero.
>>>>
>>>> For NFSv4.1/4.2 there is something called a back channel
>>>> which means that a "xprt" is used for server->client RPCs,
>>>> although the TCP connection is established by the client
>>>> to the server.
>>>> --> This back channel holds a ref cnt on "xprt" until the
>>>> client re-assigns it to a different TCP connection
>>>> via an operation called BindConnectionToSession,
>>>> and the Linux client is not doing this soon enough,
>>>> it appears.
>>>>
>>>> So, the soclose() is delayed, which is why I think the
>>>> TCP connection gets stuck in CLOSE_WAIT and that is
>>>> why I've added the soshutdown(..SHUT_WR) calls,
>>>> which can happen before the client gets around to
>>>> re-assigning the back channel.
>>>>
>>>> Thanks for your help with this Michael, rick
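The lifetime rule being described is easy to see in miniature: the socket can only be closed on the last reference drop, so a back channel that never releases its reference pins the connection open. A userland analogue (all names illustrative; the real structures live in sys/rpc/svc.h):

    #include <stdio.h>
    #include <stdlib.h>

    struct xprt_sketch {
        int refs;
        int sockfd;   /* stands in for the struct socket * */
    };

    static void
    xprt_release(struct xprt_sketch *xp)
    {
        if (--xp->refs == 0) {
            /* last reference: only now may the socket be closed,
             * since closing frees the underlying socket */
            printf("closing fd %d\n", xp->sockfd);
            free(xp);
        } else {
            printf("fd %d still pinned by %d ref(s)\n",
                   xp->sockfd, xp->refs);
        }
    }

    int main(void) {
        struct xprt_sketch *xp = malloc(sizeof(*xp));
        xp->refs = 2;      /* request path + back channel */
        xp->sockfd = 42;
        xprt_release(xp);  /* request done: socket stays open (CLOSE_WAIT) */
        xprt_release(xp);  /* BindConnectionToSession drops the last ref */
        return 0;
    }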
>>>>
>>>> Best regards
>>>> Michael
>>>>>
>>>>> rick
>>>>> ps: I can capture packets while doing this, if anyone has a use
>>>>> for them.
>>>>>
>>>>> ________________________________________
>>>>> From: owner-freebsd-net@freebsd.org on behalf of Youssef GHORBAL
>>>>> Sent: Saturday, March 27, 2021 6:57 PM
>>>>> To: Jason Breitman
>>>>> Cc: Rick Macklem; freebsd-net@freebsd.org
>>>>> Subject: Re: NFS Mount Hangs
>>>>>
>>>>> CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp@uoguelph.ca
>>>>>
>>>>> On 27 Mar 2021, at 13:20, Jason Breitman wrote:
>>>>>
>>>>> The issue happened again so we can say that disabling TSO and LRO on the NIC did not resolve this issue.
>>>>> # ifconfig lagg0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso
>>>>> # ifconfig lagg0
>>>>> lagg0: flags=8943 metric 0 mtu 1500
>>>>> options=8100b8
>>>>>
>>>>> We can also say that the sysctl settings did not resolve this issue.
>>>>>
>>>>> # sysctl net.inet.tcp.fast_finwait2_recycle=1
>>>>> net.inet.tcp.fast_finwait2_recycle: 0 -> 1
>>>>>
>>>>> # sysctl net.inet.tcp.finwait2_timeout=1000
>>>>> net.inet.tcp.finwait2_timeout: 60000 -> 1000
>>>>>
>>>>> I don't think those will do anything in your case since the FIN_WAIT2 sockets are on the client side and those sysctls are for BSD.
>>>>> By the way, it seems that Linux automatically recycles TCP sessions in FIN_WAIT2 after 60 seconds (sysctl net.ipv4.tcp_fin_timeout):
>>>>>
>>>>> tcp_fin_timeout (integer; default: 60; since Linux 2.2)
>>>>> This specifies how many seconds to wait for a final FIN
>>>>> packet before the socket is forcibly closed. This is
>>>>> strictly a violation of the TCP specification, but
>>>>> required to prevent denial-of-service attacks. In Linux
>>>>> 2.2, the default value was 180.
>>>>>
>>>>> So I don't get why it gets stuck in the FIN_WAIT2 state anyway.
>>>>>
>>>>> You really need to have a packet capture during the outage (client and server side) so you'll get the over-the-wire chat and can start speculating from there.
>>>>> No need to capture the beginning of the outage for now. All you have to do is run a tcpdump for 10 minutes or so when you notice a client stuck.
>>>>>
>>>>> * I have not rebooted the NFS Server nor have I restarted nfsd, but do not believe that is required as these settings are at the TCP level and I would expect new sessions to use the updated settings.
>>>>>
>>>>> The issue occurred after 5 days following a reboot of the client machines.
>>>>> I ran the capture information again to make use of the situation.
>>>>>
>>>>> #!/bin/sh
>>>>>
>>>>> while true
>>>>> do
>>>>>     /bin/date >> /tmp/nfs-hang.log
>>>>>     /bin/ps axHl | grep nfsd | grep -v grep >> /tmp/nfs-hang.log
>>>>>     /usr/bin/procstat -kk 2947 >> /tmp/nfs-hang.log
>>>>>     /usr/bin/procstat -kk 2944 >> /tmp/nfs-hang.log
>>>>>     /bin/sleep 60
>>>>> done
>>>>>
>>>>> On the NFS Server
>>>>> Active Internet connections (including servers)
>>>>> Proto Recv-Q Send-Q Local Address         Foreign Address        (state)
>>>>> tcp4  0      0      NFS.Server.IP.X.2049  NFS.Client.IP.X.48286  CLOSE_WAIT
>>>>>
>>>>> On the NFS Client
>>>>> tcp   0      0      NFS.Client.IP.X:48286 NFS.Server.IP.X:2049   FIN_WAIT2
>>>>>
>>>>> You had also asked for the output below.
>>>>>
>>>>> # nfsstat -E -s
>>>>> BackChannelCt BindConnToSes
>>>>> 0             0
>>>>>
>>>>> # sysctl vfs.nfsd.request_space_throttle_count
>>>>> vfs.nfsd.request_space_throttle_count: 0
>>>>>
>>>>> I see that you are testing a patch and I look forward to seeing the results.
>>>>>
>>>>> Jason Breitman
>>>>>
>>>>> On Mar 21, 2021, at 6:21 PM, Rick Macklem wrote:
>>>>>
>>>>> Youssef GHORBAL wrote:
>>>>>> Hi Jason,
>>>>>>
>>>>>>> On 17 Mar 2021, at 18:17, Jason Breitman wrote:
>>>>>>>
>>>>>>> Please review the details below and let me know if there is a setting that I should apply to my FreeBSD NFS Server or if there is a bug fix that I can apply to resolve my issue.
>>>>>>> I shared this information with the linux-nfs mailing list and they believe the issue is on the server side.
>>>>>>>
>>>>>>> Issue
>>>>>>> NFSv4 mounts periodically hang on the NFS Client.
>>>>>>>
>>>>>>> During this time, it is possible to manually mount from another NFS Server on the NFS Client having issues.
>>>>>>> Also, other NFS Clients are successfully mounting from the NFS Server in question.
>>>>>>> Rebooting the NFS Client appears to be the only solution.
>>>>>>
>>>>>> I had experienced a similar weird situation with periodically stuck Linux NFS clients mounting Isilon NFS servers (Isilon is FreeBSD based but they seem to have their own nfsd).
>>>>> Yes, my understanding is that Isilon uses a proprietary user space nfsd and
>>>>> not the kernel based RPC and nfsd in FreeBSD.
>>>>>
>>>>>> We've had better luck and we did manage to have packet captures on both sides during the issue.
>>>>>> The gist of it goes as follows:
>>>>>>
>>>>>> - Data flows correctly between SERVER and the CLIENT
>>>>>> - At some point SERVER starts decreasing its TCP Receive Window until it reaches 0
>>>>>> - The client (eager to send data) can only ack data sent by SERVER.
>>>>>> - When SERVER was done sending data, the client starts sending TCP Window Probes hoping that the TCP Window opens again so it can flush its buffers.
>>>>>> - SERVER responds with a TCP Zero Window to those probes.
>>>>> Having the window size drop to zero is not necessarily incorrect.
>>>>> If the server is overloaded (has a backlog of NFS requests), it can stop doing
>>>>> soreceive() on the socket (so the socket rcv buffer can fill up and the TCP window
>>>>> closes). This results in "backpressure" to stop the NFS client from flooding the
>>>>> NFS server with requests.
>>>>> --> However, once the backlog is handled, the nfsd should start to soreceive()
>>>>> again and this should cause the window to open back up.
>>>>> --> Maybe this is broken in the socket/TCP code. I quickly got lost in
>>>>> tcp_output() when it decides what to do about the rcvwin.
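The window calculation itself is simple: the advertised receive window is essentially the free space left in the socket's receive buffer, so an undrained buffer advertises zero. A userland illustration of that relationship (numbers and field names are illustrative; in the kernel this is roughly what tcp_output() derives from the receive buffer's free space):

    #include <stdio.h>

    int main(void) {
        long sb_hiwat = 65536;  /* receive buffer limit */
        long sb_cc    = 65536;  /* bytes queued, not yet drained by soreceive() */
        long recwin   = sb_hiwat - sb_cc;

        if (recwin < 0)
            recwin = 0;
        /* 0 here means every segment advertises a zero window: the
         * "backpressure" described above persists until someone
         * drains the buffer */
        printf("advertised window: %ld\n", recwin);
        return 0;
    }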
>>>>>
>>>>>> - After 6 minutes (the NFS server default Idle timeout) SERVER gracefully closes the TCP connection sending a FIN Packet (and still a TCP Window 0)
>>>>> This probably does not happen for Jason's case, since the 6minute timeout
>>>>> is disabled when the TCP connection is assigned as a backchannel (most likely
>>>>> the case for NFSv4.1).
>>>>>
>>>>>> - CLIENT ACKs that FIN.
>>>>>> - SERVER goes in FIN_WAIT_2 state
>>>>>> - CLIENT closes its half part of the socket and goes in LAST_ACK state.
>>>>>> - FIN is never sent by the client since there is still data in its SendQ and the receiver's TCP Window is still 0. At this stage the client starts sending TCP Window Probes again and again hoping that the server opens its TCP Window so it can flush its buffers and terminate its side of the socket.
>>>>>> - SERVER keeps responding with a TCP Zero Window to those probes.
>>>>>> => The last two steps go on and on for hours/days, freezing the NFS mount bound to that TCP session.
>>>>>>
>>>>>> If we had a situation where CLIENT was responsible for closing the TCP Window (and initiating the TCP FIN first) and the server wanting to send data, we'd end up in the same state as you, I think.
>>>>>>
>>>>>> We've never had the root cause of why the SERVER decided to close the TCP Window and no longer accept data; the fix on the Isilon part was to recycle the FIN_WAIT_2 sockets more aggressively (net.inet.tcp.fast_finwait2_recycle=1 & net.inet.tcp.finwait2_timeout=5000). Once the socket is recycled, at the next occurrence of a CLIENT TCP Window probe, SERVER sends a RST, triggering the teardown of the session on the client side, a new TCP handshake, etc. and traffic flows again (NFS starts responding).
>>>>>>
>>>>>> To avoid rebooting the client (and before the aggressive FIN_WAIT_2 recycling was implemented on the Isilon side) we've added a check script on the client that detects LAST_ACK sockets on the client and through an iptables rule enforces a TCP RST. Something like:
>>>>>> -A OUTPUT -p tcp -d $nfs_server_addr --sport $local_port -j REJECT --reject-with tcp-reset
>>>>>> (the script removes this iptables rule as soon as the LAST_ACK disappears)
>>>>>>
>>>>>> The bottom line would be to have a packet capture during the outage (client and/or server side); it will show you at least the shape of the TCP exchange when NFS is stuck.
>>>>> Interesting story and good work w.r.t. sleuthing, Youssef, thanks.
>>>>>
>>>>> I looked at Jason's log and it shows everything is ok w.r.t the nfsd threads.
>>>>> (They're just waiting for RPC requests.)
>>>>> However, I do now think I know why the soclose() does not happen.
>>>>> When the TCP connection is assigned as a backchannel, that takes a reference
>>>>> cnt on the structure. This refcnt won't be released until the connection is
>>>>> replaced by a BindConnectionToSession operation from the client. But that won't
>>>>> happen until the client creates a new TCP connection.
>>>>> --> No refcnt release --> no refcnt of 0 --> no soclose().
>>>>>
>>>>> I've created the attached patch (completely different from the previous one)
>>>>> that adds soshutdown(SHUT_WR) calls in the three places where the TCP
>>>>> connection is going away. This seems to get it past CLOSE_WAIT without a
>>>>> soclose().
This seems to get it past CLOSE_WAIT withou= t a=0A= >>>>> soclose().=0A= >>>>> --> I know you are not comfortable with patching your server, but I d= o think=0A= >>>>> this change will get the socket shutdown to complete.=0A= >>>>>=0A= >>>>> There are a couple more things you can check on the server...=0A= >>>>> # nfsstat -E -s=0A= >>>>> --> Look for the count under "BindConnToSes".=0A= >>>>> --> If non-zero, backchannels have been assigned=0A= >>>>> # sysctl -a | fgrep request_space_throttle_count=0A= >>>>> --> If non-zero, the server has been overloaded at some point.=0A= >>>>>=0A= >>>>> I think the attached patch might work around the problem.=0A= >>>>> The code that should open up the receive window needs to be checked.= =0A= >>>>> I am also looking at enabling the 6minute timeout when a backchannel = is=0A= >>>>> assigned.=0A= >>>>>=0A= >>>>> rick=0A= >>>>>=0A= >>>>> Youssef=0A= >>>>>=0A= >>>>> _______________________________________________=0A= >>>>> freebsd-net@freebsd.org mailing list= =0A= >>>>> https://urldefense.com/v3/__https://lists.freebsd.org/mailman/listinf= o/freebsd-net__;!!JFdNOqOXpB6UZW0!_c2MFNbir59GXudWPVdE5bNBm-qqjXeBuJ2UEmFv5= OZciLj4ObR_drJNv5yryaERfIbhKR2d$=0A= >>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org= "=0A= >>>>> =0A= >>>>>=0A= >>>>> =0A= >>>>>=0A= >>>>> _______________________________________________=0A= >>>>> freebsd-net@freebsd.org mailing list=0A= >>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= >>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org= "=0A= >>>>> _______________________________________________=0A= >>>>> freebsd-net@freebsd.org mailing list=0A= >>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= >>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org= "=0A= >>>>=0A= >>>> _______________________________________________=0A= >>>> freebsd-net@freebsd.org mailing list=0A= >>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= >>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"= =0A= >>>> _______________________________________________=0A= >>>> freebsd-net@freebsd.org mailing list=0A= >>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= >>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"= =0A= >>>=0A= >>=0A= >=0A= > _______________________________________________=0A= > freebsd-net@freebsd.org mailing list=0A= > https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"=0A= =0A= From owner-freebsd-net@freebsd.org Sat Apr 10 09:19:35 2021 Return-Path: Delivered-To: freebsd-net@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 6A8665E094E for ; Sat, 10 Apr 2021 09:19:35 +0000 (UTC) (envelope-from Richard.Scheffenegger@netapp.com) Received: from NAM11-CO1-obe.outbound.protection.outlook.com (mail-co1nam11on2067.outbound.protection.outlook.com [40.107.220.67]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mail.protection.outlook.com", Issuer "DigiCert Cloud Services CA-1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4FHTwQ2HgKz4pDd; Sat, 10 Apr 2021 09:19:33 +0000 (UTC) (envelope-from Richard.Scheffenegger@netapp.com) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; 
From: "Scheffenegger, Richard" <Richard.Scheffenegger@netapp.com>
To: Rick Macklem, "tuexen@freebsd.org"
Cc: Youssef GHORBAL, "freebsd-net@freebsd.org"
Subject: AW: NFS Mount Hangs
Date: Sat, 10 Apr 2021 09:19:30 +0000
Hi Rick,

> Well, I have some good news and some bad news (the bad is mostly for Richard).
>
> The only message logged is:
> tcpflags 0x4; tcp_do_segment: Timestamp missing, segment processed normally
>
> But...the RST battle no longer occurs. Just one RST that works and then the SYN gets SYN,ACK'd by the FreeBSD end and off it goes...
>
> So, what is different?
>
> r367492 is reverted from the FreeBSD server.
> I did the revert because I think it might be what otis@'s hang is being caused by. (In his case, the Recv-Q grows on the socket for the stuck Linux client, while others work.)
>
> Why does reverting fix this?
> My only guess is that the krpc gets the upcall right away and sees an EPIPE when it does soreceive(), which results in soshutdown(SHUT_WR).

With r367492 you don't get the upcall with the same error state? Or you don't get an error on a write() call, when there should be one?

From what you describe, this is on writes, isn't it? (I'm asking, as the original problem that was fixed with r367492 occurs in the read path (draining of the so_rcv buffer in the upcall right away, which subsequently influences the ACK sent by the stack).)

I only added the so_snd buffer after some discussion of whether the WAKESOR shouldn't have a symmetric equivalent on WAKESOW....

Thus a partial backout (leaving the WAKESOR part inside, but reverting the WAKESOW part) would still fix my initial problem about erroneous DSACKs (which can also lead to extremely poor performance with Linux clients), but possibly address this issue...
Can you perhaps take MAIN and apply https://reviews.freebsd.org/D29690 for the
revert only on the so_snd upcall?

If this doesn't help, some major surgery will be necessary to prevent NFS
sessions with SACK enabled from transmitting DSACKs...

> I know from a printf that this happened, but whether it caused the RST battle
> to not happen, I don't know.
>
> I can put r367492 back in and do more testing if you'd like, but I think it
> probably needs to be reverted?

Please, I don't quite understand why the exact timing of the upcall would be
that critical here...

A comparison of the soxxx calls and errors between the "good" and the "bad" case
would be perfect. I don't know if this is easy to do though, as these calls
appear to be scattered all around the RPC / NFS source paths.

> This does not explain the original hung Linux client problem, but does shed
> light on the RST war I could create by doing a network partitioning.
>
> rick

From owner-freebsd-net@freebsd.org Sat Apr 10 11:48:08 2021
From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 240944] em(4): Crash with Intel 82571EB NIC with AMD Piledriver and Steamroller APUs
Date: Sat, 10 Apr 2021 11:48:07 +0000
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240944

Alin changed:

           What    |Removed |Added
----------------------------------------------------------------------------
                 CC|        |alin_im@yahoo.com

--- Comment #6 from Alin ---
Same issue happens to me :| I am running it on an HP T730 with an HP NC365T
Network Controller, 32GB SSD, 2x4GB RAM (brand new).

Trying to make it work with pfSense 2.4.5 and 2.5 (FreeBSD 12.2-STABLE).

I changed different RAM sticks, SSDs, and NICs. The only thing I have not
changed is the CPU.

It works for about an hour, maybe less, and then becomes unresponsive and
requires a hard reboot.

Let me know if you found any solution/workaround, or whether I need to
repurpose the box to something else.

Thanks :)

--
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-net@freebsd.org Sat Apr 10 12:13:24 2021
From: tuexen@freebsd.org
To: Rick Macklem
Cc: "Scheffenegger, Richard", Youssef GHORBAL, "freebsd-net@freebsd.org"
Subject: Re: NFS Mount Hangs
Date: Sat, 10 Apr 2021 14:13:18 +0200
Message-Id: <8B7C867D-54A5-4EFA-B5BC-CA63FFC1EA77@freebsd.org>
> On 10. Apr 2021, at 02:44, Rick Macklem wrote:
>
> tuexen@freebsd.org wrote:
>>> On 6. Apr 2021, at 01:24, Rick Macklem wrote:
>>>
>>> tuexen@freebsd.org wrote:
>>> [stuff snipped]
>>>> OK. What is the FreeBSD version you are using?
>>> main Dec. 23, 2020.
>>>
>>>> It seems that the TCP connection on the FreeBSD side is still alive;
>>>> Linux has decided to start a new TCP connection using the old
>>>> port numbers. So it sends a SYN. The response is a challenge ACK
>>>> and Linux responds with a RST. This looks good so far. However,
>>>> FreeBSD should accept the RST and kill the TCP connection. The
>>>> next SYN from the Linux side would establish a new TCP connection.
>>>>
>>>> So I'm wondering why the RST is not accepted. I made the timestamp
>>>> checking stricter but introduced a bug where RST segments without
>>>> timestamps were ignored. This was fixed.
>>>>
>>>> Introduced in main on 2020/11/09:
>>>> https://svnweb.freebsd.org/changeset/base/367530
>>>> Introduced in stable/12 on 2020/11/30:
>>>> https://svnweb.freebsd.org/changeset/base/36818
>>>> Fix in main on 2021/01/13:
>>>> https://cgit.FreeBSD.org/src/commit/?id=cc3c34859eab1b317d0f38731355b53f7d978c97
>>>> Fix in stable/12 on 2021/01/24:
>>>> https://cgit.FreeBSD.org/src/commit/?id=d05d908d6d3c85479c84c707f931148439ae826b
>>>>
>>>> Are you using a version which is affected by this bug?
>>> I was. Now I've applied the patch.
>>> Bad news: it did not fix the problem.
>>> It still gets into an endless "ignore RST" and stays ESTABLISHED when
>>> the Send-Q is empty.
>> OK. Let us focus on this case.
>>
>> Could you:
>> 1. sudo sysctl net.inet.tcp.log_debug=1
>> 2. repeat the situation where RSTs are ignored.
>> 3. check if there is some output on the console (/var/log/messages).
>> 4. Either provide the output or let me know that there is none.
> Well, I have some good news and some bad news (the bad is mostly for Richard).
> The only message logged is:
> tcpflags 0x4; tcp_do_segment: Timestamp missing, segment processed normally
>
> But...the RST battle no longer occurs. Just one RST that works and then
> the SYN gets SYN,ACK'd by the FreeBSD end and off it goes...
The above is what I would expect if you integrated
cc3c34859eab1b317d0f38731355b53f7d978c97 or reverted r367530. Did you do that?
>
> So, what is different?
>
> r367492 is reverted from the FreeBSD server.
Only that? So you still have the bug I introduced in the tree, but the RST
segment is accepted?

Best regards
Michael
> I did the revert because I think it might be what the hang reported by otis@
> is being caused by. (In his case, the Recv-Q grows on the socket for the
> stuck Linux client, while others work.)
>
> Why does reverting fix this?
> My only guess is that the krpc gets the upcall right away and sees
> an EPIPE when it does soreceive(), which results in soshutdown(SHUT_WR).
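For anyone who wants to repeat steps 1-4 above verbatim, the whole check fits
in a few commands (a minimal sketch; /var/log/messages assumes the default
syslogd configuration):

# 1. enable TCP debug logging
sudo sysctl net.inet.tcp.log_debug=1
# 2. reproduce the situation where the RSTs are ignored, then
# 3./4. look for tcp_do_segment output
grep tcp_do_segment /var/log/messages
# and turn the logging back off afterwards
sudo sysctl net.inet.tcp.log_debug=0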
> I know from a printf that this happened, but whether it caused the
> RST battle to not happen, I don't know.
>
> I can put r367492 back in and do more testing if you'd like, but
> I think it probably needs to be reverted?
>
> This does not explain the original hung Linux client problem,
> but does shed light on the RST war I could create by doing a
> network partitioning.
>
> rick
>
> Best regards
> Michael
>>
>> If the Send-Q is non-empty when I partition, it recovers fine,
>> sometimes not even needing to see an RST.
>>
>> rick
>> ps: If you think there might be other recent changes that matter,
>> just say the word and I'll upgrade to bits du jour.
>>
>> rick
>>
>> Best regards
>> Michael
>>>
>>> If I wait long enough before healing the partition, it will
>>> go to FIN_WAIT_1, and then if I plug it back in, it does not
>>> do battle (at least not for long).
>>>
>>> Btw, I have one running now that seems stuck really good.
>>> It has been 20 minutes since I plugged the net cable back in.
>>> (Unfortunately, I didn't have tcpdump running until after
>>> I saw it was not progressing after healing.)
>>> --> There is one difference. There was a 6-minute timeout
>>>     enabled on the server krpc for "no activity", which is
>>>     now disabled like it is for NFSv4.1 in freebsd-current.
>>>     I had forgotten to re-disable it.
>>>     So, when it does battle, it might have been the 6-minute
>>>     timeout, which would then do the soshutdown(..SHUT_WR)
>>>     which kept it from getting "stuck" forever.
>>> --> This time I had to reboot the FreeBSD NFS server to
>>>     get the Linux client unstuck, so this one looked a lot
>>>     like what has been reported.
>>> The pcap for this one, started after the network was plugged
>>> back in and I noticed it was stuck for quite a while, is here:
>>> fetch https://people.freebsd.org/~rmacklem/stuck.pcap
>>>
>>> In it, there is just a bunch of RST followed by SYN sent
>>> from client->FreeBSD and FreeBSD just keeps sending
>>> acks for the old segment back.
>>> --> It looks like FreeBSD did the "RST, ACK" after the
>>>     krpc did a soshutdown(..SHUT_WR) on the socket,
>>>     for the one you've been looking at.
>>> I'll test some more...
>>>
>>>> I would like to understand why the reestablishment of the connection
>>>> did not work...
>>> It is looking like it takes either a non-empty send-q or a
>>> soshutdown(..SHUT_WR) to get the FreeBSD socket
>>> out of ESTABLISHED, where it just ignores the RSTs and
>>> SYN packets.
>>>
>>> Thanks for looking at it, rick
>>>
>>> Best regards
>>> Michael
>>>>
>>>> Have fun with it, rick
>>>>
>>>> ________________________________________
>>>> From: tuexen@freebsd.org
>>>> Sent: Sunday, April 4, 2021 12:41 PM
>>>> To: Rick Macklem
>>>> Cc: Scheffenegger, Richard; Youssef GHORBAL; freebsd-net@freebsd.org
>>>> Subject: Re: NFS Mount Hangs
>>>>
>>>>> On 4. Apr 2021, at 17:27, Rick Macklem wrote:
>>>>>
>>>>> Well, I'm going to cheat and top post, since this is related info. and
>>>>> not really part of the discussion...
>>>>>
>>>>> I've been testing network partitioning between a Linux client (5.2 kernel)
>>>>> and a FreeBSD-current NFS server.
>>>>> I have not gotten a solid hang, but
>>>>> I have had the Linux client doing "battle" with the FreeBSD server for
>>>>> several minutes after un-partitioning the connection.
>>>>>
>>>>> The battle basically consists of the Linux client sending an RST, followed
>>>>> by a SYN.
>>>>> The FreeBSD server ignores the RST and just replies with the same old ack.
>>>>> --> This varies from "just a SYN" that succeeds to 100+ cycles of the above
>>>>>     over several minutes.
>>>>>
>>>>> I had thought that an RST was a "pretty heavy hammer", but FreeBSD seems
>>>>> pretty good at ignoring it.
>>>>>
>>>>> A full packet capture of one of these is in /home/rmacklem/linuxtofreenfs.pcap
>>>>> in case anyone wants to look at it.
>>>> On freefall? I would like to take a look at it...
>>>>
>>>> Best regards
>>>> Michael
>>>>>
>>>>> Here's a tcpdump snippet of the interesting part (see the *** comments):
>>>>> 19:10:09.305775 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 202585:202749, ack 212293, win 29128, options [nop,nop,TS val 2073636037 ecr 2671204825], length 164: NFS reply xid 613153685 reply ok 160 getattr NON 4 ids 0/33554432 sz 0
>>>>> 19:10:09.305850 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 202749, win 501, options [nop,nop,TS val 2671204825 ecr 2073636037], length 0
>>>>> *** Network is now partitioned...
>>>>>
>>>>> 19:10:09.407840 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671204927 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>>> 19:10:09.615779 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205135 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>>> 19:10:09.823780 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205343 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>>> *** Lots of lines snipped.
>>>>>
>>>>> 19:13:41.295783 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>>> 19:13:42.319767 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>>> 19:13:46.351966 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>>> 19:13:47.375790 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>>> 19:13:48.399786 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>>> *** Network is now unpartitioned...
>>>>>
>>>>> 19:13:48.399990 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
>>>>> 19:13:48.400002 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671421871 ecr 0,nop,wscale 7], length 0
>>>>> 19:13:48.400185 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073855137 ecr 2671204825], length 0
>>>>> 19:13:48.400273 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
>>>>> 19:13:49.423833 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671424943 ecr 0,nop,wscale 7], length 0
>>>>> 19:13:49.424056 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073856161 ecr 2671204825], length 0
>>>>> *** This "battle" goes on for 223 sec...
>>>>> I snipped out 13 cycles of this "Linux sends an RST, followed by SYN"
>>>>> "FreeBSD replies with same old ACK". In another test run I saw this
>>>>> cycle continue non-stop for several minutes. This time, the Linux
>>>>> client paused for a while (see ARPs below).
>>>>>
>>>>> 19:13:49.424101 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
>>>>> 19:13:53.455867 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671428975 ecr 0,nop,wscale 7], length 0
>>>>> 19:13:53.455991 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073860193 ecr 2671204825], length 0
>>>>> *** Snipped a bunch of stuff out, mostly ARPs, plus one more RST.
>>>>>
>>>>> 19:16:57.775780 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>>> 19:16:57.775937 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
>>>>> 19:16:57.980240 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
>>>>> 19:16:58.555663 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
>>>>> 19:17:00.104701 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074046846 ecr 2671204825], length 0
>>>>> 19:17:15.664354 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074062406 ecr 2671204825], length 0
>>>>> 19:17:31.239246 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [R.], seq 202750, ack 212293, win 0, options [nop,nop,TS val 2074077981 ecr 2671204825], length 0
>>>>> *** FreeBSD finally acknowledges the RST 38 sec after Linux sent the last
>>>>>     of 13 (100+ for another test run).
>>>>>
>>>>> 19:17:51.535979 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 4247692373, win 64240, options [mss 1460,sackOK,TS val 2671667055 ecr 0,nop,wscale 7], length 0
>>>>> 19:17:51.536130 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [S.], seq 661237469, ack 4247692374, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2074098278 ecr 2671667055], length 0
>>>>> *** Now back in business...
>>>>>
>>>>> 19:17:51.536218 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 1, win 502, options [nop,nop,TS val 2671667055 ecr 2074098278], length 0
>>>>> 19:17:51.536295 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 1:233, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>>> 19:17:51.536346 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 233:505, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 272: NFS request xid 697039765 132 getattr fh 0,1/53
>>>>> 19:17:51.536515 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 0
>>>>> 19:17:51.536553 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 505:641, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098279], length 136: NFS request xid 730594197 132 getattr fh 0,1/53
>>>>> 19:17:51.536562 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 1:49, ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 48: NFS reply xid 697039765 reply ok 44 getattr ERROR: unk 10063
>>>>>
>>>>> This error 10063 after the partition heals is also "bad news". It indicates
>>>>> the Session (which is supposed to maintain "exactly once" RPC semantics) is
>>>>> broken. I'll admit I suspect a Linux client bug, but will be investigating
>>>>> further.
>>>>>
>>>>> So, hopefully TCP conversant folk can confirm if the above is correct
>>>>> behaviour, or if the RST should be ack'd sooner?
>>>>>
>>>>> I could also see this becoming a "forever" TCP battle for other versions of
>>>>> the Linux client.
>>>>>
>>>>> rick
>>>>>
>>>>> ________________________________________
>>>>> From: Scheffenegger, Richard
>>>>> Sent: Sunday, April 4, 2021 7:50 AM
>>>>> To: Rick Macklem; tuexen@freebsd.org
>>>>> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
>>>>> Subject: Re: NFS Mount Hangs
>>>>>
>>>>> For what it's worth, SUSE found two bugs in the Linux nf_conntrack (stateful
>>>>> firewall) and the pfifo-fast scheduler, which could conspire to make TCP
>>>>> sessions hang forever.
>>>>>
>>>>> One is a missed update when the client is not using the noresvport mount
>>>>> option, which makes the firewall think RSTs are illegal (and drop them);
>>>>>
>>>>> The fast scheduler can run into an issue if only a single packet should be
>>>>> forwarded (note that this is not the default scheduler, but it is often
>>>>> recommended for performance, as it runs lockless and at lower CPU cost than
>>>>> pfq, the default). If no other/additional packet pushes out that last packet
>>>>> of a flow, it can become stuck forever...
>>>>>
>>>>> I can try getting the relevant bug info next week...
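Pending those bug references, the two suspects above can be sanity-checked on
the Linux side with stock tools; the interface name below is an assumption, and
fq_codel is just one possible alternative to pfifo_fast:

# see whether pfifo_fast is the active qdisc on the client
tc qdisc show dev eth0
# temporarily switch schedulers to rule the pfifo-fast issue out
tc qdisc replace dev eth0 root fq_codel
# and see what the stateful firewall thinks of the NFS session
conntrack -L | grep 2049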
>>>>>
>>>>> ________________________________
>>>>> From: owner-freebsd-net@freebsd.org on behalf of Rick Macklem
>>>>> Sent: Friday, April 2, 2021 11:31:01 PM
>>>>> To: tuexen@freebsd.org
>>>>> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
>>>>> Subject: Re: NFS Mount Hangs
>>>>>
>>>>> tuexen@freebsd.org wrote:
>>>>>>> On 2. Apr 2021, at 02:07, Rick Macklem wrote:
>>>>>>>
>>>>>>> I hope you don't mind a top post...
>>>>>>> I've been testing network partitioning between the only Linux client
>>>>>>> I have (5.2 kernel) and a FreeBSD server with the xprtdied.patch
>>>>>>> (does soshutdown(..SHUT_WR) when it knows the socket is broken)
>>>>>>> applied to it.
>>>>>>>
>>>>>>> I'm not enough of a TCP guy to know if this is useful, but here's what
>>>>>>> I see...
>>>>>>>
>>>>>>> While partitioned:
>>>>>>> On the FreeBSD server end, the socket either goes to CLOSED during
>>>>>>> the network partition or stays ESTABLISHED.
>>>>>> If it goes to CLOSED you called shutdown(, SHUT_WR) and the peer also
>>>>>> sent a FIN, but you never called close() on the socket.
>>>>>> If the socket stays in ESTABLISHED, there is no communication ongoing,
>>>>>> I guess, and therefore the server does not even detect that the peer
>>>>>> is not reachable.
>>>>>>> On the Linux end, the socket seems to remain ESTABLISHED for a
>>>>>>> little while, and then disappears.
>>>>>> So how does Linux detect the peer is not reachable?
>>>>> Well, here's what I see in a packet capture in the Linux client once
>>>>> I partition it (just unplug the net cable):
>>>>> - lots of retransmits of the same segment (with ACK) for 54 sec
>>>>> - then only ARP queries
>>>>>
>>>>> Once I plug the net cable back in:
>>>>> - ARP works
>>>>> - one more retransmit of the same segment
>>>>> - receives RST from FreeBSD
>>>>>   ** So, is this now a "new" TCP connection, despite
>>>>>      using the same port#?
>>>>>      --> It matters for NFS, since "new connection"
>>>>>          implies "must retry all outstanding RPCs".
>>>>> - sends SYN
>>>>> - receives SYN, ACK from FreeBSD
>>>>>   --> connection starts working again
>>>>>       Always uses same port#.
>>>>>
>>>>> On the FreeBSD server end:
>>>>> - receives the last retransmit of the segment (with ACK)
>>>>> - sends RST
>>>>> - receives SYN
>>>>> - sends SYN, ACK
>>>>>
>>>>> I thought that there was no RST in the capture I looked at
>>>>> yesterday, so I'm not sure if FreeBSD always sends an RST,
>>>>> but the Linux client behaviour was the same. (Sent a SYN, etc.)
>>>>> The socket disappears from the Linux "netstat -a" and I
>>>>> suspect that happens after about 54 sec, but I am not sure
>>>>> about the timing.
>>>>>
>>>>>>> After unpartitioning:
>>>>>>> On the FreeBSD server end, you get another socket showing up at
>>>>>>> the same port#
>>>>>>> Active Internet connections (including servers)
>>>>>>> Proto Recv-Q Send-Q Local Address      Foreign Address    (state)
>>>>>>> tcp4       0      0 nfsv4-new3.nfsd    nfsv4-linux.678    ESTABLISHED
>>>>>>> tcp4       0      0 nfsv4-new3.nfsd    nfsv4-linux.678    CLOSED
>>>>>>>
>>>>>>> The Linux client shows the same connection ESTABLISHED.
>>>>> But it disappears from "netstat -a" for a while during the partitioning.
>>>>>
>>>>>>> (The mount sometimes reports an error.
>>>>>>> I haven't looked at packet traces to see if it retries RPCs or why
>>>>>>> the errors occur.)
>>>>> I have now done so, as above.
>>>>>
>>>>>>> --> However I never get hangs.
>>>>>>> Sometimes it goes to SYN_SENT for a while and the FreeBSD server
>>>>>>> shows FIN_WAIT_1, but then both ends go to ESTABLISHED and the
>>>>>>> mount starts working again.
>>>>>>>
>>>>>>> The most obvious thing is that the Linux client always keeps using
>>>>>>> the same port#. (The FreeBSD client will use a different port# when
>>>>>>> it does a TCP reconnect after no response from the NFS server for
>>>>>>> a little while.)
>>>>>>>
>>>>>>> What do those TCP conversant think?
>>>>>> I guess you are never calling close() on the socket for which
>>>>>> the connection state is CLOSED.
>>>>> Ok, that makes sense. For this case the Linux client has not done a
>>>>> BindConnectionToSession to re-assign the back channel.
>>>>> I'll have to bug them about this. However, I'll bet they'll answer
>>>>> that I have to tell them the back channel needs re-assignment
>>>>> or something like that.
>>>>>
>>>>> I am pretty certain they are broken, in that the client needs to
>>>>> retry all outstanding RPCs.
>>>>>
>>>>> For others, here's the long-winded version of this that I just
>>>>> put on the phabricator review:
>>>>> In the server side kernel RPC, the socket (struct socket *) is in a
>>>>> structure called SVCXPRT (normally pointed to by "xprt").
>>>>> These structures are ref counted and the soclose() is done
>>>>> when the ref. cnt goes to zero. My understanding is that
>>>>> "struct socket *" is free'd by soclose() so this cannot be done
>>>>> before the xprt ref. cnt goes to zero.
>>>>>
>>>>> For NFSv4.1/4.2 there is something called a back channel
>>>>> which means that a "xprt" is used for server->client RPCs,
>>>>> although the TCP connection is established by the client
>>>>> to the server.
>>>>> --> This back channel holds a ref cnt on "xprt" until the
>>>>>     client re-assigns it to a different TCP connection
>>>>>     via an operation called BindConnectionToSession,
>>>>>     and the Linux client is not doing this soon enough,
>>>>>     it appears.
>>>>>
>>>>> So, the soclose() is delayed, which is why I think the
>>>>> TCP connection gets stuck in CLOSE_WAIT and that is
>>>>> why I've added the soshutdown(..SHUT_WR) calls,
>>>>> which can happen before the client gets around to
>>>>> re-assigning the back channel.
>>>>>
>>>>> Thanks for your help with this Michael, rick
>>>>>
>>>>> Best regards
>>>>> Michael
>>>>>>
>>>>>> rick
>>>>>> ps: I can capture packets while doing this, if anyone has a use
>>>>>> for them.
>>>>>>
>>>>>> ________________________________________
>>>>>> From: owner-freebsd-net@freebsd.org on behalf of Youssef GHORBAL
>>>>>> Sent: Saturday, March 27, 2021 6:57 PM
>>>>>> To: Jason Breitman
>>>>>> Cc: Rick Macklem; freebsd-net@freebsd.org
>>>>>> Subject: Re: NFS Mount Hangs
>>>>>>
>>>>>> On 27 Mar 2021, at 13:20, Jason Breitman wrote:
>>>>>>
>>>>>> The issue happened again, so we can say that disabling TSO and LRO
>>>>>> on the NIC did not resolve this issue.
>>>>>> # ifconfig lagg0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso
>>>>>> # ifconfig lagg0
>>>>>> lagg0: flags=8943 metric 0 mtu 1500
>>>>>> options=8100b8
>>>>>>
>>>>>> We can also say that the sysctl settings did not resolve this issue.
>>>>>>
>>>>>> # sysctl net.inet.tcp.fast_finwait2_recycle=1
>>>>>> net.inet.tcp.fast_finwait2_recycle: 0 -> 1
>>>>>>
>>>>>> # sysctl net.inet.tcp.finwait2_timeout=1000
>>>>>> net.inet.tcp.finwait2_timeout: 60000 -> 1000
>>>>>>
>>>>>> I don't think those will do anything in your case since the FIN_WAIT2
>>>>>> sockets are on the client side and those sysctls are for BSD.
>>>>>> By the way, it seems that Linux automatically recycles TCP sessions in
>>>>>> FIN_WAIT2 after 60 seconds (sysctl net.ipv4.tcp_fin_timeout):
>>>>>>
>>>>>> tcp_fin_timeout (integer; default: 60; since Linux 2.2)
>>>>>>     This specifies how many seconds to wait for a final FIN
>>>>>>     packet before the socket is forcibly closed. This is
>>>>>>     strictly a violation of the TCP specification, but
>>>>>>     required to prevent denial-of-service attacks. In Linux
>>>>>>     2.2, the default value was 180.
>>>>>>
>>>>>> So I don't get why it gets stuck in the FIN_WAIT2 state anyway.
>>>>>>
>>>>>> You really need to have a packet capture during the outage (client and
>>>>>> server side) so you'll get the over-the-wire chat and can start
>>>>>> speculating from there.
>>>>>> No need to capture the beginning of the outage for now. All you have to
>>>>>> do is run a tcpdump for 10 minutes or so when you notice a stuck client.
>>>>>>
>>>>>> * I have not rebooted the NFS Server nor have I restarted nfsd, but I do
>>>>>> not believe that is required, as these settings are at the TCP level and
>>>>>> I would expect new sessions to use the updated settings.
>>>>>>
>>>>>> The issue occurred 5 days after a reboot of the client machines.
>>>>>> I ran the capture information again to make use of the situation.
>>>>>>
>>>>>> #!/bin/sh
>>>>>>
>>>>>> while true
>>>>>> do
>>>>>>     /bin/date >> /tmp/nfs-hang.log
>>>>>>     /bin/ps axHl | grep nfsd | grep -v grep >> /tmp/nfs-hang.log
>>>>>>     /usr/bin/procstat -kk 2947 >> /tmp/nfs-hang.log
>>>>>>     /usr/bin/procstat -kk 2944 >> /tmp/nfs-hang.log
>>>>>>     /bin/sleep 60
>>>>>> done
>>>>>>
>>>>>> On the NFS Server
>>>>>> Active Internet connections (including servers)
>>>>>> Proto Recv-Q Send-Q Local Address          Foreign Address        (state)
>>>>>> tcp4       0      0 NFS.Server.IP.X.2049   NFS.Client.IP.X.48286  CLOSE_WAIT
>>>>>>
>>>>>> On the NFS Client
>>>>>> tcp        0      0 NFS.Client.IP.X:48286  NFS.Server.IP.X:2049   FIN_WAIT2
>>>>>>
>>>>>> You had also asked for the output below.
>>>>>>
>>>>>> # nfsstat -E -s
>>>>>> BackChannelCtBindConnToSes
>>>>>>            0            0
>>>>>>
>>>>>> # sysctl vfs.nfsd.request_space_throttle_count
>>>>>> vfs.nfsd.request_space_throttle_count: 0
>>>>>>
>>>>>> I see that you are testing a patch and I look forward to seeing the
>>>>>> results.
>>>>>>
>>>>>> Jason Breitman
>>>>>>
>>>>>> On Mar 21, 2021, at 6:21 PM, Rick Macklem wrote:
>>>>>>
>>>>>> Youssef GHORBAL wrote:
>>>>>>> Hi Jason,
>>>>>>>
>>>>>>>> On 17 Mar 2021, at 18:17, Jason Breitman wrote:
>>>>>>>>
>>>>>>>> Please review the details below and let me know if there is a setting
>>>>>>>> that I should apply to my FreeBSD NFS Server, or if there is a bug fix
>>>>>>>> that I can apply to resolve my issue.
>>>>>>>> I shared this information with the linux-nfs mailing list and they
>>>>>>>> believe the issue is on the server side.
>>>>>>>>
>>>>>>>> Issue
>>>>>>>> NFSv4 mounts periodically hang on the NFS Client.
>>>>>>>>
>>>>>>>> During this time, it is possible to manually mount from another NFS
>>>>>>>> Server on the NFS Client having issues.
>>>>>>>> Also, other NFS Clients are successfully mounting from the NFS Server
>>>>>>>> in question.
>>>>>>>> Rebooting the NFS Client appears to be the only solution.
>>>>>>>
>>>>>>> I had experienced a similar weird situation with periodically stuck
>>>>>>> Linux NFS clients mounting Isilon NFS servers (Isilon is FreeBSD based,
>>>>>>> but they seem to have their own nfsd).
>>>>>> Yes, my understanding is that Isilon uses a proprietary user space nfsd
>>>>>> and not the kernel based RPC and nfsd in FreeBSD.
>>>>>>
>>>>>>> We've had better luck and we did manage to get packet captures on both
>>>>>>> sides during the issue. The gist of it goes as follows:
>>>>>>>
>>>>>>> - Data flows correctly between SERVER and the CLIENT
>>>>>>> - At some point SERVER starts decreasing its TCP Receive Window until
>>>>>>>   it reaches 0
>>>>>>> - The client (eager to send data) can only ack data sent by SERVER.
>>>>>>> - When SERVER was done sending data, the client starts sending TCP
>>>>>>>   Window Probes hoping that the TCP Window opens again so it can flush
>>>>>>>   its buffers.
>>>>>>> - SERVER responds with a TCP Zero Window to those probes.
>>>>>> Having the window size drop to zero is not necessarily incorrect.
>>>>>> If the server is overloaded (has a backlog of NFS requests), it can stop
>>>>>> doing soreceive() on the socket (so the socket rcv buffer can fill up and
>>>>>> the TCP window closes). This results in "backpressure" to stop the NFS
>>>>>> client from flooding the NFS server with requests.
>>>>>> --> However, once the backlog is handled, the nfsd should start to
>>>>>>     soreceive() again and this should cause the window to open back up.
>>>>>> --> Maybe this is broken in the socket/TCP code. I quickly got lost in
>>>>>>     tcp_output() when it decides what to do about the rcvwin.
>>>>>>
>>>>>>> - After 6 minutes (the NFS server default Idle timeout) SERVER
>>>>>>>   gracefully closes the TCP connection, sending a FIN Packet (and still
>>>>>>>   a TCP Window 0)
>>>>>> This probably does not happen for Jason's case, since the 6-minute
>>>>>> timeout is disabled when the TCP connection is assigned as a backchannel
>>>>>> (most likely the case for NFSv4.1).
>>>>>>
>>>>>>> - CLIENT ACKs that FIN.
>>>>>>> - SERVER goes into FIN_WAIT_2 state.
>>>>>>> - CLIENT closes its half of the socket and goes into LAST_ACK state.
>>>>>>> - FIN is never sent by the client, since there is still data in its
>>>>>>>   SendQ and the receiver's TCP Window is still 0. At this stage the
>>>>>>>   client starts sending TCP Window Probes again and again, hoping that
>>>>>>>   the server opens its TCP Window so it can flush its buffers and
>>>>>>>   terminate its side of the socket.
>>>>>>> - SERVER keeps responding with a TCP Zero Window to those probes.
>>>>>>> => The last two steps go on and on for hours/days, freezing the NFS
>>>>>>>    mount bound to that TCP session.
>>>>>>>
>>>>>>> If we had a situation where CLIENT was responsible for closing the TCP
>>>>>>> Window (and initiating the TCP FIN first) and the server wanted to send
>>>>>>> data, we'll end up in the same state as you, I think.
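As an aside, a zero-window stall like the one described above is easy to spot
on the wire, since the window field sits at bytes 14-15 of the TCP header;
this is only a sketch, with the interface name as an assumption:

# show only segments advertising a zero (unscaled) receive window
tcpdump -n -i em0 'tcp port 2049 and tcp[14:2] = 0'
# meanwhile, watch the socket state and queue sizes on the server
netstat -an | grep 2049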
>>>>>>>
>>>>>>> We've never found the root cause of why the SERVER decided to close the
>>>>>>> TCP Window and no longer accept data; the fix on the Isilon part was to
>>>>>>> recycle the FIN_WAIT_2 sockets more aggressively
>>>>>>> (net.inet.tcp.fast_finwait2_recycle=1 &
>>>>>>> net.inet.tcp.finwait2_timeout=5000). Once the socket is recycled, at
>>>>>>> the next occurrence of a CLIENT TCP Window probe SERVER sends a RST,
>>>>>>> triggering the teardown of the session on the client side, a new TCP
>>>>>>> handshake, etc., and traffic flows again (NFS starts responding).
>>>>>>>
>>>>>>> To avoid rebooting the client (and before the aggressive FIN_WAIT_2
>>>>>>> recycling was implemented on the Isilon side) we've added a check script
>>>>>>> on the client that detects LAST_ACK sockets and, through an iptables
>>>>>>> rule, enforces a TCP RST. Something like:
>>>>>>> -A OUTPUT -p tcp -d $nfs_server_addr --sport $local_port -j REJECT --reject-with tcp-reset
>>>>>>> (the script removes this iptables rule as soon as the LAST_ACK
>>>>>>> disappears)
>>>>>>>
>>>>>>> The bottom line would be to have a packet capture during the outage
>>>>>>> (client and/or server side); it will show you at least the shape of the
>>>>>>> TCP exchange when NFS is stuck.
>>>>>> Interesting story and good work w.r.t. sleuthing, Youssef, thanks.
>>>>>>
>>>>>> I looked at Jason's log and it shows everything is ok w.r.t the nfsd
>>>>>> threads. (They're just waiting for RPC requests.)
>>>>>> However, I do now think I know why the soclose() does not happen.
>>>>>> When the TCP connection is assigned as a backchannel, that takes a
>>>>>> reference cnt on the structure. This refcnt won't be released until the
>>>>>> connection is replaced by a BindConnectionToSession operation from the
>>>>>> client. But that won't happen until the client creates a new TCP
>>>>>> connection.
>>>>>> --> No refcnt release --> no refcnt of 0 --> no soclose().
>>>>>>
>>>>>> I've created the attached patch (completely different from the previous
>>>>>> one) that adds soshutdown(SHUT_WR) calls in the three places where the
>>>>>> TCP connection is going away. This seems to get it past CLOSE_WAIT
>>>>>> without a soclose().
>>>>>> --> I know you are not comfortable with patching your server, but I do
>>>>>>     think this change will get the socket shutdown to complete.
>>>>>>
>>>>>> There are a couple more things you can check on the server...
>>>>>> # nfsstat -E -s
>>>>>> --> Look for the count under "BindConnToSes".
>>>>>> --> If non-zero, backchannels have been assigned.
>>>>>> # sysctl -a | fgrep request_space_throttle_count
>>>>>> --> If non-zero, the server has been overloaded at some point.
>>>>>>
>>>>>> I think the attached patch might work around the problem.
>>>>>> The code that should open up the receive window needs to be checked.
>>>>>> I am also looking at enabling the 6-minute timeout when a backchannel is
>>>>>> assigned.
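The client-side check script Youssef mentions could look roughly like the
sketch below; the server address is a placeholder, the port extraction is an
assumption about the ss output format, and the rule-removal half he describes
is omitted for brevity:

#!/bin/sh
# hypothetical watchdog: force a TCP RST for sockets stuck in LAST_ACK
# toward the NFS server, using the iptables rule quoted above
nfs_server_addr=192.0.2.10
while true
do
    # with a state filter, ss prints Recv-Q Send-Q Local Peer; the local
    # port is the last ":"-separated field of column 3
    for local_port in `ss -tn state last-ack "dst $nfs_server_addr" |
            awk 'NR>1 {n=split($3,a,":"); print a[n]}'`
    do
        iptables -A OUTPUT -p tcp -d $nfs_server_addr \
            --sport $local_port -j REJECT --reject-with tcp-reset
    done
    sleep 60
done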
>>>>>>
>>>>>> rick
>>>>>>
>>>>>> Youssef
>>>>>>
>>>>>> _______________________________________________
>>>>>> freebsd-net@freebsd.org mailing list
>>>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>>>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@freebsd.org Sat Apr 10 12:19:08 2021
From: tuexen@freebsd.org
To: "Scheffenegger, Richard"
Cc: Rick Macklem, Youssef GHORBAL, "freebsd-net@freebsd.org"
Subject: Re: NFS Mount Hangs
Date: Sat, 10 Apr 2021 14:19:05 +0200
Message-Id: <077ECE2B-A84C-440D-AAAB-00293C841F14@freebsd.org>
> On 10. Apr 2021, at 11:19, Scheffenegger, Richard wrote:
>
> Hi Rick,
>
>> Well, I have some good news and some bad news (the bad is mostly for
>> Richard).
>>
>> The only message logged is:
>> tcpflags 0x4; tcp_do_segment: Timestamp missing, segment processed normally
>>
>> But...the RST battle no longer occurs. Just one RST that works and then the
>> SYN gets SYN,ACK'd by the FreeBSD end and off it goes...
>>
>> So, what is different?
>>
>> r367492 is reverted from the FreeBSD server.
>> I did the revert because I think it might be what the hang reported by otis@
>> is being caused by. (In his case, the Recv-Q grows on the socket for the
>> stuck Linux client, while others work.)
>>
>> Why does reverting fix this?
>> My only guess is that the krpc gets the upcall right away and sees an EPIPE
>> when it does soreceive(), which results in soshutdown(SHUT_WR).
>
> With r367492 you don't get the upcall with the same error state? Or you
> don't get an error on a write() call, when there should be one?
My understanding is that he needs this error indication when calling
shutdown().
>
> From what you describe, this is on writes, isn't it? (I'm asking because the
> original problem that was fixed with r367492 occurs in the read path: the
> so_rcv buffer is drained in the upcall right away, which subsequently
> influences the ACK sent by the stack.)
>
> I only added the so_snd buffer after some discussion about whether the
> WAKESOR shouldn't have a symmetric equivalent in WAKESOW....
>
> Thus a partial backout (leaving the WAKESOR part in, but reverting the
> WAKESOW part) would still fix my initial problem with erroneous DSACKs
> (which can also lead to extremely poor performance with Linux clients), but
> possibly address this issue...
>
> Can you perhaps take MAIN and apply https://reviews.freebsd.org/D29690 for
> the revert only on the so_snd upcall?
Since the release of 13.0 is almost done, can we try to fix the issue instead
of reverting the commit?
>
> If this doesn't help, some major surgery will be necessary to prevent NFS
> sessions with SACK enabled from transmitting DSACKs...
My understanding is that the problem is related to getting a local error
indication after receiving a RST segment too late, or not at all.

Best regards
Michael
>
>> I know from a printf that this happened, but whether it caused the RST
>> battle to not happen, I don't know.
>>
>> I can put r367492 back in and do more testing if you'd like, but I think it
>> probably needs to be reverted?
>
> Please, I don't quite understand why the exact timing of the upcall would be
> that critical here...
>
> A comparison of the soxxx calls and errors between the "good" and the "bad"
> would be perfect.
> I don't know if this is easy to do though, as these calls appear to be
> scattered all around the RPC / NFS source paths.
>
>> This does not explain the original hung Linux client problem, but does shed
>> light on the RST war I could create by doing a network partitioning.
>>
>> rick
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@freebsd.org Sat Apr 10 14:40:28 2021
Date: Sat, 10 Apr 2021 14:40:24 +0000
From: "Scheffenegger, Richard"
To: "tuexen@freebsd.org"
CC: Rick Macklem, Youssef GHORBAL, "freebsd-net@freebsd.org"
Subject: Re: NFS Mount Hangs
________________________________
From: tuexen@freebsd.org
Sent: Saturday, April 10, 2021 2:19 PM
To: Scheffenegger, Richard
Cc: Rick Macklem; Youssef GHORBAL; freebsd-net@freebsd.org
Subject: Re: NFS Mount Hangs

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.

> On 10. Apr 2021, at 11:19, Scheffenegger, Richard wrote:
>
> Hi Rick,
>
>> Well, I have some good news and some bad news (the bad is mostly for Richard).
>>
>> The only message logged is:
>> tcpflags 0x4; tcp_do_segment: Timestamp missing, segment processed normally
>>
>> But...the RST battle no longer occurs. Just one RST that works and then the SYN gets SYN,ACK'd by the FreeBSD end and off it goes...
>>
>> So, what is different?
>>
>> r367492 is reverted from the FreeBSD server.
>> I did the revert because I think it might be what the otis@ hang is being caused by. (In his case, the Recv-Q grows on the socket for the stuck Linux client, while others work.)
>>
>> Why does reverting fix this?
>> My only guess is that the krpc gets the upcall right away and sees an EPIPE when it does soreceive(), which results in soshutdown(SHUT_WR).
>
> With r367492 you don't get the upcall with the same error state? Or you don't get an error on a write() call, when there should be one?

My understanding is that he needs this error indication when calling shutdown().

> From what you describe, this is on writes, isn't it? (I'm asking because the original problem that was fixed with r367492 occurs in the read path: draining of the so_rcv buffer in the upcall right away, which subsequently influences the ACK sent by the stack.)
>
> I only added the so_snd buffer after some discussion about whether the WAKESOR shouldn't have a symmetric equivalent, WAKESOW...
>
> Thus a partial backout (leaving the WAKESOR part in, but reverting the WAKESOW part) would still fix my initial problem with erroneous DSACKs (which can also lead to extremely poor performance with Linux clients), but possibly address this issue...
>
> Can you perhaps take MAIN and apply https://reviews.freebsd.org/D29690 for the revert only on the so_snd upcall?

Since the release of 13.0 is almost done, can we try to fix the issue instead of reverting the commit?

Rs: Agree; a good understanding of where the interaction between the stack, the socket, and the in-kernel TCP user breaks is needed.

> If this doesn't help, some major surgery will be necessary to prevent NFS sessions with SACK enabled from transmitting DSACKs...

My understanding is that the problem is related to getting a local error indication after receiving a RST segment too late or not at all.

Rs: But the move of the upcall should not materially change that; I don't have a PC here to see if any upcall actually happens on RST...

Best regards
Michael

>
>> I know from a printf that this happened, but whether it caused the RST battle to not happen, I don't know.
>>
>> I can put r367492 back in and do more testing if you'd like, but I think it probably needs to be reverted?
>
> Please, I don't quite understand why the exact timing of the upcall would be that critical here...
>
> A comparison of the soxxx calls and errors between the "good" and the "bad" would be perfect. I don't know if this is easy to do though, as these calls appear to be scattered all around the RPC / NFS source paths.
>
>> This does not explain the original hung Linux client problem, but does shed light on the RST war I could create by doing a network partitioning.
>>
>> rick
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@freebsd.org Sat Apr 10 15:04:21 2021
From: Rick Macklem <rmacklem@uoguelph.ca>
To: "tuexen@freebsd.org"
CC: "Scheffenegger, Richard", Youssef GHORBAL, "freebsd-net@freebsd.org"
Subject: Re: NFS Mount Hangs
Date: Sat, 10 Apr 2021 15:04:16 +0000
Content-Type: text/plain; charset="Windows-1252"
tuexen@freebsd.org wrote:
>> On 10. Apr 2021, at 02:44, Rick Macklem wrote:
>>
>> tuexen@freebsd.org wrote:
>>>> On 6. Apr 2021, at 01:24, Rick Macklem wrote:
>>>>
>>>> tuexen@freebsd.org wrote:
>>>> [stuff snipped]
>>>>> OK. What is the FreeBSD version you are using?
>>>> main Dec. 23, 2020.
>>>>
>>>>>
>>>>> It seems that the TCP connection on the FreeBSD side is still alive;
>>>>> Linux has decided to start a new TCP connection using the old
>>>>> port numbers. So it sends a SYN. The response is a challenge ACK
>>>>> and Linux responds with a RST. This looks good so far. However,
>>>>> FreeBSD should accept the RST and kill the TCP connection. The
>>>>> next SYN from the Linux side would establish a new TCP connection.
>>>>>
>>>>> So I'm wondering why the RST is not accepted. I made the timestamp
>>>>> checking stricter but introduced a bug where RST segments without
>>>>> timestamps were ignored. This was fixed.
>>>>>
>>>>> Introduced in main on 2020/11/09:
>>>>> https://svnweb.freebsd.org/changeset/base/367530
>>>>> Introduced in stable/12 on 2020/11/30:
>>>>> https://svnweb.freebsd.org/changeset/base/36818
>>>>> Fix in main on 2021/01/13:
>>>>> https://cgit.FreeBSD.org/src/commit/?id=cc3c34859eab1b317d0f38731355b53f7d978c97
>>>>> Fix in stable/12 on 2021/01/24:
>>>>> https://cgit.FreeBSD.org/src/commit/?id=d05d908d6d3c85479c84c707f931148439ae826b
>>>>>
>>>>> Are you using a version which is affected by this bug?
>>>> I was. Now I've applied the patch.
>>>> Bad news: it did not fix the problem.
>>>> It still gets into an endless "ignore RST" state and stays ESTABLISHED when
>>>> the Send-Q is empty.
>>> OK. Let us focus on this case.
>>>
>>> Could you:
>>> 1. sudo sysctl net.inet.tcp.log_debug=1
>>> 2. repeat the situation where RSTs are ignored.
>>> 3. check if there is some output on the console (/var/log/messages).
>>> 4. Either provide the output or let me know that there is none.
>> Well, I have some good news and some bad news (the bad is mostly for Richard).
>> The only message logged is:
>> tcpflags 0x4; tcp_do_segment: Timestamp missing, segment processed normally
>>
>> But...the RST battle no longer occurs. Just one RST that works and then
>> the SYN gets SYN,ACK'd by the FreeBSD end and off it goes...
> The above is what I would expect if you integrated cc3c34859eab1b317d0f38731355b53f7d978c97
> or reverted r367530. Did you do that?
r367530 is in the kernel that does not cause the "RST battle".
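(Aside, for anyone reproducing the check: Michael's four numbered steps above boil down to roughly the sketch below. The sysctl name is as he gave it; the log path assumes the stock syslogd configuration, so adjust if yours differs.)

#!/bin/sh
# Sketch: enable TCP debug logging, reproduce the ignored-RST situation,
# then look for tcp_do_segment messages in the system log.
sudo sysctl net.inet.tcp.log_debug=1
# ... reproduce the network partition / RST scenario here ...
grep tcp_do_segment /var/log/messages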
>
>
> So, what is different?
>
> r367492 is reverted from the FreeBSD server.
Only that? So you still have the bug I introduced in the tree, but the RST segment is accepted?
No. The kernel being tested includes the fix (you committed mid-January) for the bug
that went in in November.
However, adding the mid-January patch did not fix the problem.
Then reverting r367492 (and only r367492) made the problem go away.

I did not expect reverting r367492 to affect this.
I reverted r367492 because otis@ gets Linux client mounts "stuck" against a FreeBSD 13
NFS server, where the Recv-Q size grows and the client gets no RPC replies. Other
clients are still working fine. I can only think of one explanation for this:
- An upcall gets missed or occurs at the wrong time.
--> Since what this patch does is move where the upcalls are done, it is the logical
    culprit.
    Hopefully otis@ will be able to determine if reverting r367492 fixes the problem.
    This will take weeks, since the problem recently took two weeks to recur.
--> This would be the receive path, so reverting the send path would not be
    relevant.
*** I'd like to hear from otis@ before testing a "send path only" revert.
--> Also, it has been a long time since I worked on the socket upcall code, but I
    vaguely remember that the upcalls needed to be done before SOCKBUF_LOCK()
    is dropped to ensure that the socket buffer is in the expected state.
    r367492 drops SOCKBUF_LOCK() and then picks it up again for the upcalls.

I'll send you guys the otis@ problem email. (I don't think that one is cc'd to a list.)

rick

Best regards
Michael
> I did the revert because I think it might be what the otis@ hang is being
> caused by. (In his case, the Recv-Q grows on the socket for the
> stuck Linux client, while others work.)
>
> Why does reverting fix this?
> My only guess is that the krpc gets the upcall right away and sees
> an EPIPE when it does soreceive(), which results in soshutdown(SHUT_WR).
> I know from a printf that this happened, but whether it caused the
> RST battle to not happen, I don't know.
>
> I can put r367492 back in and do more testing if you'd like, but
> I think it probably needs to be reverted?
>
> This does not explain the original hung Linux client problem,
> but does shed light on the RST war I could create by doing a
> network partitioning.
>
> rick
>
> Best regards
> Michael
>> If the Send-Q is non-empty when I partition, it recovers fine,
>> sometimes not even needing to see an RST.
>>
>> rick
>> ps: If you think there might be other recent changes that matter,
>> just say the word and I'll upgrade to bits du jour.
>>
>> rick
>>
>> Best regards
>> Michael
>>>
>>> If I wait long enough before healing the partition, it will
>>> go to FIN_WAIT_1, and then if I plug it back in, it does not
>>> do battle (at least not for long).
>>>
>>> Btw, I have one running now that seems stuck really good.
>>> It has been 20 minutes since I plugged the net cable back in.
>>> (Unfortunately, I didn't have tcpdump running until after
>>> I saw it was not progressing after healing.)
>>> --> There is one difference. There was a 6-minute timeout
>>>     enabled on the server krpc for "no activity", which is
>>>     now disabled like it is for NFSv4.1 in freebsd-current.
>>>     I had forgotten to re-disable it.
>>>     So, when it does battle, it might have been the 6-minute
>>>     timeout, which would then do the soshutdown(..SHUT_WR)
>>>     which kept it from getting "stuck" forever.
>>> --> This time I had to reboot the FreeBSD NFS server to
>>>     get the Linux client unstuck, so this one looked a lot
>>>     like what has been reported.
>>> The pcap for this one, started after the network was plugged
>>> back in and I noticed it was stuck for quite a while, is here:
>>> fetch https://people.freebsd.org/~rmacklem/stuck.pcap
>>>
>>> In it, there is just a bunch of RSTs followed by SYNs sent
>>> from client->FreeBSD and FreeBSD just keeps sending
>>> acks for the old segment back.
>>> --> It looks like FreeBSD did the "RST, ACK" after the
>>>     krpc did a soshutdown(..SHUT_WR) on the socket,
>>>     for the one you've been looking at.
>>> I'll test some more...
>>>
>>>> I would like to understand why the reestablishment of the connection
>>>> did not work...
>>> It is looking like it takes either a non-empty send-q or a
>>> soshutdown(..SHUT_WR) to get the FreeBSD socket
>>> out of ESTABLISHED, where it just ignores the RSTs and
>>> SYN packets.
>>>
>>> Thanks for looking at it, rick
>>>
>>> Best regards
>>> Michael
>>>>
>>>> Have fun with it, rick
>>>>
>>>>
>>>> ________________________________________
>>>> From: tuexen@freebsd.org
>>>> Sent: Sunday, April 4, 2021 12:41 PM
>>>> To: Rick Macklem
>>>> Cc: Scheffenegger, Richard; Youssef GHORBAL; freebsd-net@freebsd.org
>>>> Subject: Re: NFS Mount Hangs
>>>>
>>>> CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp@uoguelph.ca
>>>>
>>>>
>>>>> On 4. Apr 2021, at 17:27, Rick Macklem wrote:
>>>>>
>>>>> Well, I'm going to cheat and top post, since this is related info and
>>>>> not really part of the discussion...
>>>>>
>>>>> I've been testing network partitioning between a Linux client (5.2 kernel)
>>>>> and a FreeBSD-current NFS server. I have not gotten a solid hang, but
>>>>> I have had the Linux client doing "battle" with the FreeBSD server for
>>>>> several minutes after un-partitioning the connection.
>>>>>
>>>>> The battle basically consists of the Linux client sending an RST, followed
>>>>> by a SYN.
>>>>> The FreeBSD server ignores the RST and just replies with the same old ack.
>>>>> --> This varies from "just a SYN" that succeeds to 100+ cycles of the above
>>>>>     over several minutes.
>>>>>
>>>>> I had thought that an RST was a "pretty heavy hammer", but FreeBSD seems
>>>>> pretty good at ignoring it.
>>>>>
>>>>> A full packet capture of one of these is in /home/rmacklem/linuxtofreenfs.pcap
>>>>> in case anyone wants to look at it.
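(If anyone wants to take an equivalent capture on their own setup, something like the sketch below should do; the interface name and client host are placeholders for whatever your server uses.)

#!/bin/sh
# Sketch: capture the full NFS conversation with one client to a pcap
# file; "em0" and "nfsv4-linux" are placeholders for your NIC and client.
tcpdump -i em0 -n -s 0 -w linuxtofreenfs.pcap host nfsv4-linux and port 2049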
>>>> On freefall? I would like to take a look at it...
>>>>
>>>> Best regards
>>>> Michael
>>>>>
>>>>> Here's a tcpdump snippet of the interesting part (see the *** comments):
>>>>> 19:10:09.305775 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 202585:202749, ack 212293, win 29128, options [nop,nop,TS val 2073636037 ecr 2671204825], length 164: NFS reply xid 613153685 reply ok 160 getattr NON 4 ids 0/33554432 sz 0
>>>>> 19:10:09.305850 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 202749, win 501, options [nop,nop,TS val 2671204825 ecr 2073636037], length 0
>>>>> *** Network is now partitioned...
>>>>>
>>>>> 19:10:09.407840 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671204927 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>>> 19:10:09.615779 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205135 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>>> 19:10:09.823780 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205343 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>>> *** Lots of lines snipped.
>>>>>
>>>>>
>>>>> 19:13:41.295783 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>>> 19:13:42.319767 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>>> 19:13:46.351966 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>>> 19:13:47.375790 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>>> 19:13:48.399786 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>>> *** Network is now unpartitioned...
>>>>>
>>>>> 19:13:48.399990 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
>>>>> 19:13:48.400002 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671421871 ecr 0,nop,wscale 7], length 0
>>>>> 19:13:48.400185 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073855137 ecr 2671204825], length 0
>>>>> 19:13:48.400273 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
>>>>> 19:13:49.423833 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671424943 ecr 0,nop,wscale 7], length 0
>>>>> 19:13:49.424056 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073856161 ecr 2671204825], length 0
>>>>> *** This "battle" goes on for 223sec...
>>>>>     I snipped out 13 cycles of this "Linux sends an RST, followed by SYN",
>>>>>     "FreeBSD replies with same old ACK". In another test run I saw this
>>>>>     cycle continue non-stop for several minutes. This time, the Linux
>>>>>     client paused for a while (see ARPs below).
>>>>>
>>>>> 19:13:49.424101 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
>>>>> 19:13:53.455867 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671428975 ecr 0,nop,wscale 7], length 0
>>>>> 19:13:53.455991 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073860193 ecr 2671204825], length 0
>>>>> *** Snipped a bunch of stuff out, mostly ARPs, plus one more RST.
>>>>>
>>>>> 19:16:57.775780 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>>> 19:16:57.775937 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
>>>>> 19:16:57.980240 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
>>>>> 19:16:58.555663 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
>>>>> 19:17:00.104701 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074046846 ecr 2671204825], length 0
>>>>> 19:17:15.664354 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074062406 ecr 2671204825], length 0
>>>>> 19:17:31.239246 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [R.], seq 202750, ack 212293, win 0, options [nop,nop,TS val 2074077981 ecr 2671204825], length 0
>>>>> *** FreeBSD finally acknowledges the RST 38sec after Linux sent the last
>>>>>     of 13 (100+ for another test run).
>>>>>
>>>>> 19:17:51.535979 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 4247692373, win 64240, options [mss 1460,sackOK,TS val 2671667055 ecr 0,nop,wscale 7], length 0
>>>>> 19:17:51.536130 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [S.], seq 661237469, ack 4247692374, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2074098278 ecr 2671667055], length 0
>>>>> *** Now back in business...
>>>>>
>>>>> 19:17:51.536218 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 1, win 502, options [nop,nop,TS val 2671667055 ecr 2074098278], length 0
>>>>> 19:17:51.536295 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 1:233, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>>> 19:17:51.536346 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 233:505, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 272: NFS request xid 697039765 132 getattr fh 0,1/53
>>>>> 19:17:51.536515 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 0
>>>>> 19:17:51.536553 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 505:641, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098279], length 136: NFS request xid 730594197 132 getattr fh 0,1/53
>>>>> 19:17:51.536562 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 1:49, ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 48: NFS reply xid 697039765 reply ok 44 getattr ERROR: unk 10063
>>>>>
>>>>> This error 10063 after the partition heals is also "bad news". It indicates the Session
>>>>> (which is supposed to maintain "exactly once" RPC semantics) is broken. I'll admit I
>>>>> suspect a Linux client bug, but will be investigating further.
>>>>>
>>>>> So, hopefully TCP conversant folk can confirm if the above is correct behaviour
>>>>> or if the RST should be ack'd sooner?
>>>>>
>>>>> I could also see this becoming a "forever" TCP battle for other versions of Linux client.
>>>>>
>>>>> rick
>>>>>
>>>>>
>>>>> ________________________________________
>>>>> From: Scheffenegger, Richard
>>>>> Sent: Sunday, April 4, 2021 7:50 AM
>>>>> To: Rick Macklem; tuexen@freebsd.org
>>>>> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
>>>>> Subject: Re: NFS Mount Hangs
>>>>>
>>>>> CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to IThelp@uoguelph.ca
>>>>>
>>>>>
>>>>> For what it's worth, SUSE found two bugs in the Linux nf_conntrack (stateful firewall) and the pfifo-fast scheduler, which could conspire to make TCP sessions hang forever.
>>>>>
>>>>> One is a missed update when the client is not using the noresvport mount option, which makes the firewall think RSTs are illegal (and drop them);
>>>>>
>>>>> The fast scheduler can run into an issue if only a single packet should be forwarded (note that this is not the default scheduler, but often recommended for performance, as it runs lockless and at lower CPU cost than pfq (the default)). If no other/additional packet pushes out that last packet of a flow, it can become stuck forever...
>>>>>
>>>>> I can try getting the relevant bug info next week...
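(On the Linux client side, the two conditions Richard mentions can at least be looked for with standard tools; a rough sketch, with "eth0" as a placeholder interface and assuming the iproute2 and conntrack-tools packages are installed. This may or may not surface the specific SUSE bugs, but it shows the qdisc backlog and conntrack error counters.)

#!/bin/sh
# Sketch: check whether the interface runs pfifo-fast and whether packets
# are sitting in its backlog, then glance at conntrack statistics.
tc -s qdisc show dev eth0
conntrack -S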
>>>>>
>>>>> ________________________________
>>>>> From: owner-freebsd-net@freebsd.org on behalf of Rick Macklem
>>>>> Sent: Friday, April 2, 2021 11:31:01 PM
>>>>> To: tuexen@freebsd.org
>>>>> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
>>>>> Subject: Re: NFS Mount Hangs
>>>>>
>>>>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> tuexen@freebsd.org wrote:
>>>>>>> On 2. Apr 2021, at 02:07, Rick Macklem wrote:
>>>>>>>
>>>>>>> I hope you don't mind a top post...
>>>>>>> I've been testing network partitioning between the only Linux client
>>>>>>> I have (5.2 kernel) and a FreeBSD server with the xprtdied.patch
>>>>>>> (does soshutdown(..SHUT_WR) when it knows the socket is broken)
>>>>>>> applied to it.
>>>>>>>
>>>>>>> I'm not enough of a TCP guy to know if this is useful, but here's what
>>>>>>> I see...
>>>>>>>
>>>>>>> While partitioned:
>>>>>>> On the FreeBSD server end, the socket either goes to CLOSED during
>>>>>>> the network partition or stays ESTABLISHED.
>>>>>> If it goes to CLOSED you called shutdown(, SHUT_WR) and the peer also
>>>>>> sent a FIN, but you never called close() on the socket.
>>>>>> If the socket stays in ESTABLISHED, there is no communication ongoing,
>>>>>> I guess, and therefore the server does not even detect that the peer
>>>>>> is not reachable.
>>>>>>> On the Linux end, the socket seems to remain ESTABLISHED for a
>>>>>>> little while, and then disappears.
>>>>>> So how does Linux detect the peer is not reachable?
>>>>> Well, here's what I see in a packet capture in the Linux client once
>>>>> I partition it (just unplug the net cable):
>>>>> - lots of retransmits of the same segment (with ACK) for 54sec
>>>>> - then only ARP queries
>>>>>
>>>>> Once I plug the net cable back in:
>>>>> - ARP works
>>>>> - one more retransmit of the same segment
>>>>> - receives RST from FreeBSD
>>>>> ** So, is this now a "new" TCP connection, despite
>>>>>    using the same port#?
>>>>>    --> It matters for NFS, since "new connection"
>>>>>        implies "must retry all outstanding RPCs".
>>>>> - sends SYN
>>>>> - receives SYN, ACK from FreeBSD
>>>>> --> connection starts working again
>>>>>     Always uses same port#.
>>>>>
>>>>> On the FreeBSD server end:
>>>>> - receives the last retransmit of the segment (with ACK)
>>>>> - sends RST
>>>>> - receives SYN
>>>>> - sends SYN, ACK
>>>>>
>>>>> I thought that there was no RST in the capture I looked at
>>>>> yesterday, so I'm not sure if FreeBSD always sends an RST,
>>>>> but the Linux client behaviour was the same. (Sent a SYN, etc.)
>>>>> The socket disappears from the Linux "netstat -a" and I
>>>>> suspect that happens after about 54sec, but I am not sure
>>>>> about the timing.
>>>>>
>>>>>>>
>>>>>>> After unpartitioning:
>>>>>>> On the FreeBSD server end, you get another socket showing up at
>>>>>>> the same port#
>>>>>>> Active Internet connections (including servers)
>>>>>>> Proto Recv-Q Send-Q Local Address          Foreign Address        (state)
>>>>>>> tcp4       0      0 nfsv4-new3.nfsd        nfsv4-linux.678        ESTABLISHED
>>>>>>> tcp4       0      0 nfsv4-new3.nfsd        nfsv4-linux.678        CLOSED
>>>>>>>
>>>>>>> The Linux client shows the same connection ESTABLISHED.
>>>>> But disappears from "netstat -a" for a while during the partitioning.
>>>>>
>>>>>>> (The mount sometimes reports an error. I haven't looked at packet
>>>>>>> traces to see if it retries RPCs or why the errors occur.)
>>>>> I have now done so, as above.
>>>>>
>>>>>>> --> However I never get hangs.
>>>>>>> Sometimes it goes to SYN_SENT for a while and the FreeBSD server
>>>>>>> shows FIN_WAIT_1, but then both ends go to ESTABLISHED and the
>>>>>>> mount starts working again.
>>>>>>>
>>>>>>> The most obvious thing is that the Linux client always keeps using
>>>>>>> the same port#. (The FreeBSD client will use a different port# when
>>>>>>> it does a TCP reconnect after no response from the NFS server for
>>>>>>> a little while.)
>>>>>>>
>>>>>>> What do those TCP conversant think?
>>>>>> I guess you are never calling close() on the socket, for which
>>>>>> the connection state is CLOSED.
>>>>> Ok, that makes sense. For this case the Linux client has not done a
>>>>> BindConnectionToSession to re-assign the back channel.
>>>>> I'll have to bug them about this. However, I'll bet they'll answer
>>>>> that I have to tell them the back channel needs re-assignment
>>>>> or something like that.
>>>>>
>>>>> I am pretty certain they are broken, in that the client needs to
>>>>> retry all outstanding RPCs.
>>>>>
>>>>> For others, here's the long winded version of this that I just
>>>>> put on the phabricator review:
>>>>> In the server side kernel RPC, the socket (struct socket *) is in a
>>>>> structure called SVCXPRT (normally pointed to by "xprt").
>>>>> These structures are ref counted and the soclose() is done
>>>>> when the ref. cnt goes to zero. My understanding is that
>>>>> "struct socket *" is free'd by soclose() so this cannot be done
>>>>> before the xprt ref. cnt goes to zero.
>>>>>
>>>>> For NFSv4.1/4.2 there is something called a back channel
>>>>> which means that a "xprt" is used for server->client RPCs,
>>>>> although the TCP connection is established by the client
>>>>> to the server.
>>>>> --> This back channel holds a ref cnt on "xprt" until the
>>>>>     client re-assigns it to a different TCP connection
>>>>>     via an operation called BindConnectionToSession,
>>>>>     and the Linux client is not doing this soon enough,
>>>>>     it appears.
>>>>>
>>>>> So, the soclose() is delayed, which is why I think the
>>>>> TCP connection gets stuck in CLOSE_WAIT and that is
>>>>> why I've added the soshutdown(..SHUT_WR) calls,
>>>>> which can happen before the client gets around to
>>>>> re-assigning the back channel.
>>>>>
>>>>> Thanks for your help with this Michael, rick
>>>>>
>>>>> Best regards
>>>>> Michael
>>>>>>
>>>>>> rick
>>>>>> ps: I can capture packets while doing this, if anyone has a use
>>>>>> for them.
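(The duplicate-socket situation in the netstat output above is easy to watch for; a sketch for the FreeBSD server side, using only the standard netstat:)

#!/bin/sh
# Sketch: list the nfsd TCP sockets and their states; a lingering CLOSED
# or CLOSE_WAIT entry alongside the ESTABLISHED one is the symptom above.
netstat -an -p tcp | grep '\.2049 '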
If in doubt, forward suspicious emails to IT= help@uoguelph.ca=0A= >>>>>>=0A= >>>>>>=0A= >>>>>>=0A= >>>>>>=0A= >>>>>> On 27 Mar 2021, at 13:20, Jason Breitman > wrote:=0A= >>>>>>=0A= >>>>>> The issue happened again so we can say that disabling TSO and LRO on= the NIC did not resolve this issue.=0A= >>>>>> # ifconfig lagg0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhw= tso=0A= >>>>>> # ifconfig lagg0=0A= >>>>>> lagg0: flags=3D8943 = metric 0 mtu 1500=0A= >>>>>> options=3D8100b8=0A= >>>>>>=0A= >>>>>> We can also say that the sysctl settings did not resolve this issue.= =0A= >>>>>>=0A= >>>>>> # sysctl net.inet.tcp.fast_finwait2_recycle=3D1=0A= >>>>>> net.inet.tcp.fast_finwait2_recycle: 0 -> 1=0A= >>>>>>=0A= >>>>>> # sysctl net.inet.tcp.finwait2_timeout=3D1000=0A= >>>>>> net.inet.tcp.finwait2_timeout: 60000 -> 1000=0A= >>>>>>=0A= >>>>>> I don=92t think those will do anything in your case since the FIN_WA= IT2 are on the client side and those sysctls are for BSD.=0A= >>>>>> By the way it seems that Linux recycles automatically TCP sessions i= n FIN_WAIT2 after 60 seconds (sysctl net.ipv4.tcp_fin_timeout)=0A= >>>>>>=0A= >>>>>> tcp_fin_timeout (integer; default: 60; since Linux 2.2)=0A= >>>>>> This specifies how many seconds to wait for a final FIN=0A= >>>>>> packet before the socket is forcibly closed. This is=0A= >>>>>> strictly a violation of the TCP specification, but=0A= >>>>>> required to prevent denial-of-service attacks. In Linux=0A= >>>>>> 2.2, the default value was 180.=0A= >>>>>>=0A= >>>>>> So I don=92t get why it stucks in the FIN_WAIT2 state anyway.=0A= >>>>>>=0A= >>>>>> You really need to have a packet capture during the outage (client a= nd server side) so you=92ll get over the wire chat and start speculating fr= om there.=0A= >>>>>> No need to capture the beginning of the outage for now. 
All you have= to do, is run a tcpdump for 10 minutes or so when you notice a client stuc= k.=0A= >>>>>>=0A= >>>>>> * I have not rebooted the NFS Server nor have I restarted nfsd, but = do not believe that is required as these settings are at the TCP level and = I would expect new sessions to use the updated settings.=0A= >>>>>>=0A= >>>>>> The issue occurred after 5 days following a reboot of the client mac= hines.=0A= >>>>>> I ran the capture information again to make use of the situation.=0A= >>>>>>=0A= >>>>>> #!/bin/sh=0A= >>>>>>=0A= >>>>>> while true=0A= >>>>>> do=0A= >>>>>> /bin/date >> /tmp/nfs-hang.log=0A= >>>>>> /bin/ps axHl | grep nfsd | grep -v grep >> /tmp/nfs-hang.log=0A= >>>>>> /usr/bin/procstat -kk 2947 >> /tmp/nfs-hang.log=0A= >>>>>> /usr/bin/procstat -kk 2944 >> /tmp/nfs-hang.log=0A= >>>>>> /bin/sleep 60=0A= >>>>>> done=0A= >>>>>>=0A= >>>>>>=0A= >>>>>> On the NFS Server=0A= >>>>>> Active Internet connections (including servers)=0A= >>>>>> Proto Recv-Q Send-Q Local Address Foreign Address (s= tate)=0A= >>>>>> tcp4 0 0 NFS.Server.IP.X.2049 NFS.Client.IP.X.48286 = CLOSE_WAIT=0A= >>>>>>=0A= >>>>>> On the NFS Client=0A= >>>>>> tcp 0 0 NFS.Client.IP.X:48286 NFS.Server.IP.X:2049 = FIN_WAIT2=0A= >>>>>>=0A= >>>>>>=0A= >>>>>>=0A= >>>>>> You had also asked for the output below.=0A= >>>>>>=0A= >>>>>> # nfsstat -E -s=0A= >>>>>> BackChannelCtBindConnToSes=0A= >>>>>> 0 0=0A= >>>>>>=0A= >>>>>> # sysctl vfs.nfsd.request_space_throttle_count=0A= >>>>>> vfs.nfsd.request_space_throttle_count: 0=0A= >>>>>>=0A= >>>>>> I see that you are testing a patch and I look forward to seeing the = results.=0A= >>>>>>=0A= >>>>>>=0A= >>>>>> Jason Breitman=0A= >>>>>>=0A= >>>>>>=0A= >>>>>> On Mar 21, 2021, at 6:21 PM, Rick Macklem > wrote:=0A= >>>>>>=0A= >>>>>> Youssef GHORBAL > wrote:=0A= >>>>>>> Hi Jason,=0A= >>>>>>>=0A= >>>>>>>> On 17 Mar 2021, at 18:17, Jason Breitman > wrote:=0A= >>>>>>>>=0A= >>>>>>>> Please review the details below and let me know if there is a sett= ing that I should apply to my FreeBSD NFS Server or if there is a bug fix t= hat I can apply to resolve my issue.=0A= >>>>>>>> I shared this information with the linux-nfs mailing list and they= believe the issue is on the server side.=0A= >>>>>>>>=0A= >>>>>>>> Issue=0A= >>>>>>>> NFSv4 mounts periodically hang on the NFS Client.=0A= >>>>>>>>=0A= >>>>>>>> During this time, it is possible to manually mount from another NF= S Server on the NFS Client having issues.=0A= >>>>>>>> Also, other NFS Clients are successfully mounting from the NFS Ser= ver in question.=0A= >>>>>>>> Rebooting the NFS Client appears to be the only solution.=0A= >>>>>>>=0A= >>>>>>> I had experienced a similar weird situation with periodically stuck= Linux NFS clients >mounting Isilon NFS servers (Isilon is FreeBSD based bu= t they seem to have there >own nfsd)=0A= >>>>>> Yes, my understanding is that Isilon uses a proprietary user space n= fsd and=0A= >>>>>> not the kernel based RPC and nfsd in FreeBSD.=0A= >>>>>>=0A= >>>>>>> We=92ve had better luck and we did manage to have packet captures o= n both sides >during the issue. 
>>>>>>> We've had better luck, and we did manage to have packet captures on both sides during the issue. The gist of it goes as follows:
>>>>>>>
>>>>>>> - Data flows correctly between SERVER and the CLIENT
>>>>>>> - At some point SERVER starts decreasing its TCP Receive Window until it reaches 0
>>>>>>> - The client (eager to send data) can only ack data sent by SERVER.
>>>>>>> - When SERVER was done sending data, the client starts sending TCP Window Probes hoping that the TCP Window opens again so it can flush its buffers.
>>>>>>> - SERVER responds with a TCP Zero Window to those probes.
>>>>>> Having the window size drop to zero is not necessarily incorrect.
>>>>>> If the server is overloaded (has a backlog of NFS requests), it can stop doing
>>>>>> soreceive() on the socket (so the socket rcv buffer can fill up and the TCP window
>>>>>> closes). This results in "backpressure" to stop the NFS client from flooding the
>>>>>> NFS server with requests.
>>>>>> --> However, once the backlog is handled, the nfsd should start to soreceive()
>>>>>>     again and this should cause the window to open back up.
>>>>>> --> Maybe this is broken in the socket/TCP code. I quickly got lost in
>>>>>>     tcp_output() when it decides what to do about the rcvwin.
>>>>>>
>>>>>>> - After 6 minutes (the NFS server default Idle timeout) SERVER gracefully closes the TCP connection, sending a FIN Packet (and still a TCP Window 0)
>>>>>> This probably does not happen for Jason's case, since the 6-minute timeout
>>>>>> is disabled when the TCP connection is assigned as a backchannel (most likely
>>>>>> the case for NFSv4.1).
>>>>>>
>>>>>>> - CLIENT ACKs that FIN.
>>>>>>> - SERVER goes in FIN_WAIT_2 state
>>>>>>> - CLIENT closes its half of the socket and goes in LAST_ACK state.
>>>>>>> - FIN is never sent by the client since there is still data in its SendQ and the receiver's TCP Window is still 0. At this stage the client starts sending TCP Window Probes again and again, hoping that the server opens its TCP Window so it can flush its buffers and terminate its side of the socket.
>>>>>>> - SERVER keeps responding with a TCP Zero Window to those probes.
>>>>>>> => The last two steps go on and on for hours/days, freezing the NFS mount bound to that TCP session.
>>>>>>>
>>>>>>> If we had a situation where CLIENT was responsible for closing the TCP Window (and initiating the TCP FIN first) and the server wanted to send data, we would end up in the same state as you, I think.
>>>>>>>
>>>>>>> We've never had the root cause of why the SERVER decided to close the TCP Window and no longer accept data; the fix on the Isilon part was to recycle the FIN_WAIT_2 sockets more aggressively (net.inet.tcp.fast_finwait2_recycle=1 & net.inet.tcp.finwait2_timeout=5000).
Once the socket = recycled and at the next >occurence of CLIENT TCP Window probe, SERVER send= s a RST, triggering the >teardown of the session on the client side, a new = TCP handchake, etc and traffic >flows again (NFS starts responding)=0A= >>>>>>>=0A= >>>>>>> To avoid rebooting the client (and before the aggressive FIN_WAIT_2= was >implemented on the Isilon side) we=92ve added a check script on the c= lient that detects >LAST_ACK sockets on the client and through iptables rul= e enforces a TCP RST, >Something like: -A OUTPUT -p tcp -d $nfs_server_addr= --sport $local_port -j REJECT >--reject-with tcp-reset (the script removes= this iptables rule as soon as the LAST_ACK >disappears)=0A= >>>>>>>=0A= >>>>>>> The bottom line would be to have a packet capture during the outage= (client and/or >server side), it will show you at least the shape of the T= CP exchange when NFS is >stuck.=0A= >>>>>> Interesting story and good work w.r.t. sluething, Youssef, thanks.= =0A= >>>>>>=0A= >>>>>> I looked at Jason's log and it shows everything is ok w.r.t the nfsd= threads.=0A= >>>>>> (They're just waiting for RPC requests.)=0A= >>>>>> However, I do now think I know why the soclose() does not happen.=0A= >>>>>> When the TCP connection is assigned as a backchannel, that takes a r= eference=0A= >>>>>> cnt on the structure. This refcnt won't be released until the connec= tion is=0A= >>>>>> replaced by a BindConnectiotoSession operation from the client. But = that won't=0A= >>>>>> happen until the client creates a new TCP connection.=0A= >>>>>> --> No refcnt release-->no refcnt of 0-->no soclose().=0A= >>>>>>=0A= >>>>>> I've created the attached patch (completely different from the previ= ous one)=0A= >>>>>> that adds soshutdown(SHUT_WR) calls in the three places where the TC= P=0A= >>>>>> connection is going away. 
This seems to get it past CLOSE_WAIT witho= ut a=0A= >>>>>> soclose().=0A= >>>>>> --> I know you are not comfortable with patching your server, but I = do think=0A= >>>>>> this change will get the socket shutdown to complete.=0A= >>>>>>=0A= >>>>>> There are a couple more things you can check on the server...=0A= >>>>>> # nfsstat -E -s=0A= >>>>>> --> Look for the count under "BindConnToSes".=0A= >>>>>> --> If non-zero, backchannels have been assigned=0A= >>>>>> # sysctl -a | fgrep request_space_throttle_count=0A= >>>>>> --> If non-zero, the server has been overloaded at some point.=0A= >>>>>>=0A= >>>>>> I think the attached patch might work around the problem.=0A= >>>>>> The code that should open up the receive window needs to be checked.= =0A= >>>>>> I am also looking at enabling the 6minute timeout when a backchannel= is=0A= >>>>>> assigned.=0A= >>>>>>=0A= >>>>>> rick=0A= >>>>>>=0A= >>>>>> Youssef=0A= >>>>>>=0A= >>>>>> _______________________________________________=0A= >>>>>> freebsd-net@freebsd.org mailing list= =0A= >>>>>> https://urldefense.com/v3/__https://lists.freebsd.org/mailman/listin= fo/freebsd-net__;!!JFdNOqOXpB6UZW0!_c2MFNbir59GXudWPVdE5bNBm-qqjXeBuJ2UEmFv= 5OZciLj4ObR_drJNv5yryaERfIbhKR2d$=0A= >>>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.or= g"=0A= >>>>>> =0A= >>>>>>=0A= >>>>>> =0A= >>>>>>=0A= >>>>>> _______________________________________________=0A= >>>>>> freebsd-net@freebsd.org mailing list=0A= >>>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= >>>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.or= g"=0A= >>>>>> _______________________________________________=0A= >>>>>> freebsd-net@freebsd.org mailing list=0A= >>>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= >>>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.or= g"=0A= >>>>>=0A= >>>>> _______________________________________________=0A= >>>>> freebsd-net@freebsd.org mailing list=0A= >>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= >>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org= "=0A= >>>>> _______________________________________________=0A= >>>>> freebsd-net@freebsd.org mailing list=0A= >>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= >>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org= "=0A= >>>>=0A= >>>=0A= >>=0A= >> _______________________________________________=0A= >> freebsd-net@freebsd.org mailing list=0A= >> https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"= =0A= >=0A= > _______________________________________________=0A= > freebsd-net@freebsd.org mailing list=0A= > https://lists.freebsd.org/mailman/listinfo/freebsd-net=0A= > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"=0A= =0A= From owner-freebsd-net@freebsd.org Sat Apr 10 15:19:46 2021 Return-Path: Delivered-To: freebsd-net@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 37A015CF34D for ; Sat, 10 Apr 2021 15:19:46 +0000 (UTC) (envelope-from tuexen@freebsd.org) Received: from drew.franken.de (drew.ipv6.franken.de [IPv6:2001:638:a02:a001:20e:cff:fe4a:feaa]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "*.franken.de", Issuer "Sectigo RSA Domain Validation Secure Server CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 
> On 10. Apr 2021, at 17:04, Rick Macklem wrote:
>
> tuexen@freebsd.org wrote:
>>> On 10. Apr 2021, at 02:44, Rick Macklem wrote:
>>>
>>> tuexen@freebsd.org wrote:
>>>>> On 6. Apr 2021, at 01:24, Rick Macklem wrote:
>>>>>
>>>>> tuexen@freebsd.org wrote:
>>>>> [stuff snipped]
>>>>>> OK. What is the FreeBSD version you are using?
>>>>> main Dec. 23, 2020.
>>>>>
>>>>>>
>>>>>> It seems that the TCP connection on the FreeBSD side is still alive;
>>>>>> Linux has decided to start a new TCP connection using the old
>>>>>> port numbers. So it sends a SYN. The response is a challenge ACK
>>>>>> and Linux responds with a RST. This looks good so far. However,
>>>>>> FreeBSD should accept the RST and kill the TCP connection. The
>>>>>> next SYN from the Linux side would establish a new TCP connection.
>>>>>>
>>>>>> So I'm wondering why the RST is not accepted. I made the timestamp
>>>>>> checking stricter but introduced a bug where RST segments without
>>>>>> timestamps were ignored. This was fixed.
>>>>>>
>>>>>> Introduced in main on 2020/11/09:
>>>>>> https://svnweb.freebsd.org/changeset/base/367530
>>>>>> Introduced in stable/12 on 2020/11/30:
>>>>>> https://svnweb.freebsd.org/changeset/base/36818
>>>>>> Fix in main on 2021/01/13:
>>>>>> https://cgit.FreeBSD.org/src/commit/?id=cc3c34859eab1b317d0f38731355b53f7d978c97
>>>>>> Fix in stable/12 on 2021/01/24:
>>>>>> https://cgit.FreeBSD.org/src/commit/?id=d05d908d6d3c85479c84c707f931148439ae826b
>>>>>>
>>>>>> Are you using a version which is affected by this bug?
>>>>> I was. Now I've applied the patch.
>>>>> Bad news: it did not fix the problem.
>>>>> It still gets into an endless "ignore RST" state and stays ESTABLISHED when
>>>>> the Send-Q is empty.
>>>> OK. Let us focus on this case.
>>>>
>>>> Could you:
>>>> 1. sudo sysctl net.inet.tcp.log_debug=1
>>>> 2. repeat the situation where RSTs are ignored.
>>>> 3. check if there is some output on the console (/var/log/messages).
>>>> 4. Either provide the output or let me know that there is none.
>>> Well, I have some good news and some bad news (the bad is mostly for Richard).
>>> The only message logged is:
>>> tcpflags 0x4; tcp_do_segment: Timestamp missing, segment processed normally
>>>
>>> But...the RST battle no longer occurs. Just one RST that works and then
>>> the SYN gets SYN,ACK'd by the FreeBSD end and off it goes...
>> The above is what I would expect if you integrated cc3c34859eab1b317d0f38731355b53f7d978c97
>> or reverted r367530. Did you do that?
> r367530 is in the kernel that does not cause the "RST battle".
>
>>
>>
>> So, what is different?
>>
>> r367492 is reverted from the FreeBSD server.
> Only that? So you still have the bug I introduced in the tree, but the RST segment is accepted?
> No. The kernel being tested includes the fix (you committed mid-January) for the bug
> that went in in November.
> However, adding the mid-January patch did not fix the problem.
OK. I was focussing on the behaviour that FreeBSD does ignore the received RST. That is fixed. Good. I understand that this does not solve your issue.
> Then reverting r367492 (and only r367492) made the problem go away.
>
> I did not expect reverting r367492 to affect this.
> I reverted r367492 because otis@ gets Linux client mounts "stuck" against a FreeBSD 13
> NFS server, where the Recv-Q size grows and the client gets no RPC replies. Other
> clients are still working fine. I can only think of one explanation for this:
> - An upcall gets missed or occurs at the wrong time.
My understanding of the patch is that it "delays" the upcall to the end of the packet processing. So the amount of time is at most a packet processing time, which is short in my view.

Richard: Correct me if I'm wrong.

Best regards
Michael
> --> Since what this patch does is move where the upcalls are done, it is the logical
>     culprit.
>     Hopefully otis@ will be able to determine if reverting r367492 fixes the problem.
>     This will take weeks, since the problem recently took two weeks to recur.
> --> This would be the receive path, so reverting the send path would not be
>     relevant.
> *** I'd like to hear from otis@ before testing a "send path only" revert.
> --> Also, it has been a long time since I worked on the socket upcall code, but I
>     vaguely remember that the upcalls needed to be done before SOCKBUF_LOCK()
>     is dropped to ensure that the socket buffer is in the expected state.
>     r367492 drops SOCKBUF_LOCK() and then picks it up again for the upcalls.
>
> I'll send you guys the otis@ problem email. (I don't think that one is cc'd to a list.)
>
> rick
>
> Best regards
> Michael
>> I did the revert because I think it might be what the otis@ hang is being
>> caused by. (In his case, the Recv-Q grows on the socket for the
>> stuck Linux client, while others work.)
>>
>> Why does reverting fix this?
>> My only guess is that the krpc gets the upcall right away and sees
>> an EPIPE when it does soreceive(), which results in soshutdown(SHUT_WR).
>> I know from a printf that this happened, but whether it caused the
>> RST battle to not happen, I don't know.
>>
>> I can put r367492 back in and do more testing if you'd like, but
>> I think it probably needs to be reverted?
>>
>> This does not explain the original hung Linux client problem,
>> but does shed light on the RST war I could create by doing a
>> network partitioning.
>>
>> rick
>>
>> Best regards
>> Michael
>>>
>>> If the Send-Q is non-empty when I partition, it recovers fine,
>>> sometimes not even needing to see an RST.
>>>
>>> rick
>>> ps: If you think there might be other recent changes that matter,
>>>     just say the word and I'll upgrade to bits du jour.
>>>
>>> rick
>>>
>>> Best regards
>>> Michael
>>>>
>>>> If I wait long enough before healing the partition, it will
>>>> go to FIN_WAIT_1, and then if I plug it back in, it does not
>>>> do battle (at least not for long).
>>>>
>>>> Btw, I have one running now that seems stuck really good.
>>>> It has been 20 minutes since I plugged the net cable back in.
>>>> (Unfortunately, I didn't have tcpdump running until after
>>>> I saw it was not progressing after healing.)
>>>> --> There is one difference. There was a 6-minute timeout
>>>>     enabled on the server krpc for "no activity", which is
>>>>     now disabled like it is for NFSv4.1 in freebsd-current.
>>>>     I had forgotten to re-disable it.
>>>>     So, when it does battle, it might have been the 6-minute
>>>>     timeout, which would then do the soshutdown(..SHUT_WR)
>>>>     which kept it from getting "stuck" forever.
>>>> --> This time I had to reboot the FreeBSD NFS server to
>>>>     get the Linux client unstuck, so this one looked a lot
>>>>     like what has been reported.
>>>> The pcap for this one, started after the network was plugged
>>>> back in and I noticed it was stuck for quite a while, is here:
>>>> fetch https://people.freebsd.org/~rmacklem/stuck.pcap
>>>>
>>>> In it, there is just a bunch of RSTs followed by SYNs sent
>>>> from client->FreeBSD, and FreeBSD just keeps sending
>>>> acks for the old segment back.
>>>> --> It looks like FreeBSD did the "RST, ACK" after the
>>>>     krpc did a soshutdown(..SHUT_WR) on the socket,
>>>>     for the one you've been looking at.
>>>> I'll test some more...
>>>>
>>>>> I would like to understand why the reestablishment of the connection
>>>>> did not work...
>>>> It is looking like it takes either a non-empty send-q or a
>>>> soshutdown(..SHUT_WR) to get the FreeBSD socket
>>>> out of ESTABLISHED, where it just ignores the RSTs and
>>>> SYN packets.
>>>>
>>>> Thanks for looking at it, rick
>>>>
>>>> Best regards
>>>> Michael
>>>>>
>>>>> Have fun with it, rick
>>>>>
>>>>> ________________________________________
>>>>> From: tuexen@freebsd.org
>>>>> Sent: Sunday, April 4, 2021 12:41 PM
>>>>> To: Rick Macklem
>>>>> Cc: Scheffenegger, Richard; Youssef GHORBAL; freebsd-net@freebsd.org
>>>>> Subject: Re: NFS Mount Hangs
>>>>>
>>>>>> On 4. Apr 2021, at 17:27, Rick Macklem wrote:
>>>>>>
>>>>>> Well, I'm going to cheat and top post, since this is related info and
>>>>>> not really part of the discussion...
>>>>>>
>>>>>> I've been testing network partitioning between a Linux client (5.2 kernel)
>>>>>> and a FreeBSD-current NFS server. I have not gotten a solid hang, but
>>>>>> I have had the Linux client doing "battle" with the FreeBSD server for
>>>>>> several minutes after un-partitioning the connection.
>>>>>>
>>>>>> The battle basically consists of the Linux client sending an RST, followed
>>>>>> by a SYN.
>>>>>> The FreeBSD server ignores the RST and just replies with the same old ack.
>>>>>> --> This varies from "just a SYN" that succeeds to 100+ cycles of the above
>>>>>>     over several minutes.
>>>>>>
>>>>>> I had thought that an RST was a "pretty heavy hammer", but FreeBSD seems
>>>>>> pretty good at ignoring it.
>>>>>>
>>>>>> A full packet capture of one of these is in /home/rmacklem/linuxtofreenfs.pcap
>>>>>> in case anyone wants to look at it.
>>>>> On freefall? I would like to take a look at it...
>>>>>
>>>>> Best regards
>>>>> Michael
>>>>>>
>>>>>> Here's a tcpdump snippet of the interesting part (see the *** comments):
>>>>>> 19:10:09.305775 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 202585:202749, ack 212293, win 29128, options [nop,nop,TS val 2073636037 ecr 2671204825], length 164: NFS reply xid 613153685 reply ok 160 getattr NON 4 ids 0/33554432 sz 0
>>>>>> 19:10:09.305850 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 202749, win 501, options [nop,nop,TS val 2671204825 ecr 2073636037], length 0
>>>>>> *** Network is now partitioned...
>>>>>>
>>>>>> 19:10:09.407840 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671204927 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>>>> 19:10:09.615779 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205135 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>>>> 19:10:09.823780 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 212293:212525, ack 202749, win 501, options [nop,nop,TS val 2671205343 ecr 2073636037], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>>>> *** Lots of lines snipped.
>>>>>>
>>>>>> 19:13:41.295783 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>>>> 19:13:42.319767 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>>>> 19:13:46.351966 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>>>> 19:13:47.375790 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>>>> 19:13:48.399786 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>>>> *** Network is now unpartitioned...
>>>>>>
>>>>>> 19:13:48.399990 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
>>>>>> 19:13:48.400002 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671421871 ecr 0,nop,wscale 7], length 0
>>>>>> 19:13:48.400185 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073855137 ecr 2671204825], length 0
>>>>>> 19:13:48.400273 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
>>>>>> 19:13:49.423833 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671424943 ecr 0,nop,wscale 7], length 0
>>>>>> 19:13:49.424056 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073856161 ecr 2671204825], length 0
>>>>>> *** This "battle" goes on for 223sec...
>>>>>>     I snipped out 13 cycles of this "Linux sends an RST, followed by SYN",
>>>>>>     "FreeBSD replies with same old ACK". In another test run I saw this
>>>>>>     cycle continue non-stop for several minutes. This time, the Linux
>>>>>>     client paused for a while (see ARPs below).
>>>>>>
>>>>>> 19:13:49.424101 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [R], seq 964161458, win 0, length 0
>>>>>> 19:13:53.455867 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 416692300, win 64240, options [mss 1460,sackOK,TS val 2671428975 ecr 0,nop,wscale 7], length 0
>>>>>> 19:13:53.455991 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 212293, win 29127, options [nop,nop,TS val 2073860193 ecr 2671204825], length 0
>>>>>> *** Snipped a bunch of stuff out, mostly ARPs, plus one more RST.
>>>>>>
>>>>>> 19:16:57.775780 ARP, Request who-has nfsv4-new3.home.rick tell nfsv4-linux.home.rick, length 28
>>>>>> 19:16:57.775937 ARP, Reply nfsv4-new3.home.rick is-at d4:be:d9:07:81:72 (oui Unknown), length 46
>>>>>> 19:16:57.980240 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
>>>>>> 19:16:58.555663 ARP, Request who-has nfsv4-new3.home.rick tell 192.168.1.254, length 46
>>>>>> 19:17:00.104701 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074046846 ecr 2671204825], length 0
>>>>>> 19:17:15.664354 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [F.], seq 202749, ack 212293, win 29128, options [nop,nop,TS val 2074062406 ecr 2671204825], length 0
>>>>>> 19:17:31.239246 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [R.], seq 202750, ack 212293, win 0, options [nop,nop,TS val 2074077981 ecr 2671204825], length 0
>>>>>> *** FreeBSD finally acknowledges the RST 38sec after Linux sent the last
>>>>>>     of 13 (100+ for another test run).
>>>>>>
>>>>>> 19:17:51.535979 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [S], seq 4247692373, win 64240, options [mss 1460,sackOK,TS val 2671667055 ecr 0,nop,wscale 7], length 0
>>>>>> 19:17:51.536130 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [S.], seq 661237469, ack 4247692374, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2074098278 ecr 2671667055], length 0
>>>>>> *** Now back in business...
>>>>>>
>>>>>> 19:17:51.536218 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [.], ack 1, win 502, options [nop,nop,TS val 2671667055 ecr 2074098278], length 0
>>>>>> 19:17:51.536295 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 1:233, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 232: NFS request xid 629930901 228 getattr fh 0,1/53
>>>>>> 19:17:51.536346 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 233:505, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098278], length 272: NFS request xid 697039765 132 getattr fh 0,1/53
>>>>>> 19:17:51.536515 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [.], ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 0
>>>>>> 19:17:51.536553 IP nfsv4-linux.home.rick.apex-mesh > nfsv4-new3.home.rick.nfsd: Flags [P.], seq 505:641, ack 1, win 502, options [nop,nop,TS val 2671667056 ecr 2074098279], length 136: NFS request xid 730594197 132 getattr fh 0,1/53
>>>>>> 19:17:51.536562 IP nfsv4-new3.home.rick.nfsd > nfsv4-linux.home.rick.apex-mesh: Flags [P.], seq 1:49, ack 505, win 29128, options [nop,nop,TS val 2074098279 ecr 2671667056], length 48: NFS reply xid 697039765 reply ok 44 getattr ERROR: unk 10063
>>>>>>
>>>>>> This error 10063 after the partition heals is also "bad news". It indicates the Session
>>>>>> (which is supposed to maintain "exactly once" RPC semantics) is broken. I'll admit I
>>>>>> suspect a Linux client bug, but will be investigating further.
>>>>>>
>>>>>> So, hopefully TCP conversant folk can confirm if the above is correct behaviour
>>>>>> or if the RST should be ack'd sooner?
>>>>>>
>>>>>> I could also see this becoming a "forever" TCP battle for other versions of Linux client.
>>>>>>
>>>>>> rick
>>>>>>
>>>>>> ________________________________________
>>>>>> From: Scheffenegger, Richard
>>>>>> Sent: Sunday, April 4, 2021 7:50 AM
>>>>>> To: Rick Macklem; tuexen@freebsd.org
>>>>>> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
>>>>>> Subject: Re: NFS Mount Hangs
>>>>>>
>>>>>> For what it's worth, SUSE found two bugs in the Linux nf_conntrack (stateful firewall) and the pfifo-fast scheduler, which could conspire to make TCP sessions hang forever.
>>>>>>
>>>>>> One is a missed update when the client is not using the noresvport mount option, which makes the firewall think RSTs are illegal (and drop them);
>>>>>>
>>>>>> The fast scheduler can run into an issue if only a single packet should be forwarded (note that this is not the default scheduler, but often recommended for performance, as it runs lockless and at lower CPU cost than pfq, the default).
>>>>>> If no other/additional packet pushes out that last packet of a flow, it can become stuck forever...
>>>>>>
>>>>>> I can try getting the relevant bug info next week...
>>>>>>
>>>>>> ________________________________
>>>>>> From: owner-freebsd-net@freebsd.org on behalf of Rick Macklem
>>>>>> Sent: Friday, April 2, 2021 11:31:01 PM
>>>>>> To: tuexen@freebsd.org
>>>>>> Cc: Youssef GHORBAL; freebsd-net@freebsd.org
>>>>>> Subject: Re: NFS Mount Hangs
>>>>>>
>>>>>> tuexen@freebsd.org wrote:
>>>>>>>> On 2. Apr 2021, at 02:07, Rick Macklem wrote:
>>>>>>>>
>>>>>>>> I hope you don't mind a top post...
>>>>>>>> I've been testing network partitioning between the only Linux client
>>>>>>>> I have (5.2 kernel) and a FreeBSD server with the xprtdied.patch
>>>>>>>> (does soshutdown(..SHUT_WR) when it knows the socket is broken)
>>>>>>>> applied to it.
>>>>>>>>
>>>>>>>> I'm not enough of a TCP guy to know if this is useful, but here's what
>>>>>>>> I see...
>>>>>>>>
>>>>>>>> While partitioned:
>>>>>>>> On the FreeBSD server end, the socket either goes to CLOSED during
>>>>>>>> the network partition or stays ESTABLISHED.
>>>>>>> If it goes to CLOSED you called shutdown(, SHUT_WR) and the peer also
>>>>>>> sent a FIN, but you never called close() on the socket.
>>>>>>> If the socket stays in ESTABLISHED, there is no communication ongoing,
>>>>>>> I guess, and therefore the server does not even detect that the peer
>>>>>>> is not reachable.
>>>>>>>> On the Linux end, the socket seems to remain ESTABLISHED for a
>>>>>>>> little while, and then disappears.
>>>>>>> So how does Linux detect that the peer is not reachable?
>>>>>> Well, here's what I see in a packet capture in the Linux client once
>>>>>> I partition it (just unplug the net cable):
>>>>>> - lots of retransmits of the same segment (with ACK) for 54sec
>>>>>> - then only ARP queries
>>>>>>
>>>>>> Once I plug the net cable back in:
>>>>>> - ARP works
>>>>>> - one more retransmit of the same segment
>>>>>> - receives RST from FreeBSD
>>>>>> ** So, is this now a "new" TCP connection, despite
>>>>>>    using the same port#?
>>>>>>    --> It matters for NFS, since "new connection"
>>>>>>        implies "must retry all outstanding RPCs".
>>>>>> - sends SYN
>>>>>> - receives SYN, ACK from FreeBSD
>>>>>> --> connection starts working again
>>>>>>     Always uses same port#.
>>>>>>
>>>>>> On the FreeBSD server end:
>>>>>> - receives the last retransmit of the segment (with ACK)
>>>>>> - sends RST
>>>>>> - receives SYN
>>>>>> - sends SYN, ACK
>>>>>>
>>>>>> I thought that there was no RST in the capture I looked at
>>>>>> yesterday, so I'm not sure if FreeBSD always sends an RST,
>>>>>> but the Linux client behaviour was the same. (Sent a SYN, etc.)
>>>>>> The socket disappears from the Linux "netstat -a" and I
>>>>>> suspect that happens after about 54sec, but I am not sure
>>>>>> about the timing.
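That 54-second figure could be pinned down with a small logging loop on the Linux client, in the same spirit as the logging script later in this thread. A minimal sketch, assuming the mount uses the standard NFS port 2049 and that the log path is an arbitrary choice:

#!/bin/sh
# Sketch: timestamp the client end of the NFS TCP connection once per
# second, so the moment it changes state or disappears is logged.
while true
do
        /bin/date >> /tmp/nfs-sock-state.log
        netstat -an | grep ':2049' >> /tmp/nfs-sock-state.log
        /bin/sleep 1
done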
>>>>>>=20 >>>>>>>>=20 >>>>>>>> After unpartitioning: >>>>>>>> On the FreeBSD server end, you get another socket showing up at >>>>>>>> the same port# >>>>>>>> Active Internet connections (including servers) >>>>>>>> Proto Recv-Q Send-Q Local Address Foreign Address = (state) >>>>>>>> tcp4 0 0 nfsv4-new3.nfsd nfsv4-linux.678 = ESTABLISHED >>>>>>>> tcp4 0 0 nfsv4-new3.nfsd nfsv4-linux.678 = CLOSED >>>>>>>>=20 >>>>>>>> The Linux client shows the same connection ESTABLISHED. >>>>>> But disappears from "netstat -a" for a while during the = partitioning. >>>>>>=20 >>>>>>>> (The mount sometimes reports an error. I haven't looked at = packet >>>>>>>> traces to see if it retries RPCs or why the errors occur.) >>>>>> I have now done so, as above. >>>>>>=20 >>>>>>>> --> However I never get hangs. >>>>>>>> Sometimes it goes to SYN_SENT for a while and the FreeBSD = server >>>>>>>> shows FIN_WAIT_1, but then both ends go to ESTABLISHED and the >>>>>>>> mount starts working again. >>>>>>>>=20 >>>>>>>> The most obvious thing is that the Linux client always keeps = using >>>>>>>> the same port#. (The FreeBSD client will use a different port# = when >>>>>>>> it does a TCP reconnect after no response from the NFS server = for >>>>>>>> a little while.) >>>>>>>>=20 >>>>>>>> What do those TCP conversant think? >>>>>>> I guess you are you are never calling close() on the socket, for = with >>>>>>> the connection state is CLOSED. >>>>>> Ok, that makes sense. For this case the Linux client has not done = a >>>>>> BindConnectionToSession to re-assign the back channel. >>>>>> I'll have to bug them about this. However, I'll bet they'll = answer >>>>>> that I have to tell them the back channel needs re-assignment >>>>>> or something like that. >>>>>>=20 >>>>>> I am pretty certain they are broken, in that the client needs to >>>>>> retry all outstanding RPCs. >>>>>>=20 >>>>>> For others, here's the long winded version of this that I just >>>>>> put on the phabricator review: >>>>>> In the server side kernel RPC, the socket (struct socket *) is in = a >>>>>> structure called SVCXPRT (normally pointed to by "xprt"). >>>>>> These structures a ref counted and the soclose() is done >>>>>> when the ref. cnt goes to zero. My understanding is that >>>>>> "struct socket *" is free'd by soclose() so this cannot be done >>>>>> before the xprt ref. cnt goes to zero. >>>>>>=20 >>>>>> For NFSv4.1/4.2 there is something called a back channel >>>>>> which means that a "xprt" is used for server->client RPCs, >>>>>> although the TCP connection is established by the client >>>>>> to the server. >>>>>> --> This back channel holds a ref cnt on "xprt" until the >>>>>>=20 >>>>>> client re-assigns it to a different TCP connection >>>>>> via an operation called BindConnectionToSession >>>>>> and the Linux client is not doing this soon enough, >>>>>> it appears. >>>>>>=20 >>>>>> So, the soclose() is delayed, which is why I think the >>>>>> TCP connection gets stuck in CLOSE_WAIT and that is >>>>>> why I've added the soshutdown(..SHUT_WR) calls, >>>>>> which can happen before the client gets around to >>>>>> re-assigning the back channel. >>>>>>=20 >>>>>> Thanks for your help with this Michael, rick >>>>>>=20 >>>>>> Best regards >>>>>> Michael >>>>>>>=20 >>>>>>> rick >>>>>>> ps: I can capture packets while doing this, if anyone has a use >>>>>>> for them. 
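The delayed-soclose() situation described above can be watched for from the server side by combining the two checks mentioned later in this thread (nfsstat -E -s for the BindConnToSes count, netstat for lingering nfsd sockets). A minimal sketch only; the log path and interval are arbitrary choices:

#!/bin/sh
# Sketch: log NFS server sockets stuck in CLOSE_WAIT (candidates for the
# delayed soclose() above) together with the BindConnToSes counter.
while true
do
        /bin/date >> /tmp/nfs-closewait.log
        # nfsd connections not yet closed by the krpc.
        netstat -an | grep '\.2049 ' | grep CLOSE_WAIT >> /tmp/nfs-closewait.log
        # A non-zero count means back channels have been re-assigned.
        nfsstat -E -s | grep -A 1 BindConnToSes >> /tmp/nfs-closewait.log
        /bin/sleep 60
done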
>>>>>>>
>>>>>>> ________________________________________
>>>>>>> From: owner-freebsd-net@freebsd.org on behalf of Youssef GHORBAL
>>>>>>> Sent: Saturday, March 27, 2021 6:57 PM
>>>>>>> To: Jason Breitman
>>>>>>> Cc: Rick Macklem; freebsd-net@freebsd.org
>>>>>>> Subject: Re: NFS Mount Hangs
>>>>>>>
>>>>>>> On 27 Mar 2021, at 13:20, Jason Breitman wrote:
>>>>>>>
>>>>>>> The issue happened again, so we can say that disabling TSO and LRO on the NIC did not resolve this issue.
>>>>>>> # ifconfig lagg0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso
>>>>>>> # ifconfig lagg0
>>>>>>> lagg0: flags=8943 metric 0 mtu 1500
>>>>>>>     options=8100b8
>>>>>>>
>>>>>>> We can also say that the sysctl settings did not resolve this issue.
>>>>>>>
>>>>>>> # sysctl net.inet.tcp.fast_finwait2_recycle=1
>>>>>>> net.inet.tcp.fast_finwait2_recycle: 0 -> 1
>>>>>>>
>>>>>>> # sysctl net.inet.tcp.finwait2_timeout=1000
>>>>>>> net.inet.tcp.finwait2_timeout: 60000 -> 1000
>>>>>>>
>>>>>>> I don't think those will do anything in your case since the FIN_WAIT2 sockets are on the client side and those sysctls are for BSD.
>>>>>>> By the way, it seems that Linux recycles TCP sessions in FIN_WAIT2 automatically after 60 seconds (sysctl net.ipv4.tcp_fin_timeout):
>>>>>>>
>>>>>>> tcp_fin_timeout (integer; default: 60; since Linux 2.2)
>>>>>>>     This specifies how many seconds to wait for a final FIN
>>>>>>>     packet before the socket is forcibly closed. This is
>>>>>>>     strictly a violation of the TCP specification, but
>>>>>>>     required to prevent denial-of-service attacks. In Linux
>>>>>>>     2.2, the default value was 180.
>>>>>>>
>>>>>>> So I don't get why it gets stuck in the FIN_WAIT2 state anyway.
>>>>>>>
>>>>>>> You really need to have a packet capture during the outage (client and server side), so you'll get the over-the-wire chat and can start speculating from there.
>>>>>>> No need to capture the beginning of the outage for now. All you have to do is run a tcpdump for 10 minutes or so when you notice a client stuck.
>>>>>>>
>>>>>>> * I have not rebooted the NFS Server nor have I restarted nfsd, but do not believe that is required, as these settings are at the TCP level and I would expect new sessions to use the updated settings.
>>>>>>>
>>>>>>> The issue occurred after 5 days following a reboot of the client machines.
>>>>>>> I ran the capture information again to make use of the situation.
>>>>>>>
>>>>>>> #!/bin/sh
>>>>>>> # Log nfsd state once a minute while the hang is in progress.
>>>>>>> # (2947 and 2944 are the nfsd process ids on this server.)
>>>>>>> while true
>>>>>>> do
>>>>>>>     /bin/date >> /tmp/nfs-hang.log
>>>>>>>     /bin/ps axHl | grep nfsd | grep -v grep >> /tmp/nfs-hang.log
>>>>>>>     /usr/bin/procstat -kk 2947 >> /tmp/nfs-hang.log
>>>>>>>     /usr/bin/procstat -kk 2944 >> /tmp/nfs-hang.log
>>>>>>>     /bin/sleep 60
>>>>>>> done
>>>>>>>
>>>>>>> On the NFS Server
>>>>>>> Active Internet connections (including servers)
>>>>>>> Proto Recv-Q Send-Q Local Address          Foreign Address        (state)
>>>>>>> tcp4       0      0 NFS.Server.IP.X.2049   NFS.Client.IP.X.48286  CLOSE_WAIT
>>>>>>>
>>>>>>> On the NFS Client
>>>>>>> tcp        0      0 NFS.Client.IP.X:48286  NFS.Server.IP.X:2049   FIN_WAIT2
>>>>>>>
>>>>>>> You had also asked for the output below.
>>>>>>>
>>>>>>> # nfsstat -E -s
>>>>>>> BackChannelCt  BindConnToSes
>>>>>>>             0              0
>>>>>>>
>>>>>>> # sysctl vfs.nfsd.request_space_throttle_count
>>>>>>> vfs.nfsd.request_space_throttle_count: 0
>>>>>>>
>>>>>>> I see that you are testing a patch and I look forward to seeing the results.
>>>>>>>
>>>>>>> Jason Breitman
>>>>>>>
>>>>>>> On Mar 21, 2021, at 6:21 PM, Rick Macklem wrote:
>>>>>>>
>>>>>>> Youssef GHORBAL wrote:
>>>>>>>> Hi Jason,
>>>>>>>>
>>>>>>>>> On 17 Mar 2021, at 18:17, Jason Breitman wrote:
>>>>>>>>>
>>>>>>>>> Please review the details below and let me know if there is a setting that I should apply to my FreeBSD NFS Server or if there is a bug fix that I can apply to resolve my issue.
>>>>>>>>> I shared this information with the linux-nfs mailing list and they believe the issue is on the server side.
>>>>>>>>>
>>>>>>>>> Issue
>>>>>>>>> NFSv4 mounts periodically hang on the NFS Client.
>>>>>>>>>
>>>>>>>>> During this time, it is possible to manually mount from another NFS Server on the NFS Client having issues.
>>>>>>>>> Also, other NFS Clients are successfully mounting from the NFS Server in question.
>>>>>>>>> Rebooting the NFS Client appears to be the only solution.
>>>>>>>>
>>>>>>>> I had experienced a similar weird situation with periodically stuck Linux NFS clients mounting Isilon NFS servers (Isilon is FreeBSD based, but they seem to have their own nfsd).
>>>>>>> Yes, my understanding is that Isilon uses a proprietary user space nfsd and
>>>>>>> not the kernel based RPC and nfsd in FreeBSD.
>>>>>>>
>>>>>>>> We've had better luck and we did manage to get packet captures on both sides during the issue. The gist of it goes as follows:
>>>>>>>>
>>>>>>>> - Data flows correctly between SERVER and the CLIENT
>>>>>>>> - At some point SERVER starts decreasing its TCP Receive Window until it reaches 0
>>>>>>>> - The client (eager to send data) can only ack data sent by SERVER.
>>>>>>>> - When SERVER was done sending data, the client starts sending TCP Window Probes hoping that the TCP Window opens again so it can flush its buffers.
>>>>>>>> - SERVER responds with a TCP Zero Window to those probes.
>>>>>>> Having the window size drop to zero is not necessarily incorrect.
>>>>>>> If the server is overloaded (has a backlog of NFS requests), it can stop doing
>>>>>>> soreceive() on the socket (so the socket rcv buffer can fill up and the TCP window
>>>>>>> closes). This results in "backpressure" to stop the NFS client from flooding the
>>>>>>> NFS server with requests.
>>>>>>> --> However, once the backlog is handled, the nfsd should start to soreceive()
>>>>>>>     again and this should cause the window to open back up.
>>>>>>> --> Maybe this is broken in the socket/TCP code. I quickly got lost in
>>>>>>>     tcp_output() when it decides what to do about the rcvwin.
>>>>>>>
>>>>>>>> - After 6 minutes (the NFS server default Idle timeout) SERVER gracefully closes the TCP connection, sending a FIN Packet (and still a TCP Window of 0)
>>>>>>> This probably does not happen for Jason's case, since the 6-minute timeout
>>>>>>> is disabled when the TCP connection is assigned as a backchannel (most likely
>>>>>>> the case for NFSv4.1).
>>>>>>>
>>>>>>>> - CLIENT ACKs that FIN.
>>>>>>>> - SERVER goes into FIN_WAIT_2 state.
>>>>>>>> - CLIENT closes its half of the socket and goes into LAST_ACK state.
>>>>>>>> - A FIN is never sent by the client, since there is still data in its SendQ and the receiver's TCP Window is still 0. At this stage the client starts sending TCP Window Probes again and again, hoping that the server opens its TCP Window so it can flush its buffers and terminate its side of the socket.
>>>>>>>> - SERVER keeps responding with a TCP Zero Window to those probes.
>>>>>>>> => The last two steps go on and on for hours/days, freezing the NFS mount bound to that TCP session.
>>>>>>>>
>>>>>>>> If we had a situation where CLIENT was responsible for closing the TCP Window (and initiating the TCP FIN first) and the server wanting to send data, we'd end up in the same state as you, I think.
>>>>>>>>
>>>>>>>> We never found the root cause of why the SERVER decided to close the TCP Window and no longer accept data; the fix on the Isilon side was to recycle the FIN_WAIT_2 sockets more aggressively (net.inet.tcp.fast_finwait2_recycle=1 & net.inet.tcp.finwait2_timeout=5000). Once the socket is recycled, at the next occurrence of a CLIENT TCP Window probe, SERVER sends a RST, triggering the teardown of the session on the client side, a new TCP handshake, etc., and traffic flows again (NFS starts responding).
>>>>>>>>
>>>>>>>> To avoid rebooting the client (and before the aggressive FIN_WAIT_2 recycling was implemented on the Isilon side) we added a check script on the client that detects LAST_ACK sockets on the client and, through an iptables rule, enforces a TCP RST. Something like: -A OUTPUT -p tcp -d $nfs_server_addr --sport $local_port -j REJECT --reject-with tcp-reset (the script removes this iptables rule as soon as the LAST_ACK socket disappears); a sketch of such a script follows at the end of this message.
>>>>>>>>
>>>>>>>> The bottom line would be to have a packet capture during the outage (client and/or server side); it will show you at least the shape of the TCP exchange when NFS is stuck.
>>>>>>> Interesting story and good work w.r.t. sleuthing, Youssef, thanks.
>>>>>>>
>>>>>>> I looked at Jason's log and it shows everything is ok w.r.t. the nfsd threads.
>>>>>>> (They're just waiting for RPC requests.)
>>>>>>> However, I do now think I know why the soclose() does not happen.
>>>>>>> When the TCP connection is assigned as a backchannel, that takes a reference
>>>>>>> cnt on the structure. This refcnt won't be released until the connection is
>>>>>>> replaced by a BindConnectionToSession operation from the client. But that won't
>>>>>>> happen until the client creates a new TCP connection.
>>>>>>> --> No refcnt release --> no refcnt of 0 --> no soclose().
>>>>>>>
>>>>>>> I've created the attached patch (completely different from the previous one)
>>>>>>> that adds soshutdown(SHUT_WR) calls in the three places where the TCP
>>>>>>> connection is going away.
>>>>>>> This seems to get it past CLOSE_WAIT without a soclose().
>>>>>>> --> I know you are not comfortable with patching your server, but I do think
>>>>>>>     this change will get the socket shutdown to complete.
>>>>>>>
>>>>>>> There are a couple more things you can check on the server...
>>>>>>> # nfsstat -E -s
>>>>>>> --> Look for the count under "BindConnToSes".
>>>>>>> --> If non-zero, backchannels have been assigned.
>>>>>>> # sysctl -a | fgrep request_space_throttle_count
>>>>>>> --> If non-zero, the server has been overloaded at some point.
>>>>>>>
>>>>>>> I think the attached patch might work around the problem.
>>>>>>> The code that should open up the receive window needs to be checked.
>>>>>>> I am also looking at enabling the 6-minute timeout when a backchannel is
>>>>>>> assigned.
>>>>>>>
>>>>>>> rick
>>>>>>>
>>>>>>> Youssef
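Youssef's LAST_ACK workaround script itself is not part of this thread; a rough sketch of the idea, assuming the server's address in $nfs_server_addr and Linux netstat/iptables on the client, might look like this:

#!/bin/sh
# Sketch of the client-side LAST_ACK workaround described above.
# $nfs_server_addr is an assumed placeholder for the NFS server's IP.
nfs_server_addr=192.0.2.1
while true
do
        # Find a local port stuck in LAST_ACK towards the server.
        local_port=$(netstat -ant | awk -v srv="$nfs_server_addr" \
            '$6 == "LAST_ACK" && index($5, srv) { split($4, a, ":"); print a[2]; exit }')
        if [ -n "$local_port" ]; then
                # Force a TCP RST for that connection...
                iptables -A OUTPUT -p tcp -d "$nfs_server_addr" \
                    --sport "$local_port" -j REJECT --reject-with tcp-reset
                sleep 10
                # ...and remove the rule once the socket is gone.
                iptables -D OUTPUT -p tcp -d "$nfs_server_addr" \
                    --sport "$local_port" -j REJECT --reject-with tcp-reset
        fi
        sleep 60
done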
From owner-freebsd-net@freebsd.org Sat Apr 10 15:56:38 2021
Subject: Re: NFS Mount Hangs
From: Rick Macklem <rmacklem@uoguelph.ca>
To: "Scheffenegger, Richard", "tuexen@freebsd.org"
Cc: Youssef GHORBAL, "freebsd-net@freebsd.org"
Date: Sat, 10 Apr 2021 15:56:33 +0000

Scheffenegger, Richard wrote:
>> Rick wrote:
>> Hi Rick,
>>
>>> Well, I have some good news and some bad news (the bad is mostly for Richard).
>>>
>>> The only message logged is:
>>> tcpflags 0x4; tcp_do_segment: Timestamp missing, segment processed normally
>>>
Btw, I did get one additional message during further testing (with r367492 reverted):
    tcpflags 0x4; syncache_chkrst: Our SYN|ACK was rejected, connection attempt aborted
    by remote endpoint

This only happened once in several test cycles.

>>> But...the RST battle no longer occurs. Just one RST that works and then the SYN gets SYN,ACK'd by the FreeBSD end and off it goes...
>>>
>>> So, what is different?
>>>
>>> r367492 is reverted from the FreeBSD server.
>>> I did the revert because I think it might be what the otis@ hang is being caused by. (In his case, the Recv-Q grows on the socket for the stuck Linux client, while others work.)
>>>
>>> Why does reverting fix this?
>>> My only guess is that the krpc gets the upcall right away and sees an EPIPE when it does soreceive(), which results in soshutdown(SHUT_WR).
This was bogus and incorrect. The diagnostic printf() I saw was generated for the
back channel, and that would have occurred after the socket was shut down.

>>
>> With r367492 you don't get the upcall with the same error state? Or you don't get an error on a write() call, when there should be one?
If Send-Q is 0 when the network is partitioned, after healing, the krpc sees no activity on
the socket (until it acquires/processes an RPC it will not do a sosend()).
Without the 6-minute timeout, the RST battle goes on "forever" (I've never actually
waited more than 30 minutes, which is close enough to "forever" for me).
--> With the 6-minute timeout, the "battle" stops after 6 minutes, when the timeout
    causes a soshutdown(..SHUT_WR) on the socket.
    (Since the soshutdown() patch is not yet in "main" (I got comments, but no "reviewed"
    on it), the 6-minute timer won't help if enabled in main. The soclose() won't happen
    for TCP connections with the back channel enabled, such as Linux 4.1/4.2 ones.)

If Send-Q is non-empty when the network is partitioned, the battle will not happen.

>
> My understanding is that he needs this error indication when calling shutdown().
There are several ways the krpc notices that a TCP connection is no longer functional:
- An error return like EPIPE from either sosend() or soreceive().
- A return of 0 from soreceive() with no data (normal EOF from other end).
- A 6-minute timeout on the server end, when no activity has occurred on the
  connection. This timer is currently disabled for NFSv4.1/4.2 mounts in "main",
  but I enabled it for this testing, to stop the "RST battle goes on forever"
  during testing. I am thinking of enabling it on "main", but this crude bandaid
  shouldn't be thought of as a "fix for the RST battle".

>>
>> From what you describe, this is on writes, isn't it? (I'm asking, as the original problem that was fixed with r367492 occurs in the read path (draining of the so_rcv buffer in the upcall right away, which subsequently influences the ACK sent by the stack).
>>
>> I only added the so_snd buffer after some discussion about whether the WAKESOR shouldn't have a symmetric equivalent on WAKESOW....
>>
>> Thus a partial backout (leaving the WAKESOR part inside, but reverting the WAKESOW part) would still fix my initial problem about erroneous DSACKs (which can also lead to extremely poor performance with Linux clients), but possibly address this issue...
>>
>> Can you perhaps take MAIN and apply https://reviews.freebsd.org/D29690 for the revert only on the so_snd upcall?
Since the krpc only uses receive upcalls, I don't see how reverting the send side would have
any effect?

> Since the release of 13.0 is almost done, can we try to fix the issue instead of reverting the commit?
I think it has already shipped broken.
I don't know if an errata is possible, or if it will be broken until 13.1.

--> I am much more concerned with the otis@ stuck client problem than this RST battle that only
    occurs after a network partitioning, especially if it is 13.0 specific.
    I did this testing to try to reproduce Jason's stuck client (with connection in CLOSE_WAIT)
    problem, which I failed to reproduce.

rick

Rs: agree, a good understanding of where the interaction btwn stack, socket and in-kernel tcp user breaks is needed;

>
> If this doesn't help, some major surgery will be necessary to prevent NFS sessions with SACK enabled from transmitting DSACKs...

My understanding is that the problem is related to getting a local error indication after
receiving a RST
segment too late or not at all.

Rs: but the move of the upcall should not materially change that; I don't have a pc here to see if any upcall actually happens on rst...

Best regards
Michael
>
>
>> I know from a printf that this happened, but whether it caused the RST battle to not happen, I don't know.
>>
>> I can put r367492 back in and do more testing if you'd like, but I think it probably needs to be reverted?
>
> Please, I don't quite understand why the exact timing of the upcall would be that critical here...
>
> A comparison of the soxxx calls and errors between the "good" and the "bad" would be perfect. I don't know if this is easy to do though, as these calls appear to be scattered all around the RPC / NFS source paths.
>
>> This does not explain the original hung Linux client problem, but does shed light on the RST war I could create by doing a network partitioning.
>>
>> rick
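The packet captures referred to throughout this thread (Youssef's 10-minute tcpdump suggestion, Rick's stuck.pcap and linuxtofreenfs.pcap) amount to a single command; a minimal sketch, where the interface name em0 and the client host are assumptions:

#!/bin/sh
# Sketch: capture 10 minutes of NFS traffic (port 2049) to/from the
# stuck client into a pcap file, with full-length packets (-s 0).
client=nfsv4-linux.home.rick
tcpdump -i em0 -s 0 -w /tmp/nfs-stuck.pcap host "$client" and port 2049 &
sleep 600
kill $!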
From owner-freebsd-net@freebsd.org Sat Apr 10 16:12:44 2021
Subject: Re: NFS Mount Hangs
From: tuexen@freebsd.org
To: Rick Macklem
Cc: "Scheffenegger, Richard", Youssef GHORBAL, "freebsd-net@freebsd.org"
Date: Sat, 10 Apr 2021 18:12:40 +0200
Message-Id: <3980F368-098D-4EE4-B213-4113C2CAFE7D@freebsd.org>

> On 10. Apr 2021, at 17:56, Rick Macklem wrote:
>
> Scheffenegger, Richard wrote:
>>> Rick wrote:
>>> Hi Rick,
>>>
>>>> Well, I have some good news and some bad news (the bad is mostly for Richard).
>>>>
>>>> The only message logged is:
>>>> tcpflags 0x4; tcp_do_segment: Timestamp missing, segment processed normally
>>>>
> Btw, I did get one additional message during further testing (with r367492 reverted):
>     tcpflags 0x4; syncache_chkrst: Our SYN|ACK was rejected, connection attempt aborted
>     by remote endpoint
>
> This only happened once in several test cycles.
That is OK.
>
>>>> But...the RST battle no longer occurs. Just one RST that works and then the SYN gets SYN,ACK'd by the FreeBSD end and off it goes...
>>>>
>>>> So, what is different?
>>>>
>>>> r367492 is reverted from the FreeBSD server.
>>>> I did the revert because I think it might be what the otis@ hang is being caused by. (In his case, the Recv-Q grows on the socket for the stuck Linux client, while others work.)
>>>>
>>>> Why does reverting fix this?
>>>> My only guess is that the krpc gets the upcall right away and sees an EPIPE when it does soreceive(), which results in soshutdown(SHUT_WR).
> This was bogus and incorrect. The diagnostic printf() I saw was generated for the
> back channel, and that would have occurred after the socket was shut down.
>
>>>
>>> With r367492 you don't get the upcall with the same error state? Or you don't get an error on a write() call, when there should be one?
> If Send-Q is 0 when the network is partitioned, after healing, the krpc sees no activity on
> the socket (until it acquires/processes an RPC it will not do a sosend()).
> Without the 6-minute timeout, the RST battle goes on "forever" (I've never actually
> waited more than 30 minutes, which is close enough to "forever" for me).
> --> With the 6-minute timeout, the "battle" stops after 6 minutes, when the timeout
>     causes a soshutdown(..SHUT_WR) on the socket.
>     (Since the soshutdown() patch is not yet in "main" (I got comments, but no "reviewed"
>     on it), the 6-minute timer won't help if enabled in main. The soclose() won't happen
>     for TCP connections with the back channel enabled, such as Linux 4.1/4.2 ones.)
I'm confused. So you are saying that if the Send-Q is empty when you partition
the network, and the peer starts to send SYNs after the healing, FreeBSD responds
with a challenge ACK which triggers the sending of a RST by Linux. This RST is
ignored multiple times.
Is that true? Even with my patch for the bug I introduced?
What version of the kernel are you using?

Best regards
Michael
>
> If Send-Q is non-empty when the network is partitioned, the battle will not happen.
>
>>
>> My understanding is that he needs this error indication when calling shutdown().
> There are several ways the krpc notices that a TCP connection is no longer functional:
> - An error return like EPIPE from either sosend() or soreceive().
> - A return of 0 from soreceive() with no data (normal EOF from other end).
> - A 6-minute timeout on the server end, when no activity has occurred on the
>   connection. This timer is currently disabled for NFSv4.1/4.2 mounts in "main",
>   but I enabled it for this testing, to stop the "RST battle goes on forever"
>   during testing.
>   I am thinking of enabling it on "main", but this crude bandaid
>   shouldn't be thought of as a "fix for the RST battle".
>
>>>
>>> From what you describe, this is on writes, isn't it? (I'm asking, as the original problem that was fixed with r367492 occurs in the read path (draining of the so_rcv buffer in the upcall right away, which subsequently influences the ACK sent by the stack).
>>>
>>> I only added the so_snd buffer after some discussion about whether the WAKESOR shouldn't have a symmetric equivalent on WAKESOW....
>>>
>>> Thus a partial backout (leaving the WAKESOR part inside, but reverting the WAKESOW part) would still fix my initial problem about erroneous DSACKs (which can also lead to extremely poor performance with Linux clients), but possibly address this issue...
>>>
>>> Can you perhaps take MAIN and apply https://reviews.freebsd.org/D29690 for the revert only on the so_snd upcall?
> Since the krpc only uses receive upcalls, I don't see how reverting the send side would have
> any effect?
>
>> Since the release of 13.0 is almost done, can we try to fix the issue instead of reverting the commit?
> I think it has already shipped broken.
> I don't know if an errata is possible, or if it will be broken until 13.1.
>
> --> I am much more concerned with the otis@ stuck client problem than this RST battle that only
>     occurs after a network partitioning, especially if it is 13.0 specific.
>     I did this testing to try to reproduce Jason's stuck client (with connection in CLOSE_WAIT)
>     problem, which I failed to reproduce.
>
> rick
>
> Rs: agree, a good understanding of where the interaction btwn stack, socket and in-kernel tcp user breaks is needed;
>
>>
>> If this doesn't help, some major surgery will be necessary to prevent NFS sessions with SACK enabled from transmitting DSACKs...
>
> My understanding is that the problem is related to getting a local error indication after
> receiving a RST segment too late or not at all.
>
> Rs: but the move of the upcall should not materially change that; I don't have a pc here to see if any upcall actually happens on rst...
>
> Best regards
> Michael
>>
>>
>>> I know from a printf that this happened, but whether it caused the RST battle to not happen, I don't know.
>>>
>>> I can put r367492 back in and do more testing if you'd like, but I think it probably needs to be reverted?
>>
>> Please, I don't quite understand why the exact timing of the upcall would be that critical here...
>>
>> A comparison of the soxxx calls and errors between the "good" and the "bad" would be perfect. I don't know if this is easy to do though, as these calls appear to be scattered all around the RPC / NFS source paths.
>>
>>> This does not explain the original hung Linux client problem, but does shed light on the RST war I could create by doing a network partitioning.
>>>
>>> rick
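The "tcpflags 0x4" diagnostics quoted in this exchange come from the net.inet.tcp.log_debug sysctl that Michael had Rick enable earlier in the thread; a minimal sketch of that debug cycle on the FreeBSD server, assuming kernel messages land in /var/log/messages (the default syslogd setup):

#!/bin/sh
# Sketch of the TCP debug-logging cycle used in this thread.
sysctl net.inet.tcp.log_debug=1
# ...reproduce the situation where RSTs are ignored, then look for the
# diagnostics quoted above:
grep tcp_do_segment /var/log/messages
grep syncache_chkrst /var/log/messages
# Turn the (noisy) debug logging back off afterwards.
sysctl net.inet.tcp.log_debug=0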
To: "tuexen@freebsd.org" , Rick Macklem CC: Youssef GHORBAL , "freebsd-net@freebsd.org" Subject: AW: NFS Mount Hangs Thread-Topic: NFS Mount Hangs Thread-Index: AQHXG1GB6agsoGWN0UqRoZFo/qoHTaqMDIkAgAL97ICACMXzgIAAsfOAgAfvbwCAAQ5PAIAAWDiAgAKBMZWAAD3WgIAAFNIAgAA/e4CAABvaAIAAEe2AgAEE0ACAAJCpAIAAgu0AgAXcwwCAAH0H4IAARSaAgAAmg3iAABY/gIAABIEAgAAhDYA= Date: Sat, 10 Apr 2021 18:15:02 +0000 Message-ID: References: <3750001D-3F1C-4D9A-A9D9-98BCA6CA65A4@tildenparkcapital.com> <33693DE3-7FF8-4FAB-9A75-75576B88A566@tildenparkcapital.com> <8E745920-1092-4312-B251-B49D11FE8028@pasteur.fr> <765CE1CD-6AAB-4BEF-97C6-C2A1F0FF4AC5@freebsd.org> <2B189169-C0C9-4DE6-A01A-BE916F10BABA@freebsd.org> <077ECE2B-A84C-440D-AAAB-00293C841F14@freebsd.org> <3980F368-098D-4EE4-B213-4113C2CAFE7D@freebsd.org> In-Reply-To: <3980F368-098D-4EE4-B213-4113C2CAFE7D@freebsd.org> Accept-Language: de-AT, en-US Content-Language: de-DE X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [217.70.211.16] x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: ea17fc55-e811-4bf7-6f18-08d8fc4c8f62 x-ms-traffictypediagnostic: SN4PR0601MB3728: x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:8882; x-ms-exchange-senderadcheck: 1 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: fOIM/5H7tSiT6W0eUg+ZEY8gH5I7+FVm/bFA1bchN4eFZMKfvJR+X1rgwLsHOYNy9DKKMY1WIN5VlHWlUigzYtsHLa93eHDhGZdws60OdCb9TnC6hC1uRxntdE/u/bDZcxxsRqKueh/14vFcmgzxNXvajRlzk9p1URpztnld/41KYaworNl6pAr+BbIE7hubLUMbzEjxD1+LAOzYOSCr0PJrr5tz0XUdUkyUrUai5CV6D3fAZBTWZf0BqSF0BA+eHrnNk2W5Bmgq6NQdE29L5obNl+ULUKcPOxT8KeLTTt3WHiUf61U8d92yP7Tkm/MvARBYXa7UUZlsHmlq2wMRx0yyvTqC1X5opxZ3Fj2517aXsKiTwevosWxws+Bq14Crr2CnRiEhtsmYnKcJ2PIvkvBX1W0B/Hfk/b19VbAUDkcFDn+oixVro5+T2fnL7GRK5FxzNwq4ohRVfTCVXaBgLnooRMai8rImaa+XWBYxcwSZpgPE92975OGo8e81mrie8B7JT80prJ9sLNUvDeH1+h5qi+LXL4jOPZuNWHQI6I6hPbpWzvMyvGMGkyN6s6VYmMgHAuT6NwTyKaGeBlY8xVPMjxxGtbyyGG535YXdXcfDnLFFJawikF5vrEQGmqyiFeQQUh5MFKGxZHMVYhLvYmp+GuKwwBSpGYccOfe/wjR2w8QDrtyq+dalyvVyF42C x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:SN4PR0601MB3728.namprd06.prod.outlook.com; PTR:; CAT:NONE; SFS:(4636009)(39860400002)(396003)(376002)(346002)(136003)(366004)(186003)(316002)(4326008)(8676002)(7696005)(26005)(54906003)(33656002)(8936002)(110136005)(7116003)(478600001)(71200400001)(66574015)(9686003)(86362001)(66556008)(966005)(76116006)(83380400001)(66946007)(64756008)(6506007)(2906002)(5660300002)(38100700002)(55016002)(66446008)(66476007)(52536014); DIR:OUT; SFP:1101; x-ms-exchange-antispam-messagedata: =?utf-8?B?M0pIUlRTYlRYZitkanNIYWFKWUdINlNVQkdRRHN5Y2VjL3NFaGwxNTNLT212?= =?utf-8?B?UWNwL2dJVitrZEtmS1FMRGNVU2Q2and6dDQzclhDdzEvSzBBbHVGRDZKWmFO?= =?utf-8?B?OEhpd3NycjBaN0U3ODNVc3p4dUQvOWRBZEpKaExzTFc3ODQ0RzFJMngxUk5o?= =?utf-8?B?NlBoak1BYnVjbnJDUlp1YkdFT2pMOUR5WUxaaEwxMW45TTVSNytTUy9mWUE5?= =?utf-8?B?OTUwL2RLYUQvQ2ZPMXdqa1BnV2lxRCs3NFcrM214cktJaUlMNUpvYThqOEkx?= =?utf-8?B?cGYvMEVVSzZVTDNzMGF0ZCt3T2VBZ3ZWYUNLcGVSUkg2NzdNTnRDSlhSZ0h5?= =?utf-8?B?ZUdPY2FqaDRiSUQwSENhbmdLQWJYU1hlM1YwZ2FZK3k5QkV6K0FxN2tOMklz?= =?utf-8?B?a0tjSG94UnBGRDFIclVwVVhvWFgvTWg4VEgwL2tVblQ4KzI0d3BYTFFJNnUr?= =?utf-8?B?R3hISUVyNHNpMGJ5ZUtuRzdkaDBCWlZZdlVuSVcwelM5eDhqWWVmalg5ZHBE?= =?utf-8?B?bTl1dm5IR3VpYkUrZVJNVlliQTRuelBkZGFKMlZweW9ZZEpwb1pTU2hjanFu?= =?utf-8?B?S0FZSDVzVkZJVXNVei8raHBiaTFwcHNkVWRmSEZUYnRPZzdON0dmYlhFbUw4?= =?utf-8?B?ZmVmRVAyUmhBMGdVTVExRit6TXRySDlFellhQlk2aVRtZk1ad2N6OHZaa2tt?= =?utf-8?B?WTk3S3pOVlVUNVBaeFJENkRhNThwcERNeFQ0Uk5YSS8wRCszbDgvVVFrOXpi?= 
=?utf-8?B?Z1k0V1J1RFBKMHI1WEl3VDJ1dVQwOThwOTJWb3ZPVzNpbTR2azREdU5hRUhn?= =?utf-8?B?eWVpTmE0WHI2MlRvK0ozbFRhNlZTcnExUjBTalp2T0tFNWtmUUdscEJVYzZE?= =?utf-8?B?djRZVEY3NkR0SDEzd09aZGtVRmpLNjJOK2s1ckdCcmxNMlRvdEtQUEhZSGVl?= =?utf-8?B?Z2pHRkJlaEszTlhVdWN6SnI5QVA5aHVuOG9QVEZoU3d5RVRXQ1IrbHE1TGFJ?= =?utf-8?B?eUFEMDBBaktCZmorQ3lwVWcyYnJLNlRmTkk1MzNWakJJbzdvN0ZCdkIwMkdM?= =?utf-8?B?NVlUd1U0c0d4b2VKNyttOVh3dUZ4ZHlKTWhJNHdlZ0FaUmJXU1BCS1FoTXF0?= =?utf-8?B?OHdUc2dOMTNtYVhqVWQrNFFJZDMrd3lJcDJPc1hFRE53cUpEN0g3SEtuYUpr?= =?utf-8?B?a2tjc05CNGR4N2g1SU9Nc1ZZTUgzbytFVVFhRlNKaVpvV2VWUXpGajRTNE1k?= =?utf-8?B?MTJCbnhENHpOUjhtM0tFV1EyeTdLSE1HTHZEWVI4OS8rYkFaT2p4YVBZQVNI?= =?utf-8?B?K3VIMXVZSTUvMlpYZ0FPVDNGUGtKRTdOOFJkNC9jM1RRMktYcElXNVVTaFV0?= =?utf-8?B?bVBXYkZLK0NPYWdLemJQazh1VXdVSXMraHdYQVIrZy93ZUhpdDQ5dXdHV2t6?= =?utf-8?B?NXRZL2VvUjFqSFMwR1JuUFRXZmFJcmNvVllHQ1VUMnFIN1prcHd1YkNUT3Ex?= =?utf-8?B?Q0pHa3RwTUZqUkZ4WDk0WFZnNjVmMmRVbVJpOTNtWVUvTzF2Z0NjYWtWU1VJ?= =?utf-8?B?czdVcjdEcjBsajBVeUZaS2JzTlhYRXJtR0o5TE9CMWVpR2k1UERrbUV5RklJ?= =?utf-8?B?d3g0eDVZRlpqL2VldmhTMWwxSVNOUFByYlJyVUY4V2F4VmpudlFJWFBMWHhW?= =?utf-8?B?M0xDYXhVUTg1dG95TngyMHFQcTYxZW5HalRtdXdCQSt2ajRZNnhhWlN5eDhs?= =?utf-8?Q?j/UOgFGo8Ol9MaecH8Cp8xqpFxxnk6hmUGvTNio?= x-ms-exchange-transport-forked: True Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 MIME-Version: 1.0 X-OriginatorOrg: netapp.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: SN4PR0601MB3728.namprd06.prod.outlook.com X-MS-Exchange-CrossTenant-Network-Message-Id: ea17fc55-e811-4bf7-6f18-08d8fc4c8f62 X-MS-Exchange-CrossTenant-originalarrivaltime: 10 Apr 2021 18:15:02.4672 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 4b0911a0-929b-4715-944b-c03745165b3a X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: IX6UzIOTtywhHZo++L9wtMOc1RdgoGpnWvZmC+nDZrGOGAL4IIJ+c+Jn2Y2+Id0effjbWbBLDF8/e8m6KnWo1Q== X-MS-Exchange-Transport-CrossTenantHeadersStamped: SN4PR0601MB3728 X-Rspamd-Queue-Id: 4FHjpL5GcHz3qZ4 X-Spamd-Bar: --- Authentication-Results: mx1.freebsd.org; dkim=pass header.d=netapp.com header.s=selector1 header.b=kM4s5SU8; arc=pass (microsoft.com:s=arcselector9901:i=1); dmarc=pass (policy=none) header.from=netapp.com; spf=pass (mx1.freebsd.org: domain of Richard.Scheffenegger@netapp.com designates 40.107.223.48 as permitted sender) smtp.mailfrom=Richard.Scheffenegger@netapp.com X-Spamd-Result: default: False [-3.90 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; HAS_XOIP(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:40.107.0.0/16]; RCVD_COUNT_THREE(0.00)[3]; DKIM_TRACE(0.00)[netapp.com:+]; MIME_BASE64_TEXT(0.10)[]; DMARC_POLICY_ALLOW(-0.50)[netapp.com,none]; NEURAL_HAM_SHORT(-1.00)[-1.000]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; RBL_DBL_DONT_QUERY_IPS(0.00)[40.107.223.48:from]; ARC_ALLOW(-1.00)[microsoft.com:s=arcselector9901:i=1]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:8075, ipnet:40.104.0.0/14, country:US]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[netapp.com:s=selector1]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; MIME_GOOD(-0.10)[text/plain]; SPAMHAUS_ZRD(0.00)[40.107.223.48:from:127.0.2.255]; DWL_DNSWL_LOW(-1.00)[netapp.com:dkim]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(1.00)[1.000]; RCVD_IN_DNSWL_NONE(0.00)[40.107.223.48:from]; RWL_MAILSPIKE_POSSIBLE(0.00)[40.107.223.48:from]; MAILMAN_DEST(0.00)[freebsd-net] X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Networking and TCP/IP with 

I went through all the instances, where there would be an immediate
soupcall triggered (before r367492).

If the problem is related to a race condition, where the socket is
unlocked before the upcall, I can change the patch in such a way, to
retain the lock on the socket all through TCP processing.

Both sorwakeups are with a locked socket (which is the critical part, I
understand), while for the write upcall there is one unlocked, and one
locked....


Richard Scheffenegger
Consulting Solution Architect
NAS & Networking

NetApp
+43 1 3676 811 3157 Direct Phone
+43 664 8866 1857 Mobile Phone
Richard.Scheffenegger@netapp.com

https://ts.la/richard49892
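
[To illustrate the locking convention Richard refers to, a hedged sketch
using the stock sys/socketvar.h wakeup macros; this is not the actual
tcp_input() code, and the deliver_* names are hypothetical.
sorwakeup_locked() must be entered with the receive sockbuf lock held and
drops it after running any SO_RCV upcall; plain sorwakeup() acquires and
releases the lock itself.]

#include <sys/param.h>
#include <sys/socket.h>
#include <sys/socketvar.h>

static void
deliver_locked(struct socket *so)
{
        SOCKBUF_LOCK(&so->so_rcv);
        /* ... append data, e.g. sbappendstream_locked(&so->so_rcv, m, 0); */
        sorwakeup_locked(so);   /* runs the SO_RCV upcall, then unlocks */
}

static void
deliver_unlocked(struct socket *so)
{
        sorwakeup(so);          /* takes and drops the sockbuf lock itself */
}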
-----Original Message-----
From: tuexen@freebsd.org <tuexen@freebsd.org>
Sent: Saturday, April 10, 2021 18:13
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: Scheffenegger, Richard <Richard.Scheffenegger@netapp.com>; Youssef GHORBAL <youssef.ghorbal@pasteur.fr>; freebsd-net@freebsd.org
Subject: Re: NFS Mount Hangs

NetApp Security WARNING: This is an external email. Do not click links or
open attachments unless you recognize the sender and know the content is
safe.


> On 10. Apr 2021, at 17:56, Rick Macklem <rmacklem@uoguelph.ca> wrote:
>
> Scheffenegger, Richard <Richard.Scheffenegger@netapp.com> wrote:
>>> Rick wrote:
>>> Hi Rick,
>>>
>>>> Well, I have some good news and some bad news (the bad is mostly for
>>>> Richard).
>>>>
>>>> The only message logged is:
>>>> tcpflags 0x4<RST>; tcp_do_segment: Timestamp missing, segment
>>>> processed normally
>>>>
> Btw, I did get one additional message during further testing (with
> r367492 reverted):
> tcpflags 0x4<RST>; syncache_chkrst: Our SYN|ACK was rejected, connection
> attempt aborted
>   by remote endpoint
>
> This only happened once in several test cycles.
That is OK.
>
>>>> But...the RST battle no longer occurs. Just one RST that works and
>>>> then the SYN gets SYN,ACK'd by the FreeBSD end and off it goes...
>>>>
>>>> So, what is different?
>>>>
>>>> r367492 is reverted from the FreeBSD server.
>>>> I did the revert because I think it might be what otis@ hang is being
>>>> caused by. (In his case, the Recv-Q grows on the socket for the stuck
>>>> Linux client, while others work.)
>>>>
>>>> Why does reverting fix this?
>>>> My only guess is that the krpc gets the upcall right away and sees a
>>>> EPIPE when it does soreceive()->results in soshutdown(SHUT_WR).
> This was bogus and incorrect. The diagnostic printf() I saw was
> generated for the back channel, and that would have occurred after the
> socket was shut down.
>
[stuff snipped]
I'm confused. So you are saying that if the Send-Q is empty when you
partition the network, and the peer starts to send SYNs after the healing,
FreeBSD responds with a challenge ACK which triggers the sending of a RST
by Linux. This RST is ignored multiple times.
Is that true? Even with my patch for the bug I introduced?
What version of the kernel are you using?

Best regards
Michael
[stuff snipped]
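
[Michael's question above turns on RFC 5961-style RST validation: an
in-window RST whose sequence number is not exactly rcv_nxt draws a
challenge ACK instead of tearing the connection down. A standalone model
of that decision, written here as an assumption about the behaviour being
discussed rather than as FreeBSD's tcp_input() code; all names are
illustrative, not kernel symbols.]

#include <stdint.h>

enum rst_action { RST_DROP, RST_CHALLENGE_ACK, RST_ACCEPT };

static enum rst_action
classify_rst(uint32_t seg_seq, uint32_t rcv_nxt, uint32_t rcv_wnd)
{
        /* Unsigned subtraction handles sequence-number wraparound. */
        if (seg_seq - rcv_nxt >= rcv_wnd)
                return (RST_DROP);         /* outside the window */
        if (seg_seq == rcv_nxt)
                return (RST_ACCEPT);       /* exact match: tear down */
        return (RST_CHALLENGE_ACK);        /* in-window: challenge it */
}

[If the peer's RSTs keep arriving in-window but not exactly at rcv_nxt,
each one draws another challenge ACK, which is the back-and-forth visible
in the capture Rick points at below.]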
From owner-freebsd-net@freebsd.org Sat Apr 10 21:59:55 2021
From: Rick Macklem <rmacklem@uoguelph.ca>
To: "tuexen@freebsd.org" <tuexen@freebsd.org>
CC: "Scheffenegger, Richard" <Richard.Scheffenegger@netapp.com>, Youssef GHORBAL <youssef.ghorbal@pasteur.fr>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Subject: Re: NFS Mount Hangs
Date: Sat, 10 Apr 2021 21:59:51 +0000

tuexen@freebsd.org wrote:
>Rick wrote:
[stuff snipped]
>>> With r367492 you don't get the upcall with the same error state? Or
>>> you don't get an error on a write() call, when there should be one?
> If Send-Q is 0 when the network is partitioned, after healing, the krpc
> sees no activity on the socket (until it acquires/processes an RPC it
> will not do a sosend()).
> Without the 6minute timeout, the RST battle goes on "forever" (I've
> never actually waited more than 30minutes, which is close enough to
> "forever" for me).
> --> With the 6minute timeout, the "battle" stops after 6minutes, when
>     the timeout causes a soshutdown(..SHUT_WR) on the socket.
>     (Since the soshutdown() patch is not yet in "main" (I got comments,
>     but no "reviewed" on it), the 6minute timer won't help if enabled in
>     main. The soclose() won't happen for TCP connections with the back
>     channel enabled, such as Linux 4.1/4.2 ones.)
>I'm confused. So you are saying that if the Send-Q is empty when you
>partition the network, and the peer starts to send SYNs after the
>healing, FreeBSD responds with a challenge ACK which triggers the sending
>of a RST by Linux. This RST is ignored multiple times.
>Is that true? Even with my patch for the bug I introduced?
Yes and yes.
Go take another look at linuxtofreenfs.pcap
("fetch https://people.freebsd.org/~rmacklem/linuxtofreenfs.pcap" if you
don't already have it.)
Look at packet #1949->2069. I use wireshark, but you'll have your
favourite. You'll see the "RST battle" that ends after 6minutes at
packet #2069. If there is no 6minute timeout enabled in the server side
krpc, then the battle just continues (I once let it run for about
30minutes before giving up). The 6minute timeout is not currently enabled
in main, etc.
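
[The 6minute band-aid described above could look roughly like the
following. This is a sketch only, with hypothetical example_* names; the
real patch lives in the server-side krpc and is not quoted here.]

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/socket.h>
#include <sys/socketvar.h>

#define EXAMPLE_IDLE_SECS       (6 * 60)

static void
example_check_idle(struct socket *so, time_t last_activity)
{
        if (time_uptime - last_activity >= EXAMPLE_IDLE_SECS) {
                /*
                 * Half-close the connection: nothing further will be
                 * transmitted, so no more challenge ACKs feed the battle.
                 */
                (void)soshutdown(so, SHUT_WR);
        }
}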
>What version of the kernel are you using?
"main" dated Dec. 23, 2020 + your bugfix + assorted NFS patches that are
not relevant + 2 small krpc related patches.
--> The two small krpc related patches enable the 6minute timeout and add
    a soshutdown(..SHUT_WR) call when the 6minute timeout is triggered.
    These have no effect until the 6minutes is up and, without them, the
    "RST battle" goes on forever.

Add to the above a revert of r367492 and the RST battle goes away and
things behave as expected. The recovery happens quickly after the network
is unpartitioned, with either 0 or 1 RSTs.

rick
ps: Once the irrelevant NFS patches make it into "main", I will upgrade
    to main bits du jour for testing.

Best regards
Michael
>
> If Send-Q is non-empty when the network is partitioned, the battle will
> not happen.
>
>>
>> My understanding is that he needs this error indication when calling
>> shutdown().
> There are several ways the krpc notices that a TCP connection is no
> longer functional.
> - An error return like EPIPE from either sosend() or soreceive().
> - A return of 0 from soreceive() with no data (normal EOF from other
>   end).
> - A 6minute timeout on the server end, when no activity has occurred on
>   the connection. This timer is currently disabled for NFSv4.1/4.2
>   mounts in "main", but I enabled it for this testing, to stop the "RST
>   battle goes on forever" during testing. I am thinking of enabling it
>   on "main", but this crude bandaid shouldn't be thought of as a "fix
>   for the RST battle".
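
[A rough sketch of how the three signals listed above surface at the
socket layer; this is simplified from, not copied out of, the krpc code
in sys/rpc/, and the example_ name is hypothetical.]

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/proc.h>
#include <sys/mbuf.h>
#include <sys/socket.h>
#include <sys/socketvar.h>
#include <sys/uio.h>

static bool
example_conn_dead(struct socket *so, time_t last_activity)
{
        struct uio uio;
        struct mbuf *m = NULL;
        int error, rcvflag = MSG_DONTWAIT;

        bzero(&uio, sizeof(uio));
        uio.uio_resid = 1000000000;     /* effectively unbounded read */
        uio.uio_td = curthread;
        error = soreceive(so, NULL, &uio, &m, NULL, &rcvflag);
        if (error != 0 && error != EWOULDBLOCK)
                return (true);          /* e.g. EPIPE after a RST */
        if (error == 0 && m == NULL)
                return (true);          /* EOF: peer closed cleanly */
        if (m != NULL)
                m_freem(m);             /* a real consumer parses this */
        /* The 6minute server-side idle timeout is the third signal. */
        return (time_uptime - last_activity >= 6 * 60);
}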
[stuff snipped]