From nobody Sun Feb 23 16:20:46 2025
Date: Sun, 23 Feb 2025 16:20:46 GMT
Message-Id: <202502231620.51NGKkl6079968@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Mark Johnston
Subject: git: 8b3d2c19d369 - main - inpcb: Fix reuseport lbgroup array resizing
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
Sender: owner-dev-commits-src-all@FreeBSD.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: markj
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 8b3d2c19d3691f29d4e86c73bc11491ae3fbfaec
Auto-Submitted: auto-generated

The branch main has been updated by markj:

URL: https://cgit.FreeBSD.org/src/commit/?id=8b3d2c19d3691f29d4e86c73bc11491ae3fbfaec

commit 8b3d2c19d3691f29d4e86c73bc11491ae3fbfaec
Author:     Mark Johnston
AuthorDate: 2025-02-23 16:20:12 +0000
Commit:     Mark Johnston
CommitDate: 2025-02-23 16:20:12 +0000

    inpcb: Fix reuseport lbgroup array resizing

    in_pcblisten() moves an inpcb from the per-group list into the array, at
    which point it becomes visible to inpcb lookups in the datapath.  It
    assumes that there is space in the array for this, but that's not
    guaranteed, since in_pcbinslbgrouphash() doesn't reserve space in the
    array if the inpcb isn't associated with a listening socket.

    We could resize the array in in_pcblisten(), but that would introduce a
    failure case where there currently is none.  Instead, keep track of the
    number of pending inpcbs as well, and modify in_pcbinslbgrouphash() to
    reserve space for each pending (i.e., not-yet-listening) inpcb.

    Add a regression test.

    Reviewed by:    glebius
    Reported by:    netchild
    Fixes:          7cbb6b6e28db ("inpcb: Close some SO_REUSEPORT_LB races, part 2")
    Differential Revision:  https://reviews.freebsd.org/D49100
---
 sys/netinet/in_pcb.c                     |  7 ++++-
 sys/netinet/in_pcb_var.h                 |  1 +
 tests/sys/netinet/so_reuseport_lb_test.c | 46 ++++++++++++++++++++++++++++++++
 3 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/sys/netinet/in_pcb.c b/sys/netinet/in_pcb.c
index 9d174dce9024..1d9cc1866e15 100644
--- a/sys/netinet/in_pcb.c
+++ b/sys/netinet/in_pcb.c
@@ -339,6 +339,7 @@ in_pcblbgroup_insert(struct inpcblbgroup *grp, struct inpcb *inp)
 		 * lookups until listen() has been called.
 		 */
 		LIST_INSERT_HEAD(&grp->il_pending, inp, inp_lbgroup_list);
+		grp->il_pendcnt++;
 	} else {
 		grp->il_inp[grp->il_inpcnt] = inp;
 
@@ -375,6 +376,8 @@ in_pcblbgroup_resize(struct inpcblbgrouphead *hdr,
 	CK_LIST_INSERT_HEAD(hdr, grp, il_list);
 	LIST_SWAP(&old_grp->il_pending, &grp->il_pending, inpcb,
 	    inp_lbgroup_list);
+	grp->il_pendcnt = old_grp->il_pendcnt;
+	old_grp->il_pendcnt = 0;
 	in_pcblbgroup_free(old_grp);
 	return (grp);
 }
@@ -435,7 +438,7 @@ in_pcbinslbgrouphash(struct inpcb *inp, uint8_t numa_domain)
 			return (ENOBUFS);
 		in_pcblbgroup_insert(grp, inp);
 		CK_LIST_INSERT_HEAD(hdr, grp, il_list);
-	} else if (grp->il_inpcnt == grp->il_inpsiz) {
+	} else if (grp->il_inpcnt + grp->il_pendcnt == grp->il_inpsiz) {
 		if (grp->il_inpsiz >= INPCBLBGROUP_SIZMAX) {
 			if (ratecheck(&lastprint, &interval))
 				printf("lb group port %d, limit reached\n",
@@ -499,6 +502,7 @@ in_pcbremlbgrouphash(struct inpcb *inp)
 		LIST_FOREACH(inp1, &grp->il_pending, inp_lbgroup_list) {
 			if (inp == inp1) {
 				LIST_REMOVE(inp, inp_lbgroup_list);
+				grp->il_pendcnt--;
 				inp->inp_flags &= ~INP_INLBGROUP;
 				return;
 			}
@@ -1503,6 +1507,7 @@ in_pcblisten(struct inpcb *inp)
 		INP_HASH_WLOCK(pcbinfo);
 		grp = in_pcblbgroup_find(inp);
 		LIST_REMOVE(inp, inp_lbgroup_list);
+		grp->il_pendcnt--;
 		in_pcblbgroup_insert(grp, inp);
 		INP_HASH_WUNLOCK(pcbinfo);
 	}
diff --git a/sys/netinet/in_pcb_var.h b/sys/netinet/in_pcb_var.h
index e2b0ca386e7f..32fdbced175c 100644
--- a/sys/netinet/in_pcb_var.h
+++ b/sys/netinet/in_pcb_var.h
@@ -82,6 +82,7 @@ struct inpcblbgroup {
 #define	il6_laddr	il_dependladdr.id6_addr
 	uint32_t	il_inpsiz; /* max count in il_inp[] (h) */
 	uint32_t	il_inpcnt; /* cur count in il_inp[] (h) */
+	uint32_t	il_pendcnt; /* cur count in il_pending (h) */
 	struct inpcb	*il_inp[];			/* (h) */
 };
 
diff --git a/tests/sys/netinet/so_reuseport_lb_test.c b/tests/sys/netinet/so_reuseport_lb_test.c
index 09d8e0ce8f83..aaadaead5e23 100644
--- a/tests/sys/netinet/so_reuseport_lb_test.c
+++ b/tests/sys/netinet/so_reuseport_lb_test.c
@@ -433,6 +433,51 @@ ATF_TC_BODY(double_listen_ipv6, tc)
 	ATF_REQUIRE_MSG(error == 0, "close() failed: %s", strerror(errno));
 }
 
+/*
+ * Try binding many sockets to the same lbgroup without calling listen(2) on
+ * them.
+ */
+ATF_TC_WITHOUT_HEAD(bind_without_listen);
+ATF_TC_BODY(bind_without_listen, tc)
+{
+	const int nsockets = 100;
+	struct sockaddr_in sin;
+	socklen_t socklen;
+	int error, s, s2[nsockets];
+
+	s = lb_listen_socket(PF_INET, 0);
+
+	memset(&sin, 0, sizeof(sin));
+	sin.sin_len = sizeof(sin);
+	sin.sin_family = AF_INET;
+	sin.sin_port = htons(0);
+	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
+	error = bind(s, (struct sockaddr *)&sin, sizeof(sin));
+	ATF_REQUIRE_MSG(error == 0, "bind() failed: %s", strerror(errno));
+
+	socklen = sizeof(sin);
+	error = getsockname(s, (struct sockaddr *)&sin, &socklen);
+	ATF_REQUIRE_MSG(error == 0, "getsockname() failed: %s",
+	    strerror(errno));
+
+	for (int i = 0; i < nsockets; i++) {
+		s2[i] = lb_listen_socket(PF_INET, 0);
+		error = bind(s2[i], (struct sockaddr *)&sin, sizeof(sin));
+		ATF_REQUIRE_MSG(error == 0, "bind() failed: %s", strerror(errno));
+	}
+	for (int i = 0; i < nsockets; i++) {
+		error = listen(s2[i], 1);
+		ATF_REQUIRE_MSG(error == 0, "listen() failed: %s", strerror(errno));
+	}
+	for (int i = 0; i < nsockets; i++) {
+		error = close(s2[i]);
+		ATF_REQUIRE_MSG(error == 0, "close() failed: %s", strerror(errno));
+	}
+
+	error = close(s);
+	ATF_REQUIRE_MSG(error == 0, "close() failed: %s", strerror(errno));
+}
+
 ATF_TP_ADD_TCS(tp)
 {
 	ATF_TP_ADD_TC(tp, basic_ipv4);
@@ -440,6 +485,7 @@ ATF_TP_ADD_TCS(tp)
 	ATF_TP_ADD_TC(tp, concurrent_add);
 	ATF_TP_ADD_TC(tp, double_listen_ipv4);
 	ATF_TP_ADD_TC(tp, double_listen_ipv6);
+	ATF_TP_ADD_TC(tp, bind_without_listen);
 
 	return (atf_no_error());
 }
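
The reservation scheme described in the commit message can be illustrated
with a small userspace model.  This is only a sketch under stated
assumptions, not the kernel code: struct lbgroup and the lbgroup_*()
functions are hypothetical names invented for illustration, integer ids
stand in for inpcb pointers, and doubling the array on growth is an
assumption of the sketch.  It shows why counting pending entries in the
capacity check lets the listen step run without any allocation, and
therefore without a failure path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace model of struct inpcblbgroup; not kernel code. */
struct lbgroup {
	uint32_t size;		/* capacity of slots[], like il_inpsiz */
	uint32_t count;		/* listening entries, like il_inpcnt */
	uint32_t pendcnt;	/* bound, not yet listening, like il_pendcnt */
	int slots[];		/* stand-in for il_inp[] */
};

static struct lbgroup *
lbgroup_alloc(uint32_t size)
{
	struct lbgroup *grp;

	grp = calloc(1, sizeof(*grp) + size * sizeof(grp->slots[0]));
	if (grp != NULL)
		grp->size = size;
	return (grp);
}

/* Grow the array, carrying both counters over to the new group. */
static struct lbgroup *
lbgroup_resize(struct lbgroup *old, uint32_t size)
{
	struct lbgroup *grp;

	grp = lbgroup_alloc(size);
	if (grp == NULL)
		return (NULL);
	for (uint32_t i = 0; i < old->count; i++)
		grp->slots[i] = old->slots[i];
	grp->count = old->count;
	grp->pendcnt = old->pendcnt;
	free(old);
	return (grp);
}

/*
 * bind step: reserve a slot for a pending entry.  This is the only step
 * that may fail, because it is the only one that may allocate.
 */
static bool
lbgroup_bind(struct lbgroup **grpp)
{
	struct lbgroup *grp = *grpp;

	/* Pending entries count against capacity, as in the fixed check. */
	if (grp->count + grp->pendcnt == grp->size) {
		grp = lbgroup_resize(grp, grp->size * 2);
		if (grp == NULL)
			return (false);
		*grpp = grp;
	}
	grp->pendcnt++;
	return (true);
}

/* listen step: move a pending entry into the slot reserved at bind time. */
static void
lbgroup_listen(struct lbgroup *grp, int id)
{
	assert(grp->pendcnt > 0);
	grp->pendcnt--;
	assert(grp->count < grp->size);	/* holds because bind reserved space */
	grp->slots[grp->count++] = id;
}
```

In this model, calling lbgroup_bind() several times on a small group and
then lbgroup_listen() once per bound entry never trips the capacity
assertion.  If the check in lbgroup_bind() compared only count against
size, as the pre-fix code did, the same sequence would write past the end
of slots[].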