From nobody Fri Jul 25 20:06:48 2025
Date: Fri, 25 Jul 2025 20:06:48 GMT
Message-Id: <202507252006.56PK6m1a094104@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org,
    dev-commits-src-main@FreeBSD.org
From: Gleb Smirnoff
Subject: git: f2c2ed7df313 - main - sendfile: don't hack sb_lowat for sockets that manage the watermark
List-Id: Commit messages for the main branch of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main
Sender: owner-dev-commits-src-main@FreeBSD.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: glebius
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: f2c2ed7df313f641451ca5a468f658fd350aae52
Auto-Submitted: auto-generated

The branch main has been updated by glebius:

URL: https://cgit.FreeBSD.org/src/commit/?id=f2c2ed7df313f641451ca5a468f658fd350aae52

commit f2c2ed7df313f641451ca5a468f658fd350aae52
Author:     Gleb Smirnoff
AuthorDate: 2025-07-25 20:05:56 +0000
Commit:     Gleb Smirnoff
CommitDate: 2025-07-25 20:05:56 +0000

    sendfile: don't hack sb_lowat for sockets that manage the watermark

    In sendfile(2) we carry an old hack (originating from d99b0dd2c5297) to
    help dumb benchmarks and applications achieve higher performance.  We
    would modify the low watermark on the socket send buffer to avoid the
    socket being reported as writable too early, which would result in lots
    of small writes.

    Skip that hack for applications that do setsockopt(SO_SNDLOWAT) or that
    register the socket in kevent(2) with the NOTE_LOWAT feature.  First, we
    don't want the hack to rewrite a watermark value explicitly specified by
    the user.  Second, in certain cases it can lead to real performance
    regressions: a kevent(2) with NOTE_LOWAT would report the socket as
    writable, but then sendfile(2) would write 0 bytes and return EAGAIN.

    The change also disables the hack for unix(4) sockets, leaving only TCP.

    Reviewed by: rrs
    Differential Revision: https://reviews.freebsd.org/D50581
---
 sys/kern/kern_sendfile.c | 11 +++++++----
 sys/kern/uipc_sockbuf.c  |  1 +
 sys/kern/uipc_socket.c   |  6 +++++-
 sys/netinet/tcp_usrreq.c |  2 +-
 sys/sys/sockbuf.h        |  2 +-
 5 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/sys/kern/kern_sendfile.c b/sys/kern/kern_sendfile.c
index 35b258e68701..8438298afc0e 100644
--- a/sys/kern/kern_sendfile.c
+++ b/sys/kern/kern_sendfile.c
@@ -698,10 +698,13 @@ sendfile_wait_generic(struct socket *so, off_t need, int *space)
 	 */
 	error = 0;
 	SOCK_SENDBUF_LOCK(so);
-	if (so->so_snd.sb_lowat < so->so_snd.sb_hiwat / 2)
-		so->so_snd.sb_lowat = so->so_snd.sb_hiwat / 2;
-	if (so->so_snd.sb_lowat < PAGE_SIZE && so->so_snd.sb_hiwat >= PAGE_SIZE)
-		so->so_snd.sb_lowat = PAGE_SIZE;
+	if (so->so_snd.sb_flags & SB_AUTOLOWAT) {
+		if (so->so_snd.sb_lowat < so->so_snd.sb_hiwat / 2)
+			so->so_snd.sb_lowat = so->so_snd.sb_hiwat / 2;
+		if (so->so_snd.sb_lowat < PAGE_SIZE &&
+		    so->so_snd.sb_hiwat >= PAGE_SIZE)
+			so->so_snd.sb_lowat = PAGE_SIZE;
+	}
 retry_space:
 	if (so->so_snd.sb_state & SBS_CANTSENDMORE) {
 		error = EPIPE;
diff --git a/sys/kern/uipc_sockbuf.c b/sys/kern/uipc_sockbuf.c
index ec00878cd9a5..ffaa9b800592 100644
--- a/sys/kern/uipc_sockbuf.c
+++ b/sys/kern/uipc_sockbuf.c
@@ -779,6 +779,7 @@ sbsetopt(struct socket *so, struct sockopt *sopt)
 		 * high-water.
 		 */
 		*lowat = (cc > *hiwat) ? *hiwat : cc;
+		*flags &= ~SB_AUTOLOWAT;
 		break;
 	}
 
diff --git a/sys/kern/uipc_socket.c b/sys/kern/uipc_socket.c
index 6c9eb7139cd1..4e8c179acee9 100644
--- a/sys/kern/uipc_socket.c
+++ b/sys/kern/uipc_socket.c
@@ -1211,7 +1211,8 @@ solisten_clone(struct socket *head)
 	so->so_rcv.sb_timeo = head->sol_sbrcv_timeo;
 	so->so_snd.sb_timeo = head->sol_sbsnd_timeo;
 	so->so_rcv.sb_flags = head->sol_sbrcv_flags & SB_AUTOSIZE;
-	so->so_snd.sb_flags = head->sol_sbsnd_flags & SB_AUTOSIZE;
+	so->so_snd.sb_flags = head->sol_sbsnd_flags &
+	    (SB_AUTOSIZE | SB_AUTOLOWAT);
 	if ((so->so_proto->pr_flags & PR_SOCKBUF) == 0) {
 		so->so_snd.sb_mtx = &so->so_snd_mtx;
 		so->so_rcv.sb_mtx = &so->so_rcv_mtx;
@@ -4514,6 +4515,9 @@ sokqfilter_generic(struct socket *so, struct knote *kn)
 		SOCK_BUF_LOCK(so, which);
 		knlist_add(knl, kn, 1);
 		sb->sb_flags |= SB_KNOTE;
+		if ((kn->kn_sfflags & NOTE_LOWAT) &&
+		    (sb->sb_flags & SB_AUTOLOWAT))
+			sb->sb_flags &= ~SB_AUTOLOWAT;
 		SOCK_BUF_UNLOCK(so, which);
 	}
 	SOCK_UNLOCK(so);
diff --git a/sys/netinet/tcp_usrreq.c b/sys/netinet/tcp_usrreq.c
index 70e4c04b79e5..98c934955121 100644
--- a/sys/netinet/tcp_usrreq.c
+++ b/sys/netinet/tcp_usrreq.c
@@ -164,7 +164,7 @@ tcp_usr_attach(struct socket *so, int proto, struct thread *td)
 		goto out;
 
 	so->so_rcv.sb_flags |= SB_AUTOSIZE;
-	so->so_snd.sb_flags |= SB_AUTOSIZE;
+	so->so_snd.sb_flags |= (SB_AUTOLOWAT | SB_AUTOSIZE);
 	error = in_pcballoc(so, &V_tcbinfo);
 	if (error)
 		goto out;
diff --git a/sys/sys/sockbuf.h b/sys/sys/sockbuf.h
index 7f6234ade6f4..2fed44bc9825 100644
--- a/sys/sys/sockbuf.h
+++ b/sys/sys/sockbuf.h
@@ -40,7 +40,7 @@
 #define	SB_SEL		0x08		/* someone is selecting */
 #define	SB_ASYNC	0x10		/* ASYNC I/O, need signals */
 #define	SB_UPCALL	0x20		/* someone wants an upcall */
-/* was	SB_NOINTR	0x40 */
+#define	SB_AUTOLOWAT	0x40		/* sendfile(2) may autotune sb_lowat */
 #define	SB_AIO		0x80		/* AIO operations queued */
 #define	SB_KNOTE	0x100		/* kernel note attached */
 #define	SB_NOCOALESCE	0x200		/* don't coalesce new data into existing mbufs */
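
The gating introduced by this change can be modeled outside the kernel. The sketch below is a hypothetical userland model in C, not kernel code: struct model_sockbuf, the model_* functions, and MODEL_PAGE_SIZE (assumed to be 4096 here) are names invented for illustration. Only the 0x40 flag value and the watermark arithmetic are taken from the diff, and locking is left out entirely.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the real definitions live in the kernel. */
#define MODEL_PAGE_SIZE		4096u	/* assumption: 4 KiB pages */
#define MODEL_SB_AUTOLOWAT	0x40u	/* bit value the diff gives SB_AUTOLOWAT */

/* Reduced model of a send buffer: just the fields the hack touches. */
struct model_sockbuf {
	unsigned int lowat;	/* low watermark */
	unsigned int hiwat;	/* high watermark */
	unsigned int flags;	/* carries MODEL_SB_AUTOLOWAT */
};

/*
 * Models the gated block in sendfile_wait_generic(): raise the low
 * watermark only while the buffer is still marked as autotunable.
 */
void
model_sendfile_tune(struct model_sockbuf *sb)
{
	if ((sb->flags & MODEL_SB_AUTOLOWAT) == 0)
		return;		/* the user manages the watermark */
	if (sb->lowat < sb->hiwat / 2)
		sb->lowat = sb->hiwat / 2;
	if (sb->lowat < MODEL_PAGE_SIZE && sb->hiwat >= MODEL_PAGE_SIZE)
		sb->lowat = MODEL_PAGE_SIZE;
}

/*
 * Models the low-watermark path of sbsetopt(): clamp the requested
 * value to the high watermark and switch autotuning off.
 */
void
model_set_lowat(struct model_sockbuf *sb, unsigned int cc)
{
	sb->lowat = (cc > sb->hiwat) ? sb->hiwat : cc;
	sb->flags &= ~MODEL_SB_AUTOLOWAT;
}

/*
 * Models the sokqfilter_generic() addition: registering a knote that
 * asks for NOTE_LOWAT also switches autotuning off.
 */
void
model_kevent_register(struct model_sockbuf *sb, bool note_lowat)
{
	if (note_lowat)
		sb->flags &= ~MODEL_SB_AUTOLOWAT;
}
```

In this model, a buffer with hiwat 32768 and lowat 2048 has its lowat raised to 16384 by model_sendfile_tune() while the flag is set. After model_set_lowat(&sb, 1024) or model_kevent_register(&sb, true), the same call leaves lowat untouched, which is the behavior the commit message describes for applications that manage the watermark themselves.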