From nobody Mon Dec  4 18:59:31 2023
Date: Mon, 4 Dec 2023 18:59:31 GMT
Message-Id: <202312041859.3B4IxVcB043749@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org,
        dev-commits-src-main@FreeBSD.org
From: Gleb Smirnoff <glebius@FreeBSD.org>
Subject: git: e3cbc572f154 - main - kern/subr_trap.c: repair the
  HPTS performance hack in userret()
List-Id: Commit messages for the main branch of the src repository <dev-commits-src-main.freebsd.org>
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main
List-Help: <mailto:dev-commits-src-main+help@freebsd.org>
List-Post: <mailto:dev-commits-src-main@freebsd.org>
List-Subscribe: <mailto:dev-commits-src-main+subscribe@freebsd.org>
List-Unsubscribe: <mailto:dev-commits-src-main+unsubscribe@freebsd.org>
Sender: owner-dev-commits-src-main@freebsd.org
X-BeenThere: dev-commits-src-main@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: glebius
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: e3cbc572f1541fdc18be9971d23e210d5018e662
Auto-Submitted: auto-generated

The branch main has been updated by glebius:

URL: https://cgit.FreeBSD.org/src/commit/?id=e3cbc572f1541fdc18be9971d23e210d5018e662

commit e3cbc572f1541fdc18be9971d23e210d5018e662
Author:     Gleb Smirnoff <glebius@FreeBSD.org>
AuthorDate: 2023-12-04 18:19:46 +0000
Commit:     Gleb Smirnoff <glebius@FreeBSD.org>
CommitDate: 2023-12-04 18:19:46 +0000

    kern/subr_trap.c: repair the HPTS performance hack in userret()
    
    The hack wasn't functional: subr_trap.c doesn't include opt_inet.h, so
    TCPHPTS was never defined there and the call was compiled out.  Put a
    better comment provided by gallatin@ in place of the old one.  The idea
    is to use userret() as a cheap place to call a soft clock.  This
    approach saves CPU on busy machines and power on idle machines.  The
    alternative would be to schedule callouts constantly.  Running with
    neither callouts nor the soft clock ruins HPTS precision.
    
    Reviewed by:            tuexen, rrs
    Differential Revision:  https://reviews.freebsd.org/D42860
---
 sys/kern/subr_trap.c   | 20 ++++++++++++--------
 sys/netinet/tcp_hpts.h |  1 -
 sys/netinet/tcp_lro.c  |  4 +---
 sys/sys/systm.h        |  6 ++++++
 4 files changed, 19 insertions(+), 12 deletions(-)
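
The soft clock idea from the commit message can be sketched in user space.
This is a simplified simulation, not kernel code: struct soft_timer,
soft_clock_poll() and simulate() are hypothetical names, and a "trigger
point" stands in for a frequently visited place such as userret().  Each
poll costs one comparison when nothing is due, and timer precision depends
on how often trigger points are reached.

```c
#include <stdint.h>

/* All names here are hypothetical; this is not FreeBSD kernel code. */
struct soft_timer {
	uint64_t deadline;	/* tick at which work is next due */
	uint64_t interval;	/* rearm interval, in ticks */
	unsigned fired;		/* number of times the work ran */
	uint64_t max_late;	/* worst observed lateness, in ticks */
};

/* Cheap check meant to be called from a frequently visited trigger point. */
static void
soft_clock_poll(struct soft_timer *t, uint64_t now)
{
	uint64_t late;

	if (now < t->deadline)
		return;		/* nothing due: one comparison and out */
	late = now - t->deadline;
	if (late > t->max_late)
		t->max_late = late;
	t->fired++;
	t->deadline = now + t->interval;
}

/*
 * Simulate `ticks` ticks with a trigger point every `trigger_every`
 * ticks (0 means no trigger points at all).
 */
struct soft_timer
simulate(uint64_t ticks, uint64_t trigger_every, uint64_t interval)
{
	struct soft_timer t = { .deadline = interval, .interval = interval };
	uint64_t now;

	for (now = 1; now <= ticks; now++) {
		if (trigger_every != 0 && now % trigger_every == 0)
			soft_clock_poll(&t, now);
	}
	return (t);
}
```

In this model, simulate(1000, 1, 10) runs the work on every deadline with no
lateness, simulate(1000, 7, 10) runs it late by up to 4 ticks, and
simulate(1000, 0, 10) never runs it, which is the case where a fallback
such as a callout would be needed.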

diff --git a/sys/kern/subr_trap.c b/sys/kern/subr_trap.c
index 8720d9f71c1c..e9a16cd0b36e 100644
--- a/sys/kern/subr_trap.c
+++ b/sys/kern/subr_trap.c
@@ -74,6 +74,8 @@
 #include <sys/epoch.h>
 #endif
 
+void	(*tcp_hpts_softclock)(void);
+
 /*
  * Define the code needed before returning to user mode, for trap and
  * syscall.
@@ -125,16 +127,18 @@ userret(struct thread *td, struct trapframe *frame)
 	if (PMC_THREAD_HAS_SAMPLES(td))
 		PMC_CALL_HOOK(td, PMC_FN_THR_USERRET, NULL);
 #endif
-#ifdef TCPHPTS
 	/*
-	 * @gallatin is adament that this needs to go here, I
-	 * am not so sure. Running hpts is a lot like
-	 * a lro_flush() that happens while a user process
-	 * is running. But he may know best so I will go
-	 * with his view of accounting. :-)
+	 * Calling tcp_hpts_softclock() here allows us to avoid frequent,
+	 * expensive callouts that trash the cache and lead to a much higher
+	 * number of interrupts and context switches.  Testing on busy web
+	 * servers at Netflix has shown that this improves CPU use by 7% over
+	 * relying only on callouts to drive HPTS, and also results in idle
+	 * power savings on mostly idle servers.
+	 * This was inspired by the paper "Soft Timers: Efficient Microsecond
+	 * Software Timer Support for Network Processing"
+	 * by Mohit Aron and Peter Druschel.
 	 */
-	tcp_run_hpts();
-#endif
+	tcp_hpts_softclock();
 	/*
 	 * Let the scheduler adjust our priority etc.
 	 */
diff --git a/sys/netinet/tcp_hpts.h b/sys/netinet/tcp_hpts.h
index 8ca21daf60de..7eb1b2e08cb4 100644
--- a/sys/netinet/tcp_hpts.h
+++ b/sys/netinet/tcp_hpts.h
@@ -152,7 +152,6 @@ void __tcp_set_hpts(struct tcpcb *tp, int32_t line);
 
 void tcp_set_inp_to_drop(struct inpcb *inp, uint16_t reason);
 
-extern void (*tcp_hpts_softclock)(void);
 void tcp_lro_hpts_init(void);
 
 extern int32_t tcp_min_hptsi_time;
diff --git a/sys/netinet/tcp_lro.c b/sys/netinet/tcp_lro.c
index 255e543ae21d..921d28f82517 100644
--- a/sys/netinet/tcp_lro.c
+++ b/sys/netinet/tcp_lro.c
@@ -89,7 +89,6 @@ SYSCTL_NODE(_net_inet_tcp, OID_AUTO, lro,  CTLFLAG_RW | CTLFLAG_MPSAFE, 0,
 
 long tcplro_stacks_wanting_mbufq;
 int	(*tcp_lro_flush_tcphpts)(struct lro_ctrl *lc, struct lro_entry *le);
-void	(*tcp_hpts_softclock)(void);
 
 counter_u64_t tcp_inp_lro_direct_queue;
 counter_u64_t tcp_inp_lro_wokeup_queue;
@@ -1262,8 +1261,7 @@ tcp_lro_flush_all(struct lro_ctrl *lc)
 done:
 	/* flush active streams */
 	tcp_lro_rx_done(lc);
-	if (tcp_hpts_softclock != NULL)
-		tcp_hpts_softclock();
+	tcp_hpts_softclock();
 	lc->lro_mbuf_count = 0;
 }
 
diff --git a/sys/sys/systm.h b/sys/sys/systm.h
index 2532bc3d9926..06d40481375f 100644
--- a/sys/sys/systm.h
+++ b/sys/sys/systm.h
@@ -378,6 +378,12 @@ void	cpu_et_frequency(struct eventtimer *et, uint64_t newfreq);
 extern int	cpu_disable_c2_sleep;
 extern int	cpu_disable_c3_sleep;
 
+extern void	(*tcp_hpts_softclock)(void);
+#define	tcp_hpts_softclock()	do {					\
+		if (tcp_hpts_softclock != NULL)				\
+			tcp_hpts_softclock();				\
+} while (0)
+
 char	*kern_getenv(const char *name);
 void	freeenv(char *env);
 int	getenv_int(const char *name, int *data);
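
The systm.h hunk gives a function pointer and a function-like macro the
same name, so every call site gets the NULL check for free.  The standalone
sketch below shows why that works in C: the macro is only expanded when the
name is followed by '(', and a macro name is not expanded again inside its
own replacement.  demo_calls, demo_softclock() and demo_run() are
hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hook pointer; NULL until some module installs a handler. */
void	(*tcp_hpts_softclock)(void);

/*
 * Same shape as the macro in the systm.h hunk.  The inner
 * tcp_hpts_softclock() is not expanded again, because a macro name is
 * never re-expanded inside its own replacement.
 */
#define	tcp_hpts_softclock()	do {					\
		if (tcp_hpts_softclock != NULL)				\
			tcp_hpts_softclock();				\
} while (0)

static int demo_calls;		/* hypothetical counter, not in the commit */

/* Hypothetical handler standing in for the real HPTS soft clock. */
static void
demo_softclock(void)
{
	demo_calls++;
}

/* Returns how many times the handler ran across three call sites. */
int
demo_run(void)
{
	demo_calls = 0;
	tcp_hpts_softclock = NULL;	/* no '(' follows: plain pointer */
	tcp_hpts_softclock();		/* hook unset: expands to a no-op */
	assert(demo_calls == 0);

	tcp_hpts_softclock = demo_softclock;
	tcp_hpts_softclock();		/* hook set: handler runs */
	tcp_hpts_softclock();
	return (demo_calls);
}
```

Calling demo_run() returns 2: the first call site is skipped because the
hook is still NULL, and the next two reach the handler.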