From: "Drew Gallatin" <gallatin@freebsd.org>
To: "Konstantin Belousov"
Cc: "Mike Karels", tuexen, "Nuno Teixeira", garyj@gmx.de, current@freebsd.org, net@freebsd.org, "Randall Stewart"
Date: Mon, 18 Mar 2024 15:42:42 -0400
Subject: Re: Request for Testing: TCP RACK
Message-Id: <38c54399-6c96-44d8-a3a2-3cc1bfbe50c2@app.fastmail.com>
References: <6e795e9c-8de4-4e02-9a96-8fabfaa4e66f@app.fastmail.com> <6047C8EF-B1B0-4286-93FA-AA38F8A18656@karels.net> <8031cd99-ded8-4b06-93b3-11cc729a8b2c@app.fastmail.com>
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

No.  The goal is to run on every return to userspace for every thread.

Drew

On Mon, Mar 18, 2024, at 3:41 PM, Konstantin Belousov wrote:
> On Mon, Mar 18, 2024 at 03:13:11PM -0400, Drew Gallatin wrote:
> > I got the idea from
> > https://people.mpi-sws.org/~druschel/publications/soft-timers-tocs.pdf
> > The gist is that the TCP pacing stuff needs to run frequently, and
> > rather than run it out of a clock interrupt, it's more efficient to run
> > it out of a system call context at just the point where we return to
> > userspace and the cache is trashed anyway. The current implementation
> > is fine for our workload, but probably not ideal for a generic system.
> > Especially one where something is banging on system calls.
> >
> > ASTs could be the right tool for this, but I'm super unfamiliar with
> > them, and I can't find any docs on them.
> >
> > Would ast_register(0, ASTR_UNCOND, 0, func) be roughly equivalent to
> > what's happening here?
> This call would need some AST number added, and then it registers the
> ast to run on next return to userspace, for the current thread.
>
> Is it enough?
> >
> > Drew
>
> >
> > On Mon, Mar 18, 2024, at 2:33 PM, Konstantin Belousov wrote:
> > > On Mon, Mar 18, 2024 at 07:26:10AM -0500, Mike Karels wrote:
> > > > On 18 Mar 2024, at 7:04, tuexen@freebsd.org wrote:
> > > >
> > > > >> On 18. Mar 2024, at 12:42, Nuno Teixeira <eduardo@freebsd.org> wrote:
> > > > >>
> > > > >> Hello all!
> > > > >>
> > > > >> It works just fine!
> > > > >> System performance is OK.
> > > > >> Using patch on main-n268841-b0aaf8beb126(-dirty).
> > > > >>
> > > > >> ---
> > > > >> net.inet.tcp.functions_available:
> > > > >> Stack                           D Alias                            PCB count
> > > > >> freebsd                           freebsd                          0
> > > > >> rack                            * rack                             38
> > > > >> ---
> > > > >>
> > > > >> It would be so nice that we can have a sysctl tunable for this patch
> > > > >> so we could do more tests without recompiling kernel.
> > > > > Thanks for testing!
> > > > >
> > > > > @gallatin: can you come up with a patch that is acceptable for Netflix
> > > > > and allows to mitigate the performance regression.
> > > >
> > > > Ideally, tcphpts could enable this automatically when it starts to be
> > > > used (enough?), but a sysctl could select auto/on/off.
> > > There is already a well-known mechanism to request execution of the
> > > specific function on return to userspace, namely AST.  The difference
> > > with the current hack is that the execution is requested for one callback
> > > in the context of the specific thread.
> > >
> > > Still, it might be worth a try to use it; what is the reason to hit a thread
> > > that does not do networking, with TCP processing?
> > >
> > > >
> > > > Mike
> > > >
> > > > > Best regards
> > > > > Michael
> > > > >>
> > > > >> Thanks all!
> > > > >> Really happy here :)
> > > > >>
> > > > >> Cheers,
> > > > >>
> > > > >> Nuno Teixeira <eduardo@freebsd.org> wrote (Sunday, 17/03/2024 at 20:26):
> > > > >>>
> > > > >>> Hello,
> > > > >>>
> > > > >>>> I don't have the full context, but it seems like the complaint is a performance regression in bonnie++ and perhaps other things when tcp_hpts is loaded, even when it is not used.  Is that correct?
> > > > >>>>
> > > > >>>> If so, I suspect it's because we drive the tcp_hpts_softclock() routine from userret(), in order to avoid tons of timer interrupts and context switches.  To test this theory, you could apply a patch like:
> > > > >>>
> > > > >>> It's affecting overall system performance, bonnie was just a way to
> > > > >>> get some numbers to compare.
> > > > >>>
> > > > >>> Tomorrow I will test the patch.
> > > > >>>
> > > > >>> Thanks!
> > > > >>>
> > > > >>> --
> > > > >>> Nuno Teixeira
> > > > >>> FreeBSD Committer (ports)
> > > > >>
> > > > >>
> > > > >>
> > > > >> --
> > > > >> Nuno Teixeira
> > > > >> FreeBSD Committer (ports)
> > > >
> > >
>
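
A minimal user-space sketch of the soft-timers idea referenced in this thread: instead of taking a clock interrupt for each deadline, due callbacks are run opportunistically at "trigger points" such as the return to userspace. All names below (struct soft_timer_table, soft_timer_init(), soft_timer_arm(), soft_timer_trigger(), count_fired()) are hypothetical and invented for illustration. This is not the tcp_hpts code, not the FreeBSD AST interface, and not the patch under test; it only models the scheduling pattern under those assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SOFT_TIMERS 8	/* hypothetical fixed capacity for the sketch */

typedef void (*soft_timer_fn)(void *arg);

/* One pending one-shot callback. */
struct soft_timer {
	uint64_t deadline;	/* tick at or after which fn may run */
	soft_timer_fn fn;
	void *arg;
	int armed;		/* nonzero while the slot is in use */
};

struct soft_timer_table {
	struct soft_timer slots[MAX_SOFT_TIMERS];
};

/* Mark every slot as free. */
void
soft_timer_init(struct soft_timer_table *t)
{
	for (size_t i = 0; i < MAX_SOFT_TIMERS; i++)
		t->slots[i].armed = 0;
}

/* Arm a one-shot timer. Returns 0 on success, -1 if the table is full. */
int
soft_timer_arm(struct soft_timer_table *t, uint64_t deadline,
    soft_timer_fn fn, void *arg)
{
	assert(fn != NULL);
	for (size_t i = 0; i < MAX_SOFT_TIMERS; i++) {
		struct soft_timer *st = &t->slots[i];
		if (!st->armed) {
			st->deadline = deadline;
			st->fn = fn;
			st->arg = arg;
			st->armed = 1;
			return (0);
		}
	}
	return (-1);
}

/*
 * Trigger point: in this model it stands in for "about to return to
 * userspace". Runs every armed callback whose deadline has passed and
 * returns how many ran. No interrupt is involved; work happens only
 * when some caller reaches a trigger point.
 */
int
soft_timer_trigger(struct soft_timer_table *t, uint64_t now)
{
	int ran = 0;

	for (size_t i = 0; i < MAX_SOFT_TIMERS; i++) {
		struct soft_timer *st = &t->slots[i];
		if (st->armed && st->deadline <= now) {
			st->armed = 0;	/* one-shot: disarm before calling */
			st->fn(st->arg);
			ran++;
		}
	}
	return (ran);
}

/* Example callback: increments the int counter passed as arg. */
void
count_fired(void *arg)
{
	(*(int *)arg)++;
}
```

In this model the table scan is paid at every trigger point, whoever the caller is. That mirrors the question raised above about threads that do no networking: a per-thread request (as with an AST) would run the callback only for the thread that asked for it, while the "every return to userspace for every thread" goal corresponds to calling soft_timer_trigger() unconditionally.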