From nobody Mon Mar 18 19:13:11 2024
X-Original-To: net@mlmmj.nyi.freebsd.org
mhgvshhmthhprghuthhhphgvrhhsohhnrghlihhthidqudeffeehledvvdduiedqvdelhe dtgedukeegqdhgrghllhgrthhinheppehfrhgvvggsshgurdhorhhgsehfrghsthhmrghi lhdrtghomh X-ME-Proxy: Feedback-ID: i41414658:Fastmail Received: by mailuser.nyi.internal (Postfix, from userid 501) id 52F5C36400BD; Mon, 18 Mar 2024 15:13:31 -0400 (EDT) X-Mailer: MessagingEngine.com Webmail Interface User-Agent: Cyrus-JMAP/3.11.0-alpha0-300-gdee1775a43-fm-20240315.001-gdee1775a List-Id: Networking and TCP/IP with FreeBSD List-Archive: https://lists.freebsd.org/archives/freebsd-net List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-net@freebsd.org MIME-Version: 1.0 Message-Id: <8031cd99-ded8-4b06-93b3-11cc729a8b2c@app.fastmail.com> In-Reply-To: References: <4FF534F6-B35D-4596-8D1E-226AD1347AC8@freebsd.org> <6e795e9c-8de4-4e02-9a96-8fabfaa4e66f@app.fastmail.com> <6047C8EF-B1B0-4286-93FA-AA38F8A18656@karels.net> Date: Mon, 18 Mar 2024 15:13:11 -0400 From: "Drew Gallatin" To: "Konstantin Belousov" , "Mike Karels" Cc: tuexen , "Nuno Teixeira" , garyj@gmx.de, current@freebsd.org, net@freebsd.org, "Randall Stewart" Subject: Re: Request for Testing: TCP RACK Content-Type: multipart/alternative; boundary=01b96c257b37417295d61c17eb06343b --01b96c257b37417295d61c17eb06343b Content-Type: text/plain;charset=utf-8 Content-Transfer-Encoding: quoted-printable I got the idea from https://people.mpi-sws.org/~druschel/publications/so= ft-timers-tocs.pdf The gist is that the TCP pacing stuff needs to run f= requently, and rather than run it out of a clock interrupt, its more eff= icient to run it out of a system call context at just the point where we= return to userspace and the cache is trashed anyway. The current impl= ementation is fine for our workload, but probably not idea for a generic= system. Especially one where something is banging on system calls. 

ASTs could be the right tool for this, but I'm super unfamiliar with
them, and I can't find any docs on them.

Would ast_register(0, ASTR_UNCOND, 0, func) be roughly equivalent to
what's happening here?

Drew

On Mon, Mar 18, 2024, at 2:33 PM, Konstantin Belousov wrote:
> On Mon, Mar 18, 2024 at 07:26:10AM -0500, Mike Karels wrote:
> > On 18 Mar 2024, at 7:04, tuexen@freebsd.org wrote:
> >
> > >> On 18. Mar 2024, at 12:42, Nuno Teixeira <eduardo@freebsd.org> wrote:
> > >>
> > >> Hello all!
> > >>
> > >> It works just fine!
> > >> System performance is OK.
> > >> Using patch on main-n268841-b0aaf8beb126(-dirty).
> > >>
> > >> ---
> > >> net.inet.tcp.functions_available:
> > >> Stack                           D Alias                            PCB count
> > >> freebsd                           freebsd                          0
> > >> rack                            * rack                             38
> > >> ---
> > >>
> > >> It would be so nice that we can have a sysctl tunable for this patch
> > >> so we could do more tests without recompiling kernel.
> > > Thanks for testing!
> > >
> > > @gallatin: can you come up with a patch that is acceptable for Netflix
> > > and allows to mitigate the performance regression.
> >
> > Ideally, tcphpts could enable this automatically when it starts to be
> > used (enough?), but a sysctl could select auto/on/off.
> There is already a well-known mechanism to request execution of the
> specific function on return to userspace, namely AST.  The difference
> with the current hack is that the execution is requested for one callback
> in the context of the specific thread.
>
> Still, it might be worth a try to use it; what is the reason to hit a thread
> that does not do networking, with TCP processing?
>
> >
> > Mike
> >
> > > Best regards
> > > Michael
> > >>
> > >> Thanks all!
> > >> Really happy here :)
> > >>
> > >> Cheers,
> > >>
> > >> Nuno Teixeira <eduardo@freebsd.org> wrote (Sunday, 17/03/2024 at 20:26):
> > >>>
> > >>> Hello,
> > >>>
> > >>>> I don't have the full context, but it seems like the complaint is a performance regression in bonnie++ and perhaps other things when tcp_hpts is loaded, even when it is not used. Is that correct?
> > >>>>
> > >>>> If so, I suspect it's because we drive the tcp_hpts_softclock() routine from userret(), in order to avoid tons of timer interrupts and context switches. To test this theory, you could apply a patch like:
> > >>>
> > >>> It's affecting overall system performance, bonnie was just a way to
> > >>> get some numbers to compare.
> > >>>
> > >>> Tomorrow I will test patch.
> > >>>
> > >>> Thanks!
> > >>>
> > >>> --
> > >>> Nuno Teixeira
> > >>> FreeBSD Committer (ports)
> > >>
> > >>
> > >>
> > >> --
> > >> Nuno Teixeira
> > >> FreeBSD Committer (ports)
> >
>

--01b96c257b37417295d61c17eb06343b
Content-Type: text/html;charset=utf-8
Content-Transfer-Encoding: quoted-printable
--01b96c257b37417295d61c17eb06343b--