List-Id: Networking and TCP/IP with FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-net
Message-Id: <2fac0ac3-ba3a-4bca-b0d4-fafb0c5b75fd@app.fastmail.com>
References: <2c31ac44-b34b-469c-a6de-fdd927ec2f9e@freebsd.org>
Date: Fri, 02 Feb 2024 21:19:41 -0500
From: "Drew Gallatin"
To: "Rick Macklem"
Cc: "Richard Scheffenegger", "freebsd-net@FreeBSD.org", "FreeBSD Transport", rmacklem@freebsd.org, kp@freebsd.org
Subject: Re: Increasing TCP TSO size support

On Fri, Feb 2, 2024, at 9:05 PM, Rick Macklem wrote:
> > But the page size is only 4K on most platforms. So while an M_EXTPGS
> > mbuf can hold 5 pages (..from memory, too lazy to do the math right
> > now) and reduces socket buffer mbuf chain lengths by a factor of 10
> > or so (2k vs 20k per mbuf), the S/G list that a NIC will need to
> > consume would likely decrease only by a factor of 2. And even then
> > only if the busdma code to map mbufs for DMA is not coalescing
> > adjacent mbufs. I know busdma does some coalescing, but I can't
> > recall if it coalesces physically adjacent mbufs.
>
> I'm guessing the factor of 2 comes from the fact that each page is a
> contiguous segment?

Actually, no, I'm being dumb. I was thinking that pages would be split up, but that's wrong. Without M_EXTPGS, each mbuf generated by sendfile (or nfs) would be an M_EXT with a wrapper around a single 4K page. So the scatter/gather list would be exactly the same.

The win would be if the pages themselves were contiguous (which they often are), and if the bus_dma mbuf mapping code coalesced those segments, and if the device could handle DMA across a 4K boundary. That's what would get you shorter s/g lists.

I think tcp_m_copy() can handle this now, as if_hw_tsomaxsegsize is set by the driver to express the longest contiguous segment it can handle.

BTW, I really hate the mixing of bus dma restrictions with the hw_tsomax stuff. It always makes my head explode.

Drew

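To make the three conditions for a shorter s/g list concrete, here is a minimal userland sketch in C. It is not the busdma or tcp_output code: count_sg_segments() and PAGE_SIZE_4K are hypothetical names made up for illustration, and the maxsegsize argument only stands in for a driver limit such as if_hw_tsomaxsegsize. It counts how many segments a run of 4K pages needs when adjacent pages are merged only if they are physically contiguous and the merged segment stays within the limit.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_4K 4096u	/* assumed page size, as in the thread */

/*
 * Hypothetical helper, not a FreeBSD API.
 *
 * paddr[] holds the physical address of each 4K page in send order.
 * A page is folded into the current segment only when it starts exactly
 * where the previous page ended and the grown segment still fits in
 * maxsegsize; otherwise it opens a new segment.
 */
static size_t
count_sg_segments(const uint64_t *paddr, size_t npages, uint32_t maxsegsize)
{
	size_t nsegs = 0;
	uint64_t seg_end = 0;	/* physical address just past current segment */
	uint32_t seg_len = 0;	/* length of current segment in bytes */

	for (size_t i = 0; i < npages; i++) {
		if (nsegs > 0 && paddr[i] == seg_end &&
		    seg_len + PAGE_SIZE_4K <= maxsegsize) {
			/* Contiguous and within the limit: extend. */
			seg_len += PAGE_SIZE_4K;
		} else {
			/* Gap, or the device limit was reached: new segment. */
			nsegs++;
			seg_len = PAGE_SIZE_4K;
		}
		seg_end = paddr[i] + PAGE_SIZE_4K;
	}
	return (nsegs);
}
```

For five pages at 0x10000, 0x11000, 0x12000, 0x40000 and 0x41000, a 4K limit (a device that cannot DMA across a page boundary) gives 5 segments, a 64K limit gives 2, and an 8K limit gives 3. With or without M_EXTPGS the page list is the same, so only contiguity plus the device limit shortens the list.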