Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 6 Oct 2022 19:46:56 -0700
From:      Gleb Smirnoff <glebius@freebsd.org>
To:        src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org
Subject:   Re: git: 0d7445193abc - main - tcp: remove tcptw, the compressed timewait state structure
Message-ID:  <Yz%2BTIHtUuH5clfO8@FreeBSD.org>
In-Reply-To: <202210070235.2972Zc5S090599@gitrepo.freebsd.org>
References:  <202210070235.2972Zc5S090599@gitrepo.freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
  Hi

On Fri, Oct 07, 2022 at 02:35:38AM +0000, Gleb Smirnoff wrote:
T>     The memory savings the tcptw brought back in 2003 (see 340c35de6a2) no
T>     longer justify the complexity required to maintain it.  For longer
T>     explanation please check out the email [1].
T>     
T>     Surpisingly through almost 20 years the TCP stack functionality of
T>     handling the TIME_WAIT state with a normal tcpcb did not bitrot.  The
T>     existing tcp_input() properly handles a tcpcb in TCPS_TIME_WAIT state,
T>     which is confirmed by the packetdrill tcp-testsuite [2].
T>     
T>     This change just removes tcptw and leaves INP_TIMEWAIT.  The flag will
T>     be removed in a separate commit.  This makes it easier to review and
T>     possibly debug the changes.
T>     
T>     [1] https://lists.freebsd.org/archives/freebsd-net/2022-January/001206.html
T>     [2] https://github.com/freebsd-net/tcp-testsuite
T>     
T>     Differential revision:  https://reviews.freebsd.org/D36398

The memory savings cut here are more than just sizeof(struct tcpcb) -
sizeof(struct tcptw).  We also keep the socket around for duration of TIME_WAIT.
And the stuff that may hang off the socket, e.g. kTLS context.

I prepared a changeset that would free the socket when transitioning to TIME_WAIT:

https://reviews.freebsd.org/D36887

However, the patch adds back some complexity that we just removed.

It also introduces a problem with two recent features of pluggable TCP stacks
and massive setsockopt: we got sysctl_setsockopt() that would run setsockopt on
a pcb matching certain criteria. One of most common applications of this tool
is to switch from one TCP stack to the other. To execute sosetopt() the
sysctl_setsockopt() needs socket. With the compressed timewait structure, we
didn't have it. Now we don't, too. However, the compressed timewaits were
short-circuited to the default TCP stack and didn't depend on the pluggable
ones. With a regular tcpcb we still reference the stack. That makes it impossible
to switch off ALL existing connections from some stack with sysctl_setsockopt().

After todays discussion with mtuexen@, rscheff@ and peterlei@ we decided not to
rush with checking in D36887 and have more discussion whether the memory savings
worth the comlexity or not. But keep the revision around.

-- 
Gleb Smirnoff



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?Yz%2BTIHtUuH5clfO8>