Date:      Sat, 01 Dec 2012 09:28:05 +0100
From:      Andre Oppermann <andre@freebsd.org>
To:        Keith Arner <vornum@gmail.com>
Cc:        freebsd-net@freebsd.org
Subject:   Re: Problems with ephemeral port selection
Message-ID:  <50B9BF95.2040103@freebsd.org>
In-Reply-To: <CAEo_tUH9LPzPFP-O=317rYEQ3nT66b4biQshV_8=L8hReO_BLg@mail.gmail.com>
References:  <CAEo_tUH9LPzPFP-O=317rYEQ3nT66b4biQshV_8=L8hReO_BLg@mail.gmail.com>

On 30.11.2012 15:09, Keith Arner wrote:
> I've noticed some issues with ephemeral port number selection from
> tcp_connect(), which limit the number of concurrent, outgoing connections
> that can be established (connect(), rather than accept()).  Sifting through
> the source code, I believe the issues stem from two problems in the
> tcp_connect() code path.  Specifically:
>
>   1) The wrong function gets called to determine if a given ephemeral
>      port number is currently usable.
>   2) The ephemeral port number gets selected without considering the
>      foreign addr/port.
>
> Curiously, the effect of #1 mostly cancels the effect of #2, such that
> the common calling convention gives you a correct result so long as you
> only have a small number of outgoing connections.  However, once you get to
> a large number of outgoing connections, things start to break down.  (I'll
> define large and small later.)
>
> As a side note, I have been working with FreeBSD 7.2.  The implementations
> of several of the relevant functions have been refactored somewhere between
> 7.2-RELEASE and 9-STABLE, but the core problems in the logic seem to be
> the same between versions.
>
> For problem #1, the code path that selects the ephemeral port number is:
>   tcp_connect() ->
>     in_pcbbind() ->
>       in_pcbbind_setup() ->
>         in_pcb_lport() [not in FreeBSD 7.2] ->
>           in_pcblookup_local()
>
> There is a loop in in_pcb_lport() [or directly in in_pcbbind_setup() in
> earlier releases] that considers candidate ephemeral port numbers and
> calls in_pcblookup_local() to determine if a given candidate is suitable.
> The default behaviour (if the caller has not set either SO_REUSEADDR or
> SO_REUSEPORT) is to pick a local port number that is not in use by
> *any* local TCP socket.
>
> So long as the number of concurrent, outgoing connections is less than the
> range configured by `sysctl net.inet.ip.portrange.*`, selecting a totally
> unique ephemeral port number works OK.  However, you cannot exceed that
> limit, even if each outgoing connection has a unique faddr/fport.  This
> does not limit the number of connections that can be accept()'ed, only the
> number of connections that can be connect()'ed.
>
> In this particular path, I think the code should call in_pcblookup_hash(),
> rather than in_pcblookup_local().  The criteria in in_pcblookup_hash() only
> match if the full 5-tuple matches, rather than just the local port number.
> The complication, of course, comes from the fact that in_pcbbind() is
> called both for an explicit bind() and for the implicit bind that happens
> on connect().  The matching criteria in in_pcblookup_local() make sense for
> the former but not quite for the latter.
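
To make the difference between the two matching criteria concrete, here is
a minimal sketch as a toy Python model.  It is not the kernel code: Conn,
port_taken() and tuple_taken() are hypothetical names, and the real
in_pcblookup_local() also handles wildcard and local-address matching,
which is omitted here.  TCP is assumed as the fifth element of the tuple.

```python
from typing import Iterable, NamedTuple


class Conn(NamedTuple):
    """One TCP connection, identified by its addresses and ports."""
    laddr: str
    lport: int
    faddr: str
    fport: int


def port_taken(conns: Iterable[Conn], lport: int) -> bool:
    """Port-only check (roughly the in_pcblookup_local() behaviour
    described above): any socket on lport blocks the candidate."""
    return any(c.lport == lport for c in conns)


def tuple_taken(conns: Iterable[Conn], cand: Conn) -> bool:
    """Full-tuple check (roughly the in_pcblookup_hash() behaviour):
    only an identical laddr/lport/faddr/fport blocks the candidate."""
    return any(c == cand for c in conns)


existing = [Conn("10.0.0.1", 50000, "192.0.2.1", 80)]
# Same local address and port, but a different foreign address.
candidate = Conn("10.0.0.1", 50000, "192.0.2.2", 80)

print(port_taken(existing, candidate.lport))  # True: port-only check rejects
print(tuple_taken(existing, candidate))       # False: full tuple is still free
```
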
>
> I mentioned that the above is the default behaviour you get when you don't
> specify SO_REUSEADDR or SO_REUSEPORT.  Setting SO_REUSEADDR
> before calling connect() has some surprising consequences (surprising in the
> sense that I don't believe SO_REUSEADDR is supposed to have any effect
> on connect()).  In this case, when in_pcblookup_local() is called, wild_okay
> is set to false.  This changes the matching criteria to (in effect) allow
> tcp_connect() to use the full 5-tuple space.  However, this brings us to the
> second problem.
>
> Problem #2 is that the ephemeral port number is chosen before the
> fport/faddr gets set on the pcb; that is, tcp_connect() calls in_pcbbind() to
> select the ephemeral port number, *then* calls in_pcbconnect_setup() to
> populate the fport/faddr.  With SO_REUSEADDR, in_pcbbind() can select
> an in-use local port.  If the local port is used by a socket with a different
> laddr/fport/faddr, all is good.  However, if the local port selection
> results in a full conflict, it will get rejected by the call to
> in_pcblookup_hash() inside
> in_pcbconnect_setup().  This happens *after* the loop inside
> in_pcbbind(), so the call to tcp_connect() fails with EADDRINUSE.  Thus,
> with SO_REUSEADDR, connect() can fail with EADDRINUSE long before
> the ephemeral port space has been exhausted.  The application could re-try
> the call to connect() and likely succeed, as a new local port would be
> selected.
>
> Overall, this behaviour hinders the ability to open a large number of
> outbound connections:
>   * If you don't specify SO_REUSEADDR, you have a fairly limited maximum
>     number of outbound connections.
>   * If you do specify SO_REUSEADDR, you are able to open a much larger
>     number of outbound connections, but must retry on EADDRINUSE.
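
The retry-on-EADDRINUSE workaround from the second bullet could look
roughly like the sketch below.  connect_with_retry() and its parameters are
hypothetical names, not an existing API.  The sketch assumes it is
acceptable to open a fresh socket for every attempt, since the state of a
socket after a failed connect() is not something to rely on portably.

```python
import errno
import socket


def connect_with_retry(addr, attempts=5, timeout=5.0):
    """Connect to addr with SO_REUSEADDR set, retrying on EADDRINUSE.

    Returns a connected socket.  Any error other than EADDRINUSE is
    re-raised immediately; EADDRINUSE is re-raised once all attempts
    have been used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_err = None
    for _ in range(attempts):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Set before connect() so the implicit bind sees the option.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.settimeout(timeout)
        try:
            s.connect(addr)
            return s
        except OSError as e:
            s.close()
            if e.errno != errno.EADDRINUSE:
                raise
            last_err = e  # a new local port is picked on the next attempt
    raise last_err
```

A caller would use it as `sock = connect_with_retry(("192.0.2.1", 80))`,
where the address is only a placeholder.
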
>
> I believe that the logic under tcp_connect() should be modified to:
>
>   - behave uniformly whether or not SO_REUSEADDR has been set
>   - allow outgoing connection requests to re-use a local port number, so
>     long as the remaining elements of the tuple (laddr, fport, faddr) are
>     unique
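
The behaviour proposed above can be sketched as a toy Python model in
which the local port is chosen with the foreign address and port already
known.  This is an illustration under stated assumptions, not a patch:
choose_lport() is a hypothetical name, the port range is deliberately tiny,
and candidates are scanned in order for determinism, which is not
necessarily how the kernel walks the ephemeral range.

```python
from typing import Optional, Set, Tuple

# (laddr, lport, faddr, fport); the protocol (TCP) is implied.
FourTuple = Tuple[str, int, str, int]


def choose_lport(conns: Set[FourTuple], laddr: str, faddr: str,
                 fport: int, first: int, last: int) -> Optional[int]:
    """Return the first local port in [first, last] whose full tuple is
    unused, or None if every port is taken for this destination."""
    for lport in range(first, last + 1):
        if (laddr, lport, faddr, fport) not in conns:
            return lport
    return None


conns: Set[FourTuple] = set()
# Two ephemeral ports, two destinations: four connections fit.
for faddr in ("192.0.2.1", "192.0.2.2"):
    for _ in range(2):
        lport = choose_lport(conns, "10.0.0.1", faddr, 80, 50000, 50001)
        conns.add(("10.0.0.1", lport, faddr, 80))

print(len(conns))  # 4
# A third connection to the same destination has no free tuple left.
print(choose_lport(conns, "10.0.0.1", "192.0.2.1", 80, 50000, 50001))  # None
```

With a port-only check, the same two-port range would allow only two
connections in total, regardless of destination.
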

Keith,

this is an excellent analysis.  Could you please file it as a problem
report too and post the PR-number here so we can better track it?
Thank you.

-- 
Andre



