Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 15 Oct 2012 22:11:41 -0400
From:      George Neville-Neil <gnn@neville-neil.com>
To:        John Baldwin <jhb@freebsd.org>
Cc:        freebsd-net@freebsd.org, net@freebsd.org
Subject:   Re: Dropping TCP options from retransmitted SYNs considered harmful
Message-ID:  <70397C6E-202A-4FAE-AF53-6A5A1D89FAAC@neville-neil.com>
In-Reply-To: <201210150908.36498.jhb@freebsd.org>
References:  <201210121213.11152.jhb@freebsd.org> <201210150908.36498.jhb@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help

On Oct 15, 2012, at 09:08 , John Baldwin <jhb@freebsd.org> wrote:

> On Friday, October 12, 2012 12:13:11 pm John Baldwin wrote:
>> Back in 2001 FreeBSD added a hack to strip TCP options from =
retransmitted SYNs=20
>> starting with the 3rd SYN in this block in tcp_timer.c:
>>=20
>> 	/*
>> 	 * Disable rfc1323 if we haven't got any response to
>> 	 * our third SYN to work-around some broken terminal servers
>> 	 * (most of which have hopefully been retired) that have bad VJ
>> 	 * header compression code which trashes TCP segments containing
>> 	 * unknown-to-them TCP options.
>> 	 */
>> 	if ((tp->t_state =3D=3D TCPS_SYN_SENT) && (tp->t_rxtshift =3D=3D =
3))
>> 		tp->t_flags &=3D ~(TF_REQ_SCALE|TF_REQ_TSTMP);
>>=20
>> There is even a PR for the original bug report: kern/1689
>>=20
>> However, there is an unintended consequence of this change that can =
be=20
>> disastrous.  Specifically, suppose you have a FreeBSD client =
connecting to a=20
>> server, and that the SYNs are arriving at the server successfully, =
but the=20
>> first few return SYN/ACKs are dropped.  Eventually a SYN/ACK makes it =
through=20
>> and the connection is established.
>>=20
>> The server (based on the first SYN it saw) believes it has negotiated =
window=20
>> scaling with the client.  The client, however, has broken what it =
promised in=20
>> that first SYN and believes it is not using any window scaling at =
all.  This=20
>> causes two forms of breakage:
>>=20
>> 1) When the server advertises a scaled window (e.g. '8' for a 64k =
window
>>    scaled at 13), the client thinks it is an unscaled window ('8') =
and
>>    sends data to the server very slowly.
>>=20
>> 2) When the client advertises an unscaled window (e.g. '65535' for a =
64k
>>    window), the server thinks it has a huge window (65535 << 13 =3D=3D =
511MB)
>>    to send into.
>>=20
>> I'm not sure that 2) is a problem per se, but I have definitely seen =
instances=20
>> of 1) (and examined the 'struct tcpcb' in kgdb on both the server and =
client=20
>> end of the connections to verify they disagreed on the scaling).
>>=20
>> The original motivation of this change is to work around broken =
terminal=20
>> servers that were old when this change was added in 2001.  Over 10 =
years later=20
>> I think we should at least have an option to turn this work-around =
off, and=20
>> possibly disable it by default.
>>=20
>> Thoughts?
>=20
> How about this:
>=20
> Index: tcp_timer.c
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> --- tcp_timer.c	(revision 241579)
> +++ tcp_timer.c	(working copy)
> @@ -118,6 +118,11 @@ SYSCTL_INT(_net_inet_tcp, OID_AUTO, keepcnt, =
CTLFL
> 	/* max idle probes */
> int	tcp_maxpersistidle;
>=20
> +static int	tcp_rexmit_drop_options =3D 0;
> +SYSCTL_INT(_net_inet_tcp, OID_AUTO, rexmit_drop_options, CTLFLAG_RW,
> +    &tcp_rexmit_drop_options, 0,
> +    "Drop TCP options from 3rd and later retransmitted SYN");
> +
> static int	per_cpu_timers =3D 0;
> SYSCTL_INT(_net_inet_tcp, OID_AUTO, per_cpu_timers, CTLFLAG_RW,
>     &per_cpu_timers , 0, "run tcp timers on all cpus");
> @@ -578,7 +583,8 @@ tcp_timer_rexmt(void * xtp)
> 	 * header compression code which trashes TCP segments containing
> 	 * unknown-to-them TCP options.
> 	 */
> -	if ((tp->t_state =3D=3D TCPS_SYN_SENT) && (tp->t_rxtshift =3D=3D =
3))
> +	if (tcp_rexmit_drop_options && (tp->t_state =3D=3D =
TCPS_SYN_SENT) &&
> +	    (tp->t_rxtshift =3D=3D 3))
> 		tp->t_flags &=3D ~(TF_REQ_SCALE|TF_REQ_TSTMP);
> 	/*
> 	 * If we backed off this far, our srtt estimate is probably =
bogus.
>=20
> Any other suggestions on the sysctl name?

The name's fine.  Commit that sucker and turn it off.

Best,
George




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?70397C6E-202A-4FAE-AF53-6A5A1D89FAAC>