Date: Mon, 15 Oct 2012 22:11:41 -0400 From: George Neville-Neil <gnn@neville-neil.com> To: John Baldwin <jhb@freebsd.org> Cc: freebsd-net@freebsd.org, net@freebsd.org Subject: Re: Dropping TCP options from retransmitted SYNs considered harmful Message-ID: <70397C6E-202A-4FAE-AF53-6A5A1D89FAAC@neville-neil.com> In-Reply-To: <201210150908.36498.jhb@freebsd.org> References: <201210121213.11152.jhb@freebsd.org> <201210150908.36498.jhb@freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Oct 15, 2012, at 09:08 , John Baldwin <jhb@freebsd.org> wrote: > On Friday, October 12, 2012 12:13:11 pm John Baldwin wrote: >> Back in 2001 FreeBSD added a hack to strip TCP options from = retransmitted SYNs=20 >> starting with the 3rd SYN in this block in tcp_timer.c: >>=20 >> /* >> * Disable rfc1323 if we haven't got any response to >> * our third SYN to work-around some broken terminal servers >> * (most of which have hopefully been retired) that have bad VJ >> * header compression code which trashes TCP segments containing >> * unknown-to-them TCP options. >> */ >> if ((tp->t_state =3D=3D TCPS_SYN_SENT) && (tp->t_rxtshift =3D=3D = 3)) >> tp->t_flags &=3D ~(TF_REQ_SCALE|TF_REQ_TSTMP); >>=20 >> There is even a PR for the original bug report: kern/1689 >>=20 >> However, there is an unintended consequence of this change that can = be=20 >> disastrous. Specifically, suppose you have a FreeBSD client = connecting to a=20 >> server, and that the SYNs are arriving at the server successfully, = but the=20 >> first few return SYN/ACKs are dropped. Eventually a SYN/ACK makes it = through=20 >> and the connection is established. >>=20 >> The server (based on the first SYN it saw) believes it has negotiated = window=20 >> scaling with the client. The client, however, has broken what it = promised in=20 >> that first SYN and believes it is not using any window scaling at = all. This=20 >> causes two forms of breakage: >>=20 >> 1) When the server advertises a scaled window (e.g. '8' for a 64k = window >> scaled at 13), the client thinks it is an unscaled window ('8') = and >> sends data to the server very slowly. >>=20 >> 2) When the client advertises an unscaled window (e.g. '65535' for a = 64k >> window), the server thinks it has a huge window (65535 << 13 =3D=3D = 511MB) >> to send into. >>=20 >> I'm not sure that 2) is a problem per se, but I have definitely seen = instances=20 >> of 1) (and examined the 'struct tcpcb' in kgdb on both the server and = client=20 >> end of the connections to verify they disagreed on the scaling). >>=20 >> The original motivation of this change is to work around broken = terminal=20 >> servers that were old when this change was added in 2001. Over 10 = years later=20 >> I think we should at least have an option to turn this work-around = off, and=20 >> possibly disable it by default. >>=20 >> Thoughts? >=20 > How about this: >=20 > Index: tcp_timer.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- tcp_timer.c (revision 241579) > +++ tcp_timer.c (working copy) > @@ -118,6 +118,11 @@ SYSCTL_INT(_net_inet_tcp, OID_AUTO, keepcnt, = CTLFL > /* max idle probes */ > int tcp_maxpersistidle; >=20 > +static int tcp_rexmit_drop_options =3D 0; > +SYSCTL_INT(_net_inet_tcp, OID_AUTO, rexmit_drop_options, CTLFLAG_RW, > + &tcp_rexmit_drop_options, 0, > + "Drop TCP options from 3rd and later retransmitted SYN"); > + > static int per_cpu_timers =3D 0; > SYSCTL_INT(_net_inet_tcp, OID_AUTO, per_cpu_timers, CTLFLAG_RW, > &per_cpu_timers , 0, "run tcp timers on all cpus"); > @@ -578,7 +583,8 @@ tcp_timer_rexmt(void * xtp) > * header compression code which trashes TCP segments containing > * unknown-to-them TCP options. > */ > - if ((tp->t_state =3D=3D TCPS_SYN_SENT) && (tp->t_rxtshift =3D=3D = 3)) > + if (tcp_rexmit_drop_options && (tp->t_state =3D=3D = TCPS_SYN_SENT) && > + (tp->t_rxtshift =3D=3D 3)) > tp->t_flags &=3D ~(TF_REQ_SCALE|TF_REQ_TSTMP); > /* > * If we backed off this far, our srtt estimate is probably = bogus. >=20 > Any other suggestions on the sysctl name? The name's fine. Commit that sucker and turn it off. Best, George
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?70397C6E-202A-4FAE-AF53-6A5A1D89FAAC>