Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 30 Mar 2011 08:38:09 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        "Stefan `Sec` Zehl" <sec@42.org>
Cc:        freebsd-net@freebsd.org, Doug Barton <dougb@freebsd.org>
Subject:   Re: The tale of a TCP bug
Message-ID:  <201103300838.09608.jhb@freebsd.org>
In-Reply-To: <20110328183810.GF23803@ice.42.org>
References:  <4D8B99B4.4070404@FreeBSD.org> <201103281423.52202.jhb@freebsd.org> <20110328183810.GF23803@ice.42.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Monday, March 28, 2011 2:38:10 pm Stefan `Sec` Zehl wrote:
> Hi,
> 
> On Mon, Mar 28, 2011 at 14:23 -0400, John Baldwin wrote:
> > 
> > No, this is not really right.  Your patch from your blog is the best
> > fix actually.  The reason we want to let 'win' be larger than
> > TCP_MAXWIN is that if the remote end sends more data than we've
> > advertised but we have room in the socket buffer, we want to go ahead
> > and accept the data as valid and ACK it rather than dropping the data
> > that is beyond rcv_adv.  My change above to rcv_wnd would break this.
> > Also, for the TCPS_SYN_SENT case we don't know what 'rcv_scale' is
> > until just before we update 'rcv_adv'.  This should be the same 
> > as your patch:
> > 
> > Index: tcp_input.c
> > ===================================================================
> > --- tcp_input.c	(revision 220098)
> > +++ tcp_input.c	(working copy)
> > @@ -1756,7 +1756,8 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th,
> >  				(TF_RCVD_SCALE|TF_REQ_SCALE)) {
> >  				tp->rcv_scale = tp->request_r_scale;
> >  			}
> > -			tp->rcv_adv += tp->rcv_wnd;
> > +			tp->rcv_adv += imin(tp->rcv_wnd,
> > +			    TCP_MAXWIN << tp->rcv_scale);
> >  			tp->snd_una++;		/* SYN is acked */
> >  			/*
> >  			 * If there's data, delay ACK; if there's also a FIN
> > 
> 
> I've applied this to my test-VM, and as expected it now passes my two
> testcases. As far as I'm concerned this fixes it for me.
> 
> I'm interested to see if my adv_neg counting hack together with this
> patch still registers any hits. -- If nobody beats me to it, I'll try it
> out on my webserver tomorrow.

There is at least one case I know of related to a bug I reported earlier
where a window probe from a remote connection can cause rcv_nxt to advance
past rcv_adv by one.  However, I think we want to know about those cases,
and we should probably be treating rcv_adv - rcv_nxt as if it is zero in 
that case, not -1 (my patch in my original e-mail does just that in a
different place in tcp_output() when we calculate the window "for real").

-- 
John Baldwin



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201103300838.09608.jhb>