From owner-svn-src-head@FreeBSD.ORG  Fri Jun 12 03:13:03 2009
Return-Path: <owner-svn-src-head@FreeBSD.ORG>
Delivered-To: svn-src-head@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id EC6CB106566C;
	Fri, 12 Jun 2009 03:13:02 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail11.syd.optusnet.com.au (mail11.syd.optusnet.com.au
	[211.29.132.192])
	by mx1.freebsd.org (Postfix) with ESMTP id 71FC38FC13;
	Fri, 12 Jun 2009 03:13:02 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from c122-106-159-184.carlnfd1.nsw.optusnet.com.au
	(c122-106-159-184.carlnfd1.nsw.optusnet.com.au [122.106.159.184])
	by mail11.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	n5C3CvYB026504
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Fri, 12 Jun 2009 13:12:58 +1000
Date: Fri, 12 Jun 2009 13:12:58 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@delplex.bde.org
To: John Baldwin <jhb@FreeBSD.org>
In-Reply-To: <200906111018.20703.jhb@freebsd.org>
Message-ID: <20090612125702.M22046@delplex.bde.org>
References: <200906101827.n5AIRFoR022115@svn.freebsd.org>
	<20090611124334.A21109@delplex.bde.org>
	<200906111018.20703.jhb@freebsd.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org,
	src-committers@FreeBSD.org, Bruce Evans <brde@optusnet.com.au>
Subject: Re: svn commit: r193941 - head/sys/netinet
X-BeenThere: svn-src-head@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SVN commit messages for the src tree for head/-current
	<svn-src-head.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/svn-src-head>,
	<mailto:svn-src-head-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/svn-src-head>
List-Post: <mailto:svn-src-head@freebsd.org>
List-Help: <mailto:svn-src-head-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/svn-src-head>,
	<mailto:svn-src-head-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 12 Jun 2009 03:13:03 -0000

On Thu, 11 Jun 2009, John Baldwin wrote:

> On Wednesday 10 June 2009 11:53:16 pm Bruce Evans wrote:
>> On Wed, 10 Jun 2009, John Baldwin wrote:
>>
>>> Log:
>>>  Change a few members of tcpcb that store cached copies of ticks to be ints
>>>  instead of unsigned longs.  This fixes a few overflow edge cases on 64-bit
>>>  platforms.  Specifically, if an idle connection receives a packet shortly
>>
>> I think the variables should still be unsigned (ints now).  Otherwise there
>> is at best benign undefined behaviour when the variables overflow at
>> INT_MAX, while the code is apparently designed to work with unsigned
>> values (it casts to int to look at differences between the unsigned values).
>
> I wanted to match 'ticks' and figured changing 'ticks' was a far larger
> headache.

By changing the signedness you get undefined behaviour for the other types
too, and risk new and/or different sign extension bugs.

>>> Modified: head/sys/netinet/tcp_input.c
>>> ==============================================================================
>>> --- head/sys/netinet/tcp_input.c	Wed Jun 10 18:26:02 2009	(r193940)
>>> +++ head/sys/netinet/tcp_input.c	Wed Jun 10 18:27:15 2009	(r193941)
>>> @@ -1778,7 +1778,7 @@ tcp_do_segment(struct mbuf *m, struct tc
>>> 	    TSTMP_LT(to.to_tsval, tp->ts_recent)) {
>>>
>>> 		/* Check to see if ts_recent is over 24 days old.  */
>>> -		if ((int)(ticks - tp->ts_recent_age) > TCP_PAWS_IDLE) {
>>> +		if ((ticks - tp->ts_recent_age) > TCP_PAWS_IDLE) {
>>
>> The variables are now ints, and there is now also overflow in the
>> subtraction.  E.g., INT_MAX - INT_MIN now overflows.  On 2's complement
>> machines it normally overflows to -2, which is probably the wrong value
>> (the large unsigned value 0xFFF...FEU is probably correct), but it is
>> the same value as is given by casting the difference of unsigned values
>> to int ((int)0xFFF...FEU = -2).  This is because, when representing times
>> mod UINT_MAX+1, all time differences up to UINT_MAX can be represented,
>> but only if we keep track of which operand is older (here we probably do
>> know that ts_recent_age is older); if we don't keep track then we normally
>> assume that (unsigned) differences smaller than INT_MAX mean that the
>> difference is nonnegative while differences larger than INT_MAX mean
>> that the difference is negative.  The signed interpreatation works well
>> if we know that the nonnegative differences never exceed INT_MAX.
>
> I believe in these cases that 1) the differences should never exceed INT_MAX,
> and 2) we do know which value should be older.

Except for limit variables like t_badrxtwin that start larger than `ticks'
and become smaller when the limit is reached.  Presumably the limit is only
reached in unusual cases, but we have to test to see if it is.  We must
test more often that every INT_MAX ticks to see the limit being reached, but
that is not a problem.

>>> Modified: head/sys/netinet/tcp_var.h
>>> ==============================================================================
>>> --- head/sys/netinet/tcp_var.h	Wed Jun 10 18:26:02 2009	(r193940)
>>> +++ head/sys/netinet/tcp_var.h	Wed Jun 10 18:27:15 2009	(r193941)
>>> @@ -139,8 +139,8 @@ struct tcpcb {
>>>
>>> 	u_int	t_maxopd;		/* mss plus options */
>>>
>>> -	u_long	t_rcvtime;		/* inactivity time */
>>> -	u_long	t_starttime;		/* time connection was established */
>>> +	int	t_rcvtime;		/* inactivity time */
>>> +	int	t_starttime;		/* time connection was established */
>>> 	int	t_rtttime;		/* round trip time */
>>> 	tcp_seq	t_rtseq;		/* sequence number being timed */
>>>
>>> @@ -167,7 +167,7 @@ struct tcpcb {
>>> 	u_char	rcv_scale;		/* window scaling for recv window */
>>> 	u_char	request_r_scale;	/* pending window scaling */
>>> 	u_int32_t  ts_recent;		/* timestamp echo data */
>>> -	u_long	ts_recent_age;		/* when last updated */
>>> +	int	ts_recent_age;		/* when last updated */
>>> 	u_int32_t  ts_offset;		/* our timestamp offset */
>>>
>>> 	tcp_seq	last_ack_sent;
>>> @@ -175,7 +175,7 @@ struct tcpcb {
>>> 	u_long	snd_cwnd_prev;		/* cwnd prior to retransmit */
>>> 	u_long	snd_ssthresh_prev;	/* ssthresh prior to retransmit */
>>> 	tcp_seq	snd_recover_prev;	/* snd_recover prior to retransmit */
>>> -	u_long	t_badrxtwin;		/* window for retransmit recovery */
>>> +	int	t_badrxtwin;		/* window for retransmit recovery */
>>> 	u_char	snd_limited;		/* segments limited transmitted */
>>> /* SACK related state */
>>> 	int	snd_numholes;		/* number of holes seen by sender */
>>>
>>
>> Should all be changed to u_int?
>>
>> There still seem to be some very nice sign extension/overflow bugs.
>> E.g., `ticks' is (bogusly) signed.  After changing the above variables
>> to int to be bug for bug compatible with `ticks', the semantics of
>> expressions not directly touched in this commit is changed too.  E.g.,
>> there is the expression `ticks < tp->t_badrxtwin'.  Changing the signedness
>> of the type of tp_t_badrxtwin stops promotion of `ticks' to unsigned in
>> this expression.  However, this expression seems to have been just broken
>> before, and the change only moves the bug slightly (from times near where
>> `ticks' wraps around at to to times near where `ticks' overflows at
>> INT_MAX).  To handle wraparound and avoid overflow, such expressions should
>> be written as (int)(ticks - tp->t_badrxtwin) < 0 where the variables have
>> unsigned type.  This seems to have already been done for all the other
>> variables changed in this commit, except the cast to int is missing in
>> some cases.
>
> Yes, I noticed the t_badrxtwin breakage but wasn't sure how best to fix it
> (or at least thought that it should be a separate followup since it was
> already broken with wraparound).  I considered making it work more like
> the other variables such as t_rcvtime that merely store a cached value of
> 'ticks' and then use 'ticks - foo' and compare it with the given interval.
> This would entail renaming 't_badrxtwin' to 't_badrxttime' or some such.
> That seems clearer to me than '(int)(ticks - t_badrxtwin) < 0'.

`ticks < t_limitvar' is more optimal than `ticks - t_startvar < max', but
harder to get right.  Compilers even try to transform the latter to the
former (e.g., for loop counters), but they cannot do this if overflow
or wraparound is a possibility (except overflow gives undefined behaviour
so compilers can do anything).  After fixing the former to
`(int)(ticks - t_limitvar) < 0' it has the same number of operations as
the latter and would only be more efficient if 0 is easier to load and/or
compare with than `max'.  But casting to (int) seems clear to me -- it is
idiomatic for recovering signed differences from circular counters.

Bruce