From owner-freebsd-bugs Fri Mar 8 15:16:31 1996
Date: Fri, 8 Mar 1996 14:14:11 -0800
From: Matthew Dillon
Message-Id: <199603082214.OAA00252@apollo.backplane.com>
To: bugs@FreeBSD.ORG
Cc: "Garrett A. Wollman"
Subject: srtt calculation bug .. found
Sender: owner-bugs@FreeBSD.ORG

    ok, I've found the bug with t_srtt ... the problem is mainly due to
    the lack of resolution in SLOWHZ (which is only 500 ms).  That,
    coupled with the following:

        delta = rtt - 1 - (tp->t_srtt >> TCP_RTT_SHIFT);
        if ((tp->t_srtt += delta) <= 0)
            tp->t_srtt = 1;

    The above equation breaks down statistically for RTTs < 500 ms.  You
    would think that if the rtt variable is 1 half the time and 2 half
    the time, srtt should wind up at around 4 (200 ms).  Unfortunately,
    that does not occur, because this line:

        delta = rtt - 1 - (tp->t_srtt >> TCP_RTT_SHIFT);

    fails to generate a negative number in the case where 'rtt' is 1
    (i.e. delta time is 0 at a 500 ms resolution).  However, it DOES
    generate a positive number if rtt is 2.  The
    (tp->t_srtt >> TCP_RTT_SHIFT) term also contributes nothing negative
    until t_srtt reaches 8, but by that time it's too late: you
    effectively get a random number that depends on when you sample
    rather than on the rtt.  Even if you round tp->t_srtt up by 1/2 (a
    value of 4), it still breaks down statistically.

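The breakdown can be reproduced outside the kernel. The following is only a minimal user-space sketch, not kernel code: the helper names `srtt_update_orig` and `simulate_orig` are made up for illustration, and `TCP_RTT_SHIFT` is assumed to be 3, which matches the threshold of 8 mentioned above. It feeds the unmodified update a fixed pattern of rtt samples, in slow-timer ticks counted from 1.

```c
#include <assert.h>

#define TCP_RTT_SHIFT 3  /* assumption: srtt is scaled by 8 */

/* One smoothing step using the unmodified update.
 * srtt is the scaled estimate; rtt is in ticks, counted from 1. */
int srtt_update_orig(int srtt, int rtt)
{
    int delta;

    assert(rtt >= 1);
    delta = rtt - 1 - (srtt >> TCP_RTT_SHIFT);
    if ((srtt += delta) <= 0)
        srtt = 1;
    return srtt;
}

/* Feed n samples: every period-th sample is `rare`,
 * all other samples are `common`.  Returns the final srtt. */
int simulate_orig(int srtt, int n, int period, int common, int rare)
{
    int i;

    for (i = 0; i < n; i++) {
        int rtt = (i % period == period - 1) ? rare : common;
        srtt = srtt_update_orig(srtt, rtt);
    }
    return srtt;
}
```

Starting from an srtt of 4, `simulate_orig(4, 100, 10, 1, 2)` (an rtt of 2 only once in ten samples) returns 8, and `simulate_orig(4, 100, 10, 2, 1)` (an rtt of 2 nine times in ten) returns 7. In this toy model the estimate ends up near 8 regardless of how the samples are mixed.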
    The proper solution is to balance the weighting of an rtt of 1 and
    an rtt of 2 (i.e. a delta time of 0 and a delta time of 1): give the
    delta time of 0 a value of -1 to match the +1 of a delta time of 1,
    as follows:

        delta = rtt - 1 - (tp->t_srtt >> TCP_RTT_SHIFT);
        if (delta == 0)                 /* ADD ME */
            delta = -1;                 /* ADD ME */
        if ((tp->t_srtt += delta) <= 0)
            tp->t_srtt = 1;

    Now you have a reasonably balanced weighting, and the statistical
    calculation no longer breaks down for round trip times < 500 ms.

    There is one other problem, and that is where you update the route
    table rtt when t_rttupdated >= 16.  16 may not be a high enough
    number to handle round trip times < 10 ms with a 500 ms timer
    resolution.  The solution is either to increase the timer resolution
    or to increase rttupdated... though obviously there are practical
    limits to how large rttupdated can be and still work.

    I have tested this on a SLIP link: the route table entries come up
    with the correct round trip time with the fix in, and with a
    completely incorrect round trip time without the fix.

                                                -Matt

    Matthew Dillon   Engineering, BEST Internet Communications, Inc.
    [always include a portion of the original email in any response!]
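
As a rough illustration of the balanced weighting, here is a minimal user-space sketch of the update with the two added lines. It is not kernel code: the helper names `srtt_update_fixed` and `simulate_fixed` are hypothetical, and `TCP_RTT_SHIFT` is assumed to be 3. rtt samples are in slow-timer ticks, counted from 1.

```c
#include <assert.h>

#define TCP_RTT_SHIFT 3  /* assumption: srtt is scaled by 8 */

/* One smoothing step with the proposed fix: a delta of 0 is
 * treated as -1, so it balances the +1 of the next-larger rtt. */
int srtt_update_fixed(int srtt, int rtt)
{
    int delta;

    assert(rtt >= 1);
    delta = rtt - 1 - (srtt >> TCP_RTT_SHIFT);
    if (delta == 0)
        delta = -1;
    if ((srtt += delta) <= 0)
        srtt = 1;
    return srtt;
}

/* Feed n samples: every period-th sample is `rare`,
 * all other samples are `common`.  Returns the final srtt. */
int simulate_fixed(int srtt, int n, int period, int common, int rare)
{
    int i;

    for (i = 0; i < n; i++) {
        int rtt = (i % period == period - 1) ? rare : common;
        srtt = srtt_update_fixed(srtt, rtt);
    }
    return srtt;
}
```

Starting from an srtt of 4, `simulate_fixed(4, 100, 10, 1, 2)` (an rtt of 2 once in ten samples) returns 2, while `simulate_fixed(4, 100, 10, 2, 1)` (an rtt of 2 nine times in ten) returns 6. In this toy model the estimate now moves with the mix of samples instead of settling near 8 in both cases.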