From owner-freebsd-current  Tue Dec 21 12:50:54 1999
Delivered-To: freebsd-current@freebsd.org
Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2])
	by hub.freebsd.org (Postfix) with ESMTP
	id B486D15146; Tue, 21 Dec 1999 12:50:50 -0800 (PST)
	(envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.9.3/8.9.1) id MAA84567;
	Tue, 21 Dec 1999 12:50:50 -0800 (PST)
	(envelope-from dillon)
Date: Tue, 21 Dec 1999 12:50:50 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <199912212050.MAA84567@apollo.backplane.com>
To: John Baldwin <jhb@FreeBSD.ORG>
Cc: FreeBSD Hackers <freebsd-current@FreeBSD.ORG>
Subject: Odd TCP glitches in new currents
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

    I think this may be due to the timing changes.  While typing over a
    TCP connection, running a remote X client (such as netscape), and so
    forth I sometimes see momentary 1/10 second hangs, even on a clean,
    empty network.

    For a while I thought it was packet loss, but then I realized that it
    wasn't (plus I replaced my HUB with a switch).  ping always succeeds
    wonderfully, even -f, even with various packet sizes, yet TCP connections
    get momentary hangs.

On machine A
                527 data packets (528796 bytes) retransmitted
                527 data packets (528796 bytes) retransmitted
                527 data packets (528796 bytes) retransmitted
                527 data packets (528796 bytes) retransmitted
                528 data packets (528872 bytes) retransmitted
                528 data packets (528872 bytes) retransmitted
                528 data packets (528872 bytes) retransmitted

On machine B
                658 completely duplicate packets (197326 bytes)
                659 completely duplicate packets (197402 bytes)
                659 completely duplicate packets (197402 bytes)

    I think the problem is that the TCP connection is not waiting long
    enough for the returned ack. 

lander:/home/dillon# route -n get apollo
   route to: 216.240.41.2
destination: 216.240.41.2
  interface: rl0
      flags: <UP,HOST,DONE,LLINFO,WASCLONED>
 recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
       0         0      8015        24        22         0      1500      1185 

    I have NOT tested this fix yet, so I don't know if it works, but I
    believe the problem is that on high speed networks the millisecond round
    trip delay is short enough that you can get 1-tick timeouts.

    24 msec translates to how many ticks?  Essentially just 1.

    What happens when you specify a 1-tick timeout and the tick interrupt
    occurs a microsecond later?  For that matter what happens when you want
    1.5 ticks worth of timeout?  Do you get only 1?  What happens is that
    the TCP stack thinks it timed out when it only just sent the packet a
    few microseconds ago.

    Here is a hack that should fix the problem.  The question is whether
    to do it here or whether to just add a tick gratuitously in
    callout_reset().

						-Matt

Index: tcp_input.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/tcp_input.c,v
retrieving revision 1.99
diff -u -r1.99 tcp_input.c
--- tcp_input.c	1999/12/14 15:43:56	1.99
+++ tcp_input.c	1999/12/21 20:47:12
@@ -651,7 +651,7 @@
 					callout_stop(tp->tt_rexmt);
 				else if (!callout_active(tp->tt_persist))
 					callout_reset(tp->tt_rexmt, 
-						      tp->t_rxtcur,
+						      tp->t_rxtcur + 1,
 						      tcp_timer_rexmt, tp);
 
 				sowwakeup(so);
@@ -1541,7 +1541,7 @@
 			callout_stop(tp->tt_rexmt);
 			needoutput = 1;
 		} else if (!callout_active(tp->tt_persist))
-			callout_reset(tp->tt_rexmt, tp->t_rxtcur,
+			callout_reset(tp->tt_rexmt, tp->t_rxtcur + 1,
 				      tcp_timer_rexmt, tp);
 
 		/*
Index: tcp_output.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/tcp_output.c,v
retrieving revision 1.36
diff -u -r1.36 tcp_output.c
--- tcp_output.c	1999/08/30 21:17:06	1.36
+++ tcp_output.c	1999/12/21 20:47:09
@@ -668,7 +668,7 @@
 		 */
 		if (!callout_active(tp->tt_rexmt) &&
 		    tp->snd_nxt != tp->snd_una) {
-			callout_reset(tp->tt_rexmt, tp->t_rxtcur,
+			callout_reset(tp->tt_rexmt, tp->t_rxtcur + 1,
 				      tcp_timer_rexmt, tp);
 			if (callout_active(tp->tt_persist)) {
 				callout_stop(tp->tt_persist);

