From owner-freebsd-current Tue Dec 21 12:50:54 1999
Delivered-To: freebsd-current@freebsd.org
Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2])
    by hub.freebsd.org (Postfix) with ESMTP id B486D15146;
    Tue, 21 Dec 1999 12:50:50 -0800 (PST)
    (envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
    by apollo.backplane.com (8.9.3/8.9.1) id MAA84567;
    Tue, 21 Dec 1999 12:50:50 -0800 (PST)
    (envelope-from dillon)
Date: Tue, 21 Dec 1999 12:50:50 -0800 (PST)
From: Matthew Dillon
Message-Id: <199912212050.MAA84567@apollo.backplane.com>
To: John Baldwin
Cc: FreeBSD Hackers
Subject: Odd TCP glitches in new currents
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

I think this may be due to the timing changes.  While typing over
a TCP connection, running a remote X client (such as netscape), and
so forth I sometimes see momentary 1/10 second hangs, even on a clean,
empty network.

For a while I thought it was packet loss, but then I realized that it
wasn't (plus I replaced my HUB with a switch).  ping always succeeds
wonderfully, even -f, even with various packet sizes, yet TCP
connections get momentary hangs.

On machine A

    527 data packets (528796 bytes) retransmitted
    527 data packets (528796 bytes) retransmitted
    527 data packets (528796 bytes) retransmitted
    527 data packets (528796 bytes) retransmitted
    528 data packets (528872 bytes) retransmitted
    528 data packets (528872 bytes) retransmitted
    528 data packets (528872 bytes) retransmitted

On machine B

    658 completely duplicate packets (197326 bytes)
    659 completely duplicate packets (197402 bytes)
    659 completely duplicate packets (197402 bytes)

I think the problem is that the TCP connection is not waiting long
enough for the returned ack.

lander:/home/dillon# route -n get apollo
   route to: 216.240.41.2
destination: 216.240.41.2
  interface: rl0
      flags:
 recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
       0         0      8015        24        22         0      1500      1185

I have NOT tested this fix yet, so I don't know if it works, but I
believe the problem is that on high speed networks the millisecond
round trip delay is short enough that you can get 1-tick timeouts.
24 msec translates to how many ticks?  Essentially just 1.  What
happens when you specify a 1-tick timeout and the tick interrupt
occurs a microsecond later?  For that matter what happens when you
want 1.5 ticks worth of timeout?  Do you get only 1?

What happens is that the TCP stack thinks it timed out when it only
just sent the packet a few microseconds ago.

Here is a hack that should fix the problem.  The question is whether
to do it here or whether to just add a tick gratuitously in
callout_reset().

                                        -Matt

Index: tcp_input.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/tcp_input.c,v
retrieving revision 1.99
diff -u -r1.99 tcp_input.c
--- tcp_input.c 1999/12/14 15:43:56 1.99
+++ tcp_input.c 1999/12/21 20:47:12
@@ -651,7 +651,7 @@
                 callout_stop(tp->tt_rexmt);
             else if (!callout_active(tp->tt_persist))
                 callout_reset(tp->tt_rexmt,
-                          tp->t_rxtcur,
+                          tp->t_rxtcur + 1,
                           tcp_timer_rexmt, tp);
 
             sowwakeup(so);
@@ -1541,7 +1541,7 @@
             callout_stop(tp->tt_rexmt);
             needoutput = 1;
         } else if (!callout_active(tp->tt_persist))
-            callout_reset(tp->tt_rexmt, tp->t_rxtcur,
+            callout_reset(tp->tt_rexmt, tp->t_rxtcur + 1,
                       tcp_timer_rexmt, tp);
 
         /*
Index: tcp_output.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/tcp_output.c,v
retrieving revision 1.36
diff -u -r1.36 tcp_output.c
--- tcp_output.c 1999/08/30 21:17:06 1.36
+++ tcp_output.c 1999/12/21 20:47:09
@@ -668,7 +668,7 @@
          */
         if (!callout_active(tp->tt_rexmt) &&
             tp->snd_nxt != tp->snd_una) {
-            callout_reset(tp->tt_rexmt, tp->t_rxtcur,
+            callout_reset(tp->tt_rexmt, tp->t_rxtcur + 1,
                       tcp_timer_rexmt, tp);
             if (callout_active(tp->tt_persist)) {
                 callout_stop(tp->tt_persist);
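
To make the 1-tick question concrete, here is a rough user-space model of a tick-driven timer.  It is only a sketch, not kernel code: the function names are made up for illustration, and it assumes a callout armed for N ticks fires on the Nth clock interrupt after it was armed, with a 10000 microsecond tick (hz = 100) used as an example value.

```c
#include <assert.h>

/*
 * Hypothetical model, not the kernel's callout code.
 *
 * The clock interrupt fires every tick_us microseconds, at times that
 * are exact multiples of tick_us.  A timer armed at armed_at_us for
 * `ticks` ticks is assumed to expire on the ticks-th interrupt that
 * occurs strictly after it was armed.
 *
 * Returns the real time, in microseconds, between arming and expiry.
 */
static long
elapsed_until_expiry_us(long armed_at_us, long tick_us, int ticks)
{
    long first_tick_us, expiry_us;

    assert(armed_at_us >= 0 && tick_us > 0 && ticks >= 1);

    /* First clock interrupt strictly after the timer was armed. */
    first_tick_us = (armed_at_us / tick_us + 1) * tick_us;

    /* The timer expires (ticks - 1) interrupts after that one. */
    expiry_us = first_tick_us + (long)(ticks - 1) * tick_us;

    return (expiry_us - armed_at_us);
}

/*
 * Truncating milliseconds-to-ticks conversion, to show how a request
 * for 1.5 ticks worth of timeout ends up as a single tick.
 */
static int
ms_to_ticks_trunc(int ms, int hz)
{
    assert(ms >= 0 && hz > 0);
    return ((int)((long)ms * hz / 1000));
}
```

Under this model, a 1-tick timer armed 1 microsecond before the interrupt expires after 1 microsecond of real time, while the same timer with one extra tick waits at least a full tick period.  That is the reasoning behind the "+ 1"; whether it belongs at the TCP call sites or inside callout_reset() is the open question above.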