Date:      Fri, 9 Apr 1999 03:18:23 +0200 (CEST)
From:      Martin Kammerhofer <dada@balu.kfunigraz.ac.at>
To:        Julian Elischer <julian@whistle.com>
Cc:        freebsd-net@FreeBSD.ORG
Subject:   Re: Coping with 1000s of W95 clients.
Message-ID:  <Pine.BSF.3.96.990409015507.558A-100000@localhost.kfunigraz.ac.at>
In-Reply-To: <Pine.BSF.3.95.990406183258.1119A-100000@current1.whistle.com>

On Tue, 6 Apr 1999, Julian Elischer wrote:

> 
> The 'canonical example' is Win95 machines that 
> don't "shutdown()" the TCP session before exiting.
> In the following situation you are left with an entry on your server
> sitting in FIN_WAIT_2 state.
> 
Even FreeBSD boxes are causing this :(. I see this every day on my home
box, where a local apache and a netscape browser are running. HTTP 1.1
introduced keepalive connections, where the server keeps the connection
open for some 15 sec after servicing the request(s). The idea was to cut
down on connection setup overhead: if a client has to open a new TCP
connection for each of the dozens of inlined GIFs in a page (as was the
case with HTTP 1.0), performance suffers.
Now when the server closes the connection after no further request has
come in for 15 sec, a FIN is sent and acknowledged by the browser's OS.
After that the server's TCP is in FIN_WAIT_2 state and the browser's is
in CLOSE_WAIT. If the browser periodically checked its sockets, read zero
length from them, noticed that the server had closed and closed the
socket too, all would be fine.
Unfortunately browsers like Netscape (at least up to 4.08) just sit idle
until the user accesses her next webpage - maybe idling for days!
The problem is well known; those who have installed apache from the
FreeBSD ports collection can read about it in
file:/usr/local/share/doc/apache/manual/misc/fin_wait_2.html .
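To illustrate the periodic check described above, here is a minimal
sketch of what a client could do for each idle socket. The helper name
reap_if_peer_closed() is made up for this example and is not taken from
any browser or from the patch below; it assumes a connected stream socket.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only).
 * Returns  1 if the peer had closed and fd was closed in response,
 *          0 if the connection is still open (idle or data pending),
 *         -1 on error.
 */
int
reap_if_peer_closed(int fd)
{
	struct pollfd pfd;
	char byte;
	ssize_t n;
	int ready;

	assert(fd >= 0);
	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;

	ready = poll(&pfd, 1, 0);		/* zero timeout: never block */
	if (ready < 0)
		return (-1);
	if (ready == 0)
		return (0);			/* nothing pending, still idle */

	/* Peek so that real data, if any, is left for the normal reader. */
	n = recv(fd, &byte, 1, MSG_PEEK);
	if (n == 0) {
		/* Zero-length read: the peer sent its FIN. Close our side too. */
		close(fd);
		return (1);
	}
	if (n < 0 && errno != EINTR && errno != EAGAIN && errno != EWOULDBLOCK)
		return (-1);
	return (0);
}
```

Called from a timer every few seconds, something like this would move
the client's socket out of CLOSE_WAIT, and with it the server's socket
out of FIN_WAIT_2.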

> 
> The BSD4.3 hack is to have a (11 minute, 15 second) timeout on FIN_WAIT_2
> state **IF THE LOCAL END HAS DONE A FULL CLOSE**. A notable example of 

If an application really does shut down _only_ the socket's output
side with shutdown(socket,how=1) then it wants to keep the socket
open for further reads! Timing out a half-closed connection is plain
wrong.
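For readers who have not used a half close, here is a small sketch of
the pattern. The function name half_close_and_drain() is hypothetical;
SHUT_WR is the symbolic name for how=1.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): shut down the sending side,
 * then keep reading until the peer closes or the buffer is full.
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t
half_close_and_drain(int fd, char *buf, size_t cap)
{
	size_t total = 0;
	ssize_t n;

	assert(buf != NULL);

	/* Sends our FIN; the receive side of the socket stays usable. */
	if (shutdown(fd, SHUT_WR) < 0)
		return (-1);

	while (total < cap) {
		n = read(fd, buf + total, cap - total);
		if (n == 0)
			break;			/* peer closed its side too */
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		total += (size_t)n;
	}
	return ((ssize_t)total);
}
```

A socket used this way sits in FIN_WAIT_2 on purpose while the peer is
still sending, which is why the timeout must only apply after a full
close.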

The timeout is actually tcp_maxidle = tcp_keepcnt * tcp_keepintvl.
Keepcnt is 8 (hard coded) and keepintvl is settable by sysctl
(net.inet.tcp.keepintvl). Eight times the default keepintvl of 150
ticks is 1200 ticks, or 10 minutes. (Those TCP timers run at 2 Hz.)
Because
 - the idle time counter is incremented _after_ the timer is run,
 - and the condition for waiting another 75sec (=keepintvl) is
   ``tp->t_idle <= tcp_maxidle'' instead of ``tp->t_idle < tcp_maxidle''
another keepintvl is added, so it's 9 * 75 = 675sec total.
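The extra interval is easier to see in a toy model of the slow timer.
The sketch below is not kernel code: fin_wait_2_ticks() and its
parameters are invented for illustration, and it assumes t_idle is 0 at
the moment the socket enters FIN_WAIT_2 with the 2MSL timer freshly
armed.

```c
#include <assert.h>

#define PR_SLOWHZ	2	/* slow timer ticks per second */

/*
 * Toy model of the TCPT_2MSL handling for one socket in FIN_WAIT_2.
 *   initial    - value the 2MSL timer is armed with, in ticks
 *   limit      - idle limit that t_idle is compared against, in ticks
 *   keepintvl  - re-arm value when the socket may keep waiting
 *   idle_first - nonzero: bump t_idle before running the timers
 *   strict     - nonzero: compare with '<' instead of '<='
 * Returns the tick at which the connection would be closed.
 */
int
fin_wait_2_ticks(int initial, int limit, int keepintvl,
    int idle_first, int strict)
{
	int timer = initial;
	int t_idle = 0;		/* assumption: reset on entering FIN_WAIT_2 */
	int tick = 0;

	assert(initial > 0 && keepintvl > 0);
	for (;;) {
		tick++;
		if (idle_first)
			t_idle++;
		if (--timer == 0) {
			int keep_waiting = strict ?
			    (t_idle < limit) : (t_idle <= limit);
			if (!keep_waiting)
				return (tick);
			timer = keepintvl;	/* wait one more interval */
		}
		if (!idle_first)
			t_idle++;
	}
}
```

With the defaults, fin_wait_2_ticks(1200, 1200, 150, 0, 0) closes at
tick 1350, which is 675 sec at PR_SLOWHZ ticks per second. When the
timer first fires at tick 1200, t_idle is still 1199, so the ``<=''
test passes and one more keepintvl is added. Setting both flags makes
the same call return 1200.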

>
> The only way to stop this is to break the standard, as this would be

FreeBSD's 675sec timeout on FIN_WAIT_2 is already breaking RFC 793.

> Basically, any session that is still in FIN_WAIT_2 after 30 seconds
> reverts to FIN_WAIT_1, and resends the FIN. I believe that
> this is similar to a fix Paul Vixie mentioned implementing in NetBSD
> once.
> 
I don't think this is a good solution. Retransmitting the FIN certainly
doesn't break the spec, but it won't help much. If there were dead
browsers ``on the other side'' of all those annoying FIN_WAIT_2 sockets,
then retransmitting and getting ACKs or RSTs would certainly help. But in
most cases there is a browser just waiting for user actions! Resending
the FIN would accomplish nothing in this case. The browser's TCP stack
would reacknowledge the FIN and continue hanging around in CLOSE_WAIT.
I guess the percentage of cases where you get a RST or ICMP is quite low
and not worth the increase in net traffic.

The easy way to shorten FIN_WAIT_2 is:

  sysctl -w net.inet.tcp.keepintvl=27

  This would give you a FIN_WAIT_2 timeout of 9 * 27 / 2 = 121.5 sec.
  This should cut down the number of FW2 sockets by a factor of roughly
  675/121 = 5.6 . The only drawback to this solution is that keepalives
  won't work reliably any more. (I'm referring to the transport layer
  keepalives here, _not_ the HTTP 1.1 application layer keepalives.)
  After 2 hours idle time there would be only a 2 min time window to
  respond before a keepalive connection is dropped.
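  The tradeoff can be put into numbers with two small helpers. Both
  function names are made up for this example; the constants are the
  ones quoted above (keepcnt of 8, timers at 2 Hz), and keepintvl is
  given in ticks as the sysctl expects it.

```c
#include <assert.h>

#define PR_SLOWHZ	2	/* slow timer ticks per second */
#define TCPTV_KEEPCNT	8	/* hard coded probe count */

/* Seconds a peer has to answer once keepalive probing has started. */
double
keepalive_window_sec(int keepintvl_ticks)
{
	assert(keepintvl_ticks > 0);
	return ((double)(TCPTV_KEEPCNT * keepintvl_ticks) / PR_SLOWHZ);
}

/* Seconds an idle FIN_WAIT_2 socket lives: one extra interval on top. */
double
fin_wait_2_timeout_sec(int keepintvl_ticks)
{
	assert(keepintvl_ticks > 0);
	return ((double)((TCPTV_KEEPCNT + 1) * keepintvl_ticks) / PR_SLOWHZ);
}
```

  For keepintvl=27 these give 108 sec and 121.5 sec; for the default of
  150 they give 600 sec and 675 sec. Both values move together, which is
  the drawback of reusing keepintvl for this purpose.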

This leads me to solution 2:

  Just introduce a configurable timeout for idle finwait2 sockets.
  This is a quite small change and less intrusive than your suggestion.
  Those who have so many hits that finwait2 sockets pile up could
  just lower the finwait2 timeout.

       Martin

Index: netinet/tcp_input.c
===================================================================
RCS file: /home/dada/cvsroot/src/netinet/tcp_input.c,v
retrieving revision 1.3
diff -u -u -r1.3 tcp_input.c
--- tcp_input.c	1999/04/06 19:28:25	1.3
+++ tcp_input.c	1999/04/08 22:17:58
@@ -1496,7 +1496,7 @@
 				 */
 				if (so->so_state & SS_CANTRCVMORE) {
 					soisdisconnected(so);
-					tp->t_timer[TCPT_2MSL] = tcp_maxidle;
+					tp->t_timer[TCPT_2MSL] = tcp_finwait2idle;
 				}
 				tp->t_state = TCPS_FIN_WAIT_2;
 			}
Index: netinet/tcp_timer.c
===================================================================
RCS file: /home/dada/cvsroot/src/netinet/tcp_timer.c,v
retrieving revision 1.4
diff -u -u -r1.4 tcp_timer.c
--- tcp_timer.c	1999/04/08 12:15:01	1.4
+++ tcp_timer.c	1999/04/08 23:34:49
@@ -85,6 +85,10 @@
 SYSCTL_INT(_net_inet_tcp, TCPCTL_KEEPINTVL, keepintvl,
 	CTLFLAG_RW, &tcp_keepintvl , 0, "");
 
+int	tcp_finwait2idle = TCPTV_FINWAIT2IDLE;
+SYSCTL_INT(_net_inet_tcp, TCPCTL_FINWAIT2IDLE, finwait2idle,
+	CTLFLAG_RW, &tcp_finwait2idle , 0, "");
+
 static int	always_keepalive = 0;
 SYSCTL_INT(_net_inet_tcp, OID_AUTO, always_keepalive,
 	CTLFLAG_RW, &always_keepalive , 0, "");
@@ -162,6 +166,10 @@
 		tp = intotcpcb(ip);
 		if (tp == 0 || tp->t_state == TCPS_LISTEN)
 			continue;
+		tp->t_idle++;
+		tp->t_duration++;
+		if (tp->t_rtt)
+			tp->t_rtt++;
 		for (i = 0; i < TCPT_NTIMERS; i++) {
 			if (tp->t_timer[i] && --tp->t_timer[i] == 0) {
 #ifdef TCPDEBUG
@@ -180,10 +188,6 @@
 #endif
 			}
 		}
-		tp->t_idle++;
-		tp->t_duration++;
-		if (tp->t_rtt)
-			tp->t_rtt++;
 tpgone:
 		;
 	}
@@ -235,10 +239,13 @@
 	 */
 	case TCPT_2MSL:
 		if (tp->t_state != TCPS_TIME_WAIT &&
-		    tp->t_idle <= tcp_maxidle)
+		    tp->t_idle < tcp_finwait2idle)
 			tp->t_timer[TCPT_2MSL] = tcp_keepintvl;
-		else
+		else {
+			if (tp->t_state == TCPS_FIN_WAIT_2)
+			    tcpstat.tcps_finwait2drops++;
 			tp = tcp_close(tp);
+		}
 		break;
 
 	/*
Index: netinet/tcp_timer.h
===================================================================
RCS file: /home/dada/cvsroot/src/netinet/tcp_timer.h,v
retrieving revision 1.1
diff -u -u -r1.1 tcp_timer.h
--- tcp_timer.h	1999/04/02 01:15:25	1.1
+++ tcp_timer.h	1999/04/08 22:13:15
@@ -101,6 +101,8 @@
 #define	TCPTV_KEEPINTVL	( 75*PR_SLOWHZ)		/* default probe interval */
 #define	TCPTV_KEEPCNT	8			/* max probes before drop */
 
+#define	TCPTV_FINWAIT2IDLE ( 120*PR_SLOWHZ)	/* max idle time in FINWAIT2 */
+
 #define	TCPTV_MIN	(  1*PR_SLOWHZ)		/* minimum allowable value */
 #define	TCPTV_REXMTMAX	( 64*PR_SLOWHZ)		/* max allowable REXMT value */
 
@@ -129,6 +131,8 @@
 #ifdef KERNEL
 extern int tcp_keepinit;		/* time to establish connection */
 extern int tcp_keepidle;		/* time before keepalive probes begin */
+extern int tcp_finwait2idle;		/* idle time until drop in FIN_WAIT_2 */
+
 extern int tcp_keepintvl;		/* time between keepalive probes */
 extern int tcp_maxidle;			/* time to drop after starting probes */
 extern int tcp_ttl;			/* time to live for TCP segs */
Index: netinet/tcp_usrreq.c
===================================================================
RCS file: /home/dada/cvsroot/src/netinet/tcp_usrreq.c,v
retrieving revision 1.2
diff -u -u -r1.2 tcp_usrreq.c
--- tcp_usrreq.c	1999/04/04 22:17:54	1.2
+++ tcp_usrreq.c	1999/04/08 22:16:52
@@ -833,7 +833,7 @@
 		soisdisconnected(tp->t_inpcb->inp_socket);
 		/* To prevent the connection hanging in FIN_WAIT_2 forever. */
 		if (tp->t_state == TCPS_FIN_WAIT_2)
-			tp->t_timer[TCPT_2MSL] = tcp_maxidle;
+			tp->t_timer[TCPT_2MSL] = tcp_finwait2idle;
 	}
 	return (tp);
 }
Index: netinet/tcp_var.h
===================================================================
RCS file: /home/dada/cvsroot/src/netinet/tcp_var.h,v
retrieving revision 1.2
diff -u -u -r1.2 tcp_var.h
--- tcp_var.h	1999/04/04 22:17:54	1.2
+++ tcp_var.h	1999/04/08 23:32:18
@@ -248,6 +248,7 @@
 	u_long	tcps_keeptimeo;		/* keepalive timeouts */
 	u_long	tcps_keepprobe;		/* keepalive probes sent */
 	u_long	tcps_keepdrops;		/* connections dropped in keepalive */
+	u_long	tcps_finwait2drops;	/* connections dropped in finwait2 */
 
 	u_long	tcps_sndtotal;		/* total packets sent */
 	u_long	tcps_sndpack;		/* data packets sent */
@@ -310,7 +311,8 @@
 #define	TCPCTL_SENDSPACE	8	/* send buffer space */
 #define	TCPCTL_RECVSPACE	9	/* receive buffer space */
 #define	TCPCTL_KEEPINIT		10	/* receive buffer space */
-#define TCPCTL_MAXID		11
+#define	TCPCTL_FINWAIT2IDLE	11	/* max idle time in FIN_WAIT_2 state */
+#define TCPCTL_MAXID		12
 
 #define TCPCTL_NAMES { \
 	{ 0, 0 }, \
@@ -324,6 +326,7 @@
 	{ "sendspace", CTLTYPE_INT }, \
 	{ "recvspace", CTLTYPE_INT }, \
 	{ "keepinit", CTLTYPE_INT }, \
+	{ "finwait2idle", CTLTYPE_INT }, \
 }
 
 #ifdef KERNEL
Index: netstat/inet.c
===================================================================
RCS file: /home/dada/cvsroot/src/netstat/inet.c,v
retrieving revision 1.1
diff -u -u -r1.1 inet.c
--- inet.c	1999/04/08 23:38:57	1.1
+++ inet.c	1999/04/08 23:42:27
@@ -253,6 +253,7 @@
 	p(tcps_keeptimeo, "\t%lu keepalive timeout%s\n");
 	p(tcps_keepprobe, "\t\t%lu keepalive probe%s sent\n");
 	p(tcps_keepdrops, "\t\t%lu connection%s dropped by keepalive\n");
+	p(tcps_finwait2drops, "\t%lu connection%s dropped in finwait2\n");
 	p(tcps_predack, "\t%lu correct ACK header prediction%s\n");
 	p(tcps_preddat, "\t%lu correct data packet header prediction%s\n");
 #undef p





