Date:      Sat, 12 Jul 1997 14:53:13 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        julian@whistle.com (Julian Elischer)
Cc:        hackers@FreeBSD.ORG
Subject:   Re: TCP bug in 2.2
Message-ID:  <199707122153.OAA28921@phaeton.artisoft.com>
In-Reply-To: <33C6D138.7D55368C@whistle.com> from "Julian Elischer" at Jul 11, 97 05:35:04 pm

> If I could borrow the ear of someone with more knowledge of TCP
> states than me..
> 
> We see the following in a kernel dated from around March 4
> and from the logs it looks as if it's present in 2.2.2+
> 
> finger, (after a lot of iterations of the test)
> goes into a permanent wait reading from a socket.
> 
> the socket is seen to be in FIN_WAIT_2 state
> after the finger process is killed the socket STAYS in FIN_WAIT_2
> state forever.
> 
> from what I've read in tcp_input.c etc. This shouldn't happen.
> 
> 2 problems:
> 1/ why doesn't finger wake up and return EOF?

Probably you have an Annex or similar terminal server which has a
buggy TCP/IP implementation which does not correctly do option
negotiation.

See /etc/sysconfig:

----------------------------------------------------------------------
#
# Some broken implementations can't handle the RFC 1323 and RFC 1644
# TCP options.  If TCP connections randomly hang, try disabling this,
# and bug the vendor of the losing equipment.
#
tcp_extensions=NO
----------------------------------------------------------------------

Also, get an updated stack for whatever hardware you have which is
failing to implement TCP/IP according to the RFCs.  There could be
less obvious problems with the stack as well, so it's a good idea
to not trust it until it's updated.



> 2/ why doesn't the close() of the socket start
>  the 2MSL timer?

This is the usual case with Winsock implementations in general
and Microsoft's in particular.  Microsoft OS's don't do resource
tracking correctly, and so even though you can now tell that a
program has exited in Windows95, the Windows 3.1 Winsock code still
requires that the client application call "shutdown()" on the socket
prior to closing it.  Basically, Microsoft's TCP/IP stack is too
stupid to send the FIN like it's supposed to on the close.

This is mostly a problem if:

1)	You reboot a machine without shutting down all Winsock
	clients.

2)	Your client program crashes and you expect the OS to be able
	to back out state on its behalf.

3)	The client software was ported from a sane TCP/IP environment,
	like UNIX, and the programmers have no idea that "shutdown()"
	is supposed to be called (amazingly enough, on UNIX systems,
	calling "shutdown()" shuts the connection down... who would have
	ever thought of naming a function for what the function does?
	Apparently not the originators of Winsock.).


Try correcting your client software.  Also try running client machines
with OS's that can't be crashed by client programs (ie: real protected
mode operating systems).  Finally, try running an OS that knows how to
recover resources that a program was using in the event of a program
crash which does not crash the OS (ie: real protected mode operating
systems).
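
For what "correcting your client software" might look like, here is a
minimal sketch in C using the BSD sockets calls.  The helper name
graceful_close() is made up for illustration and is not from any
library.  It assumes a connected stream socket: it calls shutdown()
to send the FIN, drains whatever the peer still has to say, and only
then calls close().

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any library): announce end-of-data
 * with shutdown(), drain what the peer still sends, then close().
 * Returns 0 on success, -1 on error.  The descriptor is always closed.
 */
int graceful_close(int fd)
{
    char buf[512];
    ssize_t n;

    assert(fd >= 0);

    /* Half-close: on TCP this sends our FIN; the read side stays open. */
    if (shutdown(fd, SHUT_WR) == -1) {
        close(fd);
        return -1;
    }

    /* Discard incoming data until the peer closes (read() returns 0). */
    while ((n = read(fd, buf, sizeof buf)) > 0)
        ;

    /* Release the descriptor whether or not the drain succeeded. */
    if (close(fd) == -1 || n == -1)
        return -1;
    return 0;
}
```

Note that the drain loop blocks until the peer sends its own FIN.  A
real client would probably want a timeout there, since a peer that
never closes its side leaves the caller waiting indefinitely.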

> and either tcp_usrclosed() is not being called
> during the socket closure for some reason,
> or the timer is being continually reset by something else.

The timer is not started until the FIN is sent if SO_KEEPALIVE was
specified by the client.  It usually is.
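
For reference, this is roughly how a client asks for SO_KEEPALIVE and
checks whether it is set.  It is only a sketch: set_keepalive() and
get_keepalive() are hypothetical wrapper names around the standard
setsockopt()/getsockopt() calls, and the code says nothing about how
the option interacts with the timer.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical wrapper: enable (on != 0) or disable (on == 0)
 * SO_KEEPALIVE for fd.  Returns 0 on success, -1 on error.
 */
int set_keepalive(int fd, int on)
{
    assert(fd >= 0);
    on = on ? 1 : 0;
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on);
}

/*
 * Hypothetical wrapper: returns 1 if SO_KEEPALIVE is set on fd,
 * 0 if it is not, and -1 on error.
 */
int get_keepalive(int fd)
{
    int val = 0;
    socklen_t len = sizeof val;

    assert(fd >= 0);
    if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &val, &len) == -1)
        return -1;
    return val != 0;
}
```

Calling get_keepalive() on the client's socket is a quick way to
confirm whether a given program actually requested the option.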


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


