Date: Sat, 12 Jul 1997 14:53:13 -0700 (MST)
From: Terry Lambert <terry@lambert.org>
To: julian@whistle.com (Julian Elischer)
Cc: hackers@FreeBSD.ORG
Subject: Re: TCP bug in 2.2
Message-ID: <199707122153.OAA28921@phaeton.artisoft.com>
In-Reply-To: <33C6D138.7D55368C@whistle.com> from "Julian Elischer" at Jul 11, 97 05:35:04 pm

> If I could borrow the ear of someone with more knowledge of TCP
> states than me..
>
> We see the following in a kernel dated from around March 4
> and from the logs it looks as if it's present in 2.2.2+
>
> finger, (after a lot of iterations of the test)
> goes into a permanent wait reading from a socket.
>
> the socket is seen to be in FIN_WAIT_2 state
> after the finger process is killed the socket STAYS in FIN_WAIT_2
> state forever.
>
> from what I've read in tcp_input.c etc. This shouldn't happen.
>
> 2 problems:
> 1/ why doesn't finger wake up and return EOF?

Probably you have an Annex or similar terminal server which has a
buggy TCP/IP implementation which does not correctly do option
negotiation.  See /etc/sysconfig:

----------------------------------------------------------------------
#
# Some broken implementations can't handle the RFC 1323 and RFC 1644
# TCP options.  If TCP connections randomly hang, try disabling this,
# and bug the vendor of the losing equipment.
#
tcp_extensions=NO
----------------------------------------------------------------------

Also, get an updated stack for whatever hardware you have which is
failing to implement TCP/IP according to the RFCs.  There could be
less obvious problems with the stack as well, so it's a good idea
to not trust it until it's updated.

> 2/ why doesn't the close() of the socket start
> the 2MSL timer?

This is common with Winsock implementations in general and
Microsoft's in particular.  Microsoft OSs don't do resource tracking
correctly, and so even though you can now tell that a program has
exited in Windows95, the Windows 3.1 Winsock code still requires
that the client application call "shutdown()" on the socket prior
to closing it.

Basically, Microsoft's TCP/IP stack is too stupid to send the FIN
like it's supposed to on the close.

This is mostly a problem if:

1)	You reboot a machine without shutting down all Winsock
	clients.

2)	Your client program crashes and you expect the OS to be
	able to back out state on its behalf.

3)	The client software was ported from a sane TCP/IP
	environment, like UNIX, and the programmers have no
	idea that "shutdown()" is supposed to be called (amazingly
	enough, on UNIX systems, calling "shutdown()" shuts the
	machine down... who would have ever thought of naming
	a function for what the function does?  Apparently not
	the originators of Winsock.).

Try correcting your client software.

Also try running client machines with OSs that can't be crashed
by client programs (i.e., real protected mode operating systems).

Finally, try running an OS that knows how to recover resources that
a program was using in the event of a program crash which does not
crash the OS (i.e., real protected mode operating systems).

> and either tcp_usrclosed() is not being called
> during the socket closure for some reason,
> or the timer is being continually reset by something else.

The timer is not started until the FIN is sent if SO_KEEPALIVE
was specified by the client.  It usually is.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
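
For the "call shutdown() before closing" advice above, a minimal
sketch in C of what a well-behaved client might do.  It assumes the
POSIX/BSD socket calls (a Winsock client would use closesocket()
instead of close()); the helper names finish_sending() and
drain_and_close() are hypothetical and made up for illustration, not
taken from any stack or from the finger source.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: half-close the connection.  shutdown(SHUT_WR)
 * tells the stack we are done sending; on a TCP socket this is what
 * causes the FIN to go out.  Returns 0 on success, -1 on error.
 */
int
finish_sending(int fd)
{
	return (shutdown(fd, SHUT_WR));
}

/*
 * Hypothetical helper: read and discard whatever the peer still
 * sends until it reports EOF (read() returns 0), then release the
 * descriptor.  Returns 0 on success, -1 if read() or close() failed.
 */
int
drain_and_close(int fd)
{
	char buf[512];
	ssize_t n;
	int rc = 0;

	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n > 0)
			continue;	/* discard late data */
		if (n < 0 && errno == EINTR)
			continue;	/* interrupted, retry */
		if (n < 0)
			rc = -1;	/* real read error */
		break;			/* EOF or error */
	}
	if (close(fd) < 0)
		rc = -1;
	return (rc);
}
```

A client would call finish_sending(fd) once its request has been
written, and drain_and_close(fd) when it is finished with the reply.
This only shows the calling order; it does not work around a peer
that never sends its own FIN.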