From owner-freebsd-hackers Mon Jul 14 09:29:43 1997
Return-Path:
Received: (from root@localhost)
	by hub.freebsd.org (8.8.5/8.8.5) id JAA00305
	for hackers-outgoing; Mon, 14 Jul 1997 09:29:43 -0700 (PDT)
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.50])
	by hub.freebsd.org (8.8.5/8.8.5) with SMTP id JAA00300
	for ; Mon, 14 Jul 1997 09:29:38 -0700 (PDT)
Received: (from terry@localhost)
	by phaeton.artisoft.com (8.6.11/8.6.9) id JAA01510;
	Mon, 14 Jul 1997 09:22:53 -0700
From: Terry Lambert
Message-Id: <199707141622.JAA01510@phaeton.artisoft.com>
Subject: Re: TCP bug in 2.2
To: julian@whistle.com (Julian Elischer)
Date: Mon, 14 Jul 1997 09:22:53 -0700 (MST)
Cc: terry@lambert.org, hackers@FreeBSD.ORG
In-Reply-To: from "Julian Elischer" at Jul 13, 97 01:39:45 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

> > tcp_extensions=NO
>
> Terry, I am aware of that ok?

Sorry, thought it was one of the other Julians...

> And yes TCP extensions WAS (to our surprise) enabled on that machine.
> We have since turned it off, however despite this it should be
> IMPOSSIBLE to hang a socket in that way.

I agree.  However... did turning them off affect the hanging at all?
The information would be useful in tracking down the real bug.

Just a shot in the dark...

> it was SOLARIS 2.5.1
> and it didn't happen consistently so it's a race condition of some sort.

2.5.1 thinks it supports T/TCP, right?  Maybe it's 2.5.1's T/TCP?

[ ... ]

> but no matter WHAT OS, we should not have a code path that can get to
> the state that a tcp session is totally hung, without a timer running
> for it.

Sorry; I didn't know what patches you did or didn't have.  The timers
were relatively new.

What about the recent commits that you had to fix the Appletalk stuff
for ...is it possible that they are the source of the problem?

> > 1) You reboot a machine without shutting down all Winsock
> >    clients.
> The solaris machine was not rebooted.

Well, there's a nice missing notification theory shot to hell... 8-).

> > The timer is not started until the FIN is sent if SO_KEEPALIVE was
> > specified by the client.  It usually is.
>
> I'll check this but it still seems to be a bug to me
> because the socket is in FIN_WAIT2 state and HAS BEEN CLOSED.

Yep; sorry for the assumptions... hope you find it.

					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.