From owner-freebsd-hackers  Sat May 23 13:38:55 1998
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <199805232038.NAA12442@usr07.primenet.com>
Subject: Re: TIME_WAIT/FIN_WAIT_2...
To: njs3@doc.ic.ac.uk (Niall Smart)
Date: Sat, 23 May 1998 20:38:18 +0000 (GMT)
Cc: tlambert@primenet.com, jas@flyingfox.com, mark@vmunix.com,
        hackers@FreeBSD.ORG, isp@FreeBSD.ORG
In-Reply-To: <E0ydDio-0001gv-00@oak67.doc.ic.ac.uk> from "Niall Smart" at May 23, 98 01:48:21 pm
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> > C)	if you get CLOSE-WAIT, then goto (A).
> 
> I'm a little unclear as to exactly what should happen when
> a TCP stack receives a packet while it is in CLOSE-WAIT state.  Are
> you relying on the following behaviour documented in STD7, line ~2358?

Yes.  To be technically correct, I should have said "if you think the
client should be sending you the packet it sends when it goes to
CLOSE-WAIT state, but you aren't getting that packet".


>     3.  If the connection is in a synchronized state (ESTABLISHED,
>     FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT),
>     any unacceptable segment (out of window sequence number or
>     unacceptible acknowledgment number) must elicit only an empty
>     acknowledgment segment containing the current send-sequence number
>     and an acknowledgment indicating the next sequence number expected
>     to be received, and the connection remains in the same state.
> 
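
The quoted rule can be sketched roughly as follows.  This is only a
minimal illustration, assuming the RFC 793 segment acceptability test
and a simplified acknowledgment check (only an ACK for data not yet
sent counts as unacceptable).  The function and parameter names are
made up for the example and do not come from any real stack.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits

SYNCHRONIZED = {
    "ESTABLISHED", "FIN-WAIT-1", "FIN-WAIT-2", "CLOSE-WAIT",
    "CLOSING", "LAST-ACK", "TIME-WAIT",
}


def in_window(seq, start, size):
    """True if seq lies in [start, start + size), modulo 2**32."""
    return (seq - start) % MOD < size


def seq_acceptable(seg_seq, seg_len, rcv_nxt, rcv_wnd):
    """The four-case acceptability test from RFC 793."""
    if seg_len == 0:
        if rcv_wnd == 0:
            return seg_seq == rcv_nxt
        return in_window(seg_seq, rcv_nxt, rcv_wnd)
    if rcv_wnd == 0:
        return False
    last = (seg_seq + seg_len - 1) % MOD
    return (in_window(seg_seq, rcv_nxt, rcv_wnd)
            or in_window(last, rcv_nxt, rcv_wnd))


def ack_ahead(seg_ack, snd_nxt):
    """Simplification: an ACK is bad only if it covers unsent data."""
    diff = (seg_ack - snd_nxt) % MOD
    return 0 < diff < 2 ** 31


def reply_to_segment(state, seg_seq, seg_len, seg_ack,
                     snd_nxt, rcv_nxt, rcv_wnd):
    """Return the empty ACK owed for an unacceptable segment, else None.

    The state is echoed back unchanged, since the rule says the
    connection remains in the same state.
    """
    if state not in SYNCHRONIZED:
        raise ValueError("rule only covers synchronized states")
    if (seq_acceptable(seg_seq, seg_len, rcv_nxt, rcv_wnd)
            and not ack_ahead(seg_ack, snd_nxt)):
        return None
    return {"seq": snd_nxt, "ack": rcv_nxt, "len": 0, "state": state}
```

A one-byte "duplicate packet" with sequence number rcv_nxt - 1 falls
outside the window, so a conforming peer in any synchronized state
should answer it with a bare ACK and stay where it is.
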
> > D)	if you get no response in 2 MSL, or RST, then act as if
> > 	you had received the CLOSE-WAIT, transitioned to FIN-WAIT-2,
> > 	and subsequently received the LAST-ACK.
> 
> Uh-huh.
> 
> > E)	(potential "enhancement")  If you get no response, rather
> > 	than treating it as an RST, goto (A), but maintain the
> > 	FIN_WAIT_2_TIMEOUT kludge currently in place.
> 
> Can I suggest that if you receive a response after step C, which you call
> the CLOSE-WAIT response, then the TCP stack should remain in FIN-WAIT-2
> with an infinite timeout, because the response indicates that the remote
> TCP stack is not broken and moreover that the remote client is not finished
> sending yet.  (i.e. the 11 minute timeout you mention later would not
> be used)

You can't do this.  You must constantly ask the client "Are you done
yet?  Are you done yet?" because you have no other method of
distinguishing a broken client from a non-broken client.

This sucks, but effectively, you have several problems that are
intractable if you do this:

1)	The Microsoft TCP stack client that isn't done the first
	time you ask it, but gets done later, and never calls
	shutdown like it should.

2)	A client machine that is disconnected from the net because
	it has been shut off, rebooted, or physically disconnected
	(mobile, dialup, whatever).

These will result in the same failure you are trying to avoid, and
you've just killed the avoidance behaviour.
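
As a rough sketch of the decision being argued over, assuming step
(A) is "send the duplicate-packet probe": the event names, the 30
second MSL, and the keep_kludge flag are all invented for the
example, and the 11 minute figure is the FIN_WAIT_2 timeout
mentioned elsewhere in this thread.  It is not how any particular
stack implements it.

```python
MSL = 30.0                       # assumed maximum segment lifetime, seconds
PROBE_WAIT = 2 * MSL             # how long to wait for an answer to a probe
FIN_WAIT_2_TIMEOUT = 11 * 60.0   # the existing 11 minute drop


def next_action(event, seconds_in_fin_wait_2, keep_kludge=False):
    """Decide what to do after one probe while in FIN-WAIT-2.

    event is what happened within PROBE_WAIT of the probe:
      "fin"     -- the peer finally closed its side
      "rst"     -- the peer has no such connection (e.g. it rebooted)
      "ack"     -- the peer is alive but not done yet (step C)
      "silence" -- nothing came back at all (step D, or E with the kludge)
    """
    if event == "fin":
        return "enter_time_wait"
    if event == "rst":
        return "drop"
    if keep_kludge and seconds_in_fin_wait_2 >= FIN_WAIT_2_TIMEOUT:
        # Overall cap: survives peers that answer every probe forever.
        return "drop"
    if event == "ack":
        return "probe"           # ask "are you done yet?" again
    if event == "silence":
        # Step D drops at once; enhancement E keeps probing until the cap.
        return "probe" if keep_kludge else "drop"
    raise ValueError("unknown event: %r" % (event,))
```

The point of the sketch is that an "ack" never ends the probing on
its own; something else (a FIN, an RST, silence, or the overall
timeout) has to.
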

I understand why you would want to suggest this: it narrows the
non-compliance window considerably.

As a practical consideration, I expect FreeBSD to ship a technically
compliant configuration by default (like it does with the RFCs
whose option negotiation breaks people's hardware), and then have
practically everyone in the known universe running FreeBSD turn
on the workaround for the Microsoft brain damage.


> > One of the main pains-in-the-ass in not calling "shutdown()" is
> > Netscape, BTW.
> 
> Well, the Windows TCP/IP stack should be sending the FIN when the
> socket is closed.

Well, the Windows TCP/IP stack doesn't know that the socket is
closed, because it's implemented in user space, does not use a
resource tracked object (a WinSock socket is *not* a file
descriptor), and basically *can't* do the right thing most
of the time.  It *could* do the right thing on explicit close
(i.e.: imply a shutdown) or on program exit, but doesn't.
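
For reference, this is roughly what "calling shutdown like it
should" looks like from the client side.  It is a small sketch only:
it uses a local socketpair() rather than a real TCP connection, so
the end-of-file the server sees here stands in for the FIN, and the
function name is made up.

```python
import socket


def read_until_eof(sock):
    """Collect everything the peer sends until it shuts down its side."""
    data = b""
    while True:
        chunk = sock.recv(1024)
        if not chunk:            # empty read: the peer has stopped sending
            return data
        data += chunk


def half_close_demo():
    client, server = socket.socketpair()
    try:
        client.sendall(b"request")
        # Explicitly say "I am done sending" instead of just going quiet.
        client.shutdown(socket.SHUT_WR)

        received = read_until_eof(server)

        # The other direction is still open, so the server can answer.
        server.sendall(b"reply")
        server.shutdown(socket.SHUT_WR)

        reply = read_until_eof(client)
        return received, reply
    finally:
        client.close()
        server.close()
```

The server learns the client is finished from the shutdown itself,
not from a timeout, which is the whole difference.
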


And you can dictate standards to Microsoft until you are blue in
the face, and it won't help.

I could easily see this as being intentional as a denial of service
attack to make non-NT server machines look bad.

I could also see it as being a known bug that they are refusing to
fix because it points out a deficiency in TCP with mobile and
transiently running machines.

In most cases you will get no response at all rather than an RST.
If there are zero WinSock applications running, there is no code
loaded at all to field the packet and decide it's bogus, so no RST
is generated.


> > 11 minutes is too long a time compared to 6 MSL.  By sending a
> > "duplicate packet" to test the liveness of the client, you can
> > solicit a "keepalive" (or an RST).  If you get the RST, then you
> > recover from the client error.
> 
> And otherwise wait indefinitely for the FIN?

Yes; if the client is alive, and not sending back zero sized window
responses.


> I think that this is a bug Microsoft would be eager to fix, after all,
> if it affects FreeBSD web servers it also affects NT web servers, as
> well as NT file servers, Exchange servers etc etc.


I think you are wrong.  Microsoft implements the "fix" I have stated,
and is not affected by the problem.

The "problem" would be that Microsoft clients cause UNIX servers
to behave badly, but NT servers are unaffected.

I would think that this would be a problem Microsoft would be eager
to exacerbate in order to make UNIX servers look less viable than
NT servers.


> Has anyone yet
> verified that this is indeed the observed behaviour with NT with all
> applied patches?  If so perhaps we should try and convince Microsoft to
> fix the problem first.  Although having said that, Terry's algorithm
> does have the nice side effect of allowing us to remove the 11 minute
> non-conformant FIN-WAIT-2 drop.

I would keep it in there to prevent denial of service attacks on
UNIX servers based on zero window size responses (enhancement "E").


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
