From owner-freebsd-hackers  Mon Jan  4 15:57:08 1999
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <199901042356.QAA23877@usr05.primenet.com>
Subject: Re: tcp bug on FreeBSD
To: fenner@parc.xerox.com (Bill Fenner)
Date: Mon, 4 Jan 1999 23:56:35 +0000 (GMT)
Cc: tlambert@primenet.com, freebsd-hackers@FreeBSD.ORG
In-Reply-To: <98Dec18.145610pst.177534@crevenia.parc.xerox.com> from "Bill Fenner" at Dec 18, 98 02:56:02 pm

> If you get unlucky with delayed ACK's or your client is extremely
> slow, you might get
> server: FIN
> client: ACK
> client: FIN
> server: ACK

Well, Winsock clients are extremely slow.  They aren't very speedy,
either.  ;-).


> but from TCP's point of view, the client's FIN isn't related to the
> server's FIN; it's in response to the client application's request to
> close the connection.

No, you've got that backwards.

The FIN-WAIT-2 stuff only happens in the case of an encapsulated
protocol teardown, like the implied close by the server at the
end of an HTTP transfer, or the POP3 or SMTP case, where a QUIT<CR><LF>
is sent by the client, initiating a server shutdown of the connection.

When the server shuts down the connection, it expects the client to
do the ACK/CTL,ACK.  In the failure case, the client doesn't call
shutdown(2).  Since the TCP/IP implementation is in user space, and
sockets in Windows 95/98 are not file descriptors, there's no
OS-based resource tracking (like there is in UNIX) to imply a
shutdown(2) on behalf of a client application that closes the
descriptor.  Worse, the application may just exit without a close at
all, which leaves not even the close/shutdown order inversion
available for resource tracking to imply one.  Yeah, they should put
something in the WSOCK32.DLL thread_detach or process_detach routine
to handle automatic shutdown, but Microsoft hasn't bothered to do
this yet.


> >This behaviour should be implemented in FreeBSD as a sysctl; you
> >could call it "nt_bug_compatabile", but it's probably more correct
> >to call it "patch_fin_wait_2_bug".
> 
> You're suggesting that the timeout, instead of removing the state,
> pretend that the FIN wasn't acknowledged and switch to FIN_WAIT_1 and
> retransmit the "unacknowledged" FIN?

Yes.  Pretend you didn't get the "ACK" for the "FIN", and resend
it.  This will elicit one of three things from the client: no
response (powered off, etc.), an RST (improper shutdown, go ahead and
tear down the server end), or a "drain in progress" (i.e., an "ACK"
for the "FIN" whose "ACK" was "lost" by the server).

This is basically what NT does, and it's what Paul Vixie added
to his version of NetBSD (from my interpretation of his description).
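
The three outcomes can be sketched as a small decision function in
Python.  This is only a model of the idea, not kernel code: the names
Probe and next_step are invented here, and the retry limit for the
silent case is an assumption, since ordinary FIN retransmission would
give up after some number of tries anyway.

```python
from enum import Enum


class Probe(Enum):
    """What came back after re-sending the FIN (hypothetical names)."""
    SILENCE = "no response"
    RST = "reset"
    ACK = "ack of the re-sent FIN"


def next_step(reply: Probe, retries_left: int) -> str:
    """Decide what the server does after probing a connection stuck in FIN_WAIT_2."""
    if reply is Probe.RST:
        # Improper shutdown on the client: tear down the server end.
        return "drop"
    if reply is Probe.ACK:
        # Client is alive and still draining: stay in FIN_WAIT_2, re-arm the timer.
        return "stay_fin_wait_2"
    # Silence: client is powered off or unreachable.
    # Assumption: retry a bounded number of times, then give up.
    return "retransmit" if retries_left > 0 else "drop"
```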


> >With this enabled, you can get rid of the long timeout kludge, as
> >well.
> 
> Well, you just do something different when the timeout occurs, n'est-ce
> pas?

No.  Not unless you reduce "the timeout" from 30 minutes to 2 MSL
and still call it the same timeout.

Technically, this is a bug in the TCP protocol as defined by RFC
793, since you can't expect a client machine to not crash between the
"ACK" and the "CTL,ACK", and if it does, there's no recovery possible
on the server side of things.

Also, a 30 minute timeout is bad.  It's perfectly valid for a client
to take days between the encapsulated server shutdown and the client
calling shutdown(2).

The downside is a lot of server-to-client activity.  You can solve
that by starting at 2 MSL and, so long as you are getting "ACK"s from
the client for your repeated "FIN" for your "lost" "ACK", doing
an exponential back-off (NT does *not* do this -- they basically
"FIN"-flood the client every 2 MSL until it "ACK"s or "RST"s).

An exponential back-off is probably overkill for an initial try
at the fix, unless people see link degradation as a result of the
fix going in.  That's unlikely unless clients are doing rather evil
things; you could conceive of a DoS attack using an intentionally
misbehaving client machine, which a 30 minute timeout *and* an
exponential back-off would do a lot to resolve.

You might want to change the close code as well: let the close
context for data that may need to be retransmitted linger, but have
close return to the server immediately anyway, so that you can
unload the server image and recover all but the lingering close
context -- in the kernel -- for reuse.
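
To put numbers on the back-off, here is a rough Python sketch of the
probe schedule: start at 2 MSL and double the interval each time the
client "ACK"s the repeated "FIN".  The function name, the 30 second
MSL default, and the 30 minute cap are all assumptions picked for
illustration; plug in whatever your stack actually uses.

```python
def probe_intervals(msl: float = 30.0, cap: float = 1800.0, count: int = 6) -> list:
    """Return the waits, in seconds, before each successive FIN probe.

    Assumed values: msl of 30 s and a cap of 30 minutes (1800 s).
    """
    interval = 2 * msl          # first probe after 2 MSL
    schedule = []
    for _ in range(count):
        schedule.append(min(interval, cap))
        interval *= 2           # exponential back-off while the client keeps ACKing
    return schedule


print(probe_intervals())  # [60.0, 120.0, 240.0, 480.0, 960.0, 1800.0]
```

With these values the sixth probe would fall at 1920 seconds, so the
cap holds it at 30 minutes.  The NT behaviour described above
corresponds to never doubling: every entry would stay at 2 MSL.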


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
