From owner-freebsd-hackers Mon Jan 4 15:57:08 1999
Return-Path:
Received: (from majordom@localhost) by hub.freebsd.org (8.8.8/8.8.8) id PAA25737 for freebsd-hackers-outgoing; Mon, 4 Jan 1999 15:57:08 -0800 (PST) (envelope-from owner-freebsd-hackers@FreeBSD.ORG)
Received: from smtp03.primenet.com (smtp03.primenet.com [206.165.6.133]) by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id PAA25729 for ; Mon, 4 Jan 1999 15:57:04 -0800 (PST) (envelope-from tlambert@usr05.primenet.com)
Received: (from daemon@localhost) by smtp03.primenet.com (8.8.8/8.8.8) id QAA06987; Mon, 4 Jan 1999 16:56:38 -0700 (MST)
Received: from usr05.primenet.com(206.165.6.205) via SMTP by smtp03.primenet.com, id smtpd006944; Mon Jan 4 16:56:36 1999
Received: (from tlambert@localhost) by usr05.primenet.com (8.8.5/8.8.5) id QAA23877; Mon, 4 Jan 1999 16:56:35 -0700 (MST)
From: Terry Lambert
Message-Id: <199901042356.QAA23877@usr05.primenet.com>
Subject: Re: tcp bug on reeBSD
To: fenner@parc.xerox.com (Bill Fenner)
Date: Mon, 4 Jan 1999 23:56:35 +0000 (GMT)
Cc: tlambert@primenet.com, freebsd-hackers@FreeBSD.ORG
In-Reply-To: <98Dec18.145610pst.177534@crevenia.parc.xerox.com> from "Bill Fenner" at Dec 18, 98 02:56:02 pm
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> If you get unlucky with delayed ACK's or your client is extremely
> slow, you might get
> server: FIN
> client: ACK
> client: FIN
> server: ACK

Well, Winsock clients are extremely slow.  They aren't very speedy,
either.  ;-).

> but from TCP's point of view, the client's FIN isn't related to the
> server's FIN; it's in response to the client application's request to
> close the connection.

No, you've got that backwards.

The FIN-WAIT-2 stuff only happens in the case of an encapsulated
protocol teardown, like the implied close by the server at the end
of an HTTP transfer, or the POP3 or SMTP case, where a QUIT is sent
by the client, initiating a server shutdown of the connection.

When the server shuts down the connection, it expects the client to
do the ACK/CTL,ACK.

In the failure case, the client doesn't call shutdown(2). Since the
TCP/IP implementation is in user space, and sockets in Windows 95/98
are not file descriptors, there's no OS-based resource tracking (like
there is in UNIX) to imply a shutdown(2) on behalf of a client
application that closes the descriptor (or worse, just exits without
a close at all, which leaves not even the close/shutdown order
inversion from which resource tracking could imply one).

Yeah, they should put something in the WSOCK32.DLL thread_detach or
process_detach routine to handle automatic shutdown, but Microsoft
hasn't bothered to do this yet.

> >This behaviour should be implemented in FreeBSD as a sysctl; you
> >could call it "nt_bug_compatabile", but it's probably more correct
> >to call it "patch_fin_wait_2_bug".
>
> You're suggesting that the timeout, instead of removing the state,
> pretend that the FIN wasn't acknowledged and switch to FIN_WAIT_1 and
> retransmit the "unacknowledged" FIN?

Yes.  Pretend you didn't get the "ACK" for the "FIN", and resend it.

This will elicit either no response from the client (powered off,
etc.), an RST (improper shutdown, go ahead and tear down the server
end), or a "drain in progress" (e.g., an "ACK" for the "FIN" for
which the "ACK" was "lost" by the server).

This is basically what NT does, and it's what Paul Vixie added to his
version of NetBSD (from my interpretation of his description).

> >With this enabled, you can get rid of the long timeout kludge, as
> >well.
>
> Well, you just do something different when the timeout occurs, n'est-ce
> pas?

No.
Unless "the timeout" is reduced from 30 minutes to 2 MSL, so
you're calling it the same timeout.

Technically, this is a bug in the TCP protocol as defined by RFC 793,
since you can't expect a client machine to not crash between the
"ACK" and the "CTL,ACK", and if it does, there's no recovery possible
on the server side of things.

Also, a 30 minute timeout is bad.  It's perfectly valid for a client
to take days between the encapsulated server shutdown and the client
calling shutdown(2).

The downside is a lot of server to client activity.  You can solve
that by starting at 2 MSL, and so long as you are getting "ACK"'s
from the client for your repeat "FIN" for your "lost" "ACK", you can
do an exponential back-off (NT does *not* do this -- they basically
"FIN"-flood the client every 2 MSL until it "ACK"'s or "RST"'s).

An exponential back-off is probably overkill for an initial try at
the fix, unless people see link degradation as a result of the fix
going in (unlikely, unless clients are doing rather evil things; you
could conceive a DoS attack using an intentionally misbehaving client
machine, which a 30 minute timeout *and* an exponential back-off would
do a lot to resolve).

You might want to change the close code, as well, letting the close
context for data that may need to be retransmitted linger, but
causing it to return to the server immediately anyway, so as to allow
you to unload the server image and recover all but the lingering
close context -- in the kernel -- for reuse.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
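
The resend-and-back-off scheme described in this message can be sketched as a small decision function. This is only an illustration under stated assumptions, not FreeBSD kernel code: every name here (`fw2_probe`, `fw2_init`, `fw2_on_reply`, the `FW2_*` constants) is hypothetical, the 30 second MSL and the 30 minute ceiling on the back-off are assumed values, and giving up after three unanswered resends is an assumption the message does not specify. The caller is imagined to resend the "FIN" when `interval` seconds expire and then report what the client sent back.

```c
#include <assert.h>
#include <stddef.h>

/* All names and constants below are hypothetical, for illustration only. */
#define FW2_MSL_SECS      30u                  /* assumed MSL, in seconds */
#define FW2_MIN_INTERVAL  (2u * FW2_MSL_SECS)  /* first resend after 2 MSL */
#define FW2_MAX_INTERVAL  (30u * 60u)          /* assumed back-off ceiling */
#define FW2_MAX_SILENT    3u                   /* assumed give-up threshold */

/* What the client sent back after the most recent resent FIN. */
enum fw2_reply { FW2_REPLY_NONE, FW2_REPLY_ACK, FW2_REPLY_RST };

/* What the server side should do next. */
enum fw2_action { FW2_RESEND_FIN, FW2_DROP };

struct fw2_probe {
    unsigned interval;  /* seconds to wait before the next FIN resend */
    unsigned silent;    /* consecutive resends that got no reply */
};

/* Start probing at 2 MSL with no unanswered resends recorded. */
void fw2_init(struct fw2_probe *p)
{
    assert(p != NULL);
    p->interval = FW2_MIN_INTERVAL;
    p->silent = 0;
}

/*
 * Decide what to do after a resent FIN, given the client's reply:
 *   RST  -> improper shutdown on the client; tear down the server end.
 *   ACK  -> client is alive but has not closed ("drain in progress");
 *           keep probing, doubling the wait up to the ceiling.
 *   none -> client may be gone; retry at the same interval, and drop
 *           the connection after FW2_MAX_SILENT unanswered resends.
 */
enum fw2_action fw2_on_reply(struct fw2_probe *p, enum fw2_reply reply)
{
    assert(p != NULL);
    switch (reply) {
    case FW2_REPLY_RST:
        return FW2_DROP;
    case FW2_REPLY_ACK:
        p->silent = 0;
        if (p->interval > FW2_MAX_INTERVAL / 2)
            p->interval = FW2_MAX_INTERVAL;
        else
            p->interval *= 2;
        return FW2_RESEND_FIN;
    case FW2_REPLY_NONE:
    default:
        if (++p->silent >= FW2_MAX_SILENT)
            return FW2_DROP;
        return FW2_RESEND_FIN;
    }
}
```

Leaving `interval` fixed at `FW2_MIN_INTERVAL` in the ACK case would give the NT-style behaviour mentioned above, a resend every 2 MSL with no back-off.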