Date:      Fri, 22 May 1998 06:23:00 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        mark@vmunix.com (Mark Mayo)
Cc:        isp@FreeBSD.ORG, hackers@FreeBSD.ORG
Subject:   Re: TIME_WAIT/FIN_WAIT_2...
Message-ID:  <199805220623.XAA11218@usr04.primenet.com>
In-Reply-To: <19980521230948.A23199@vmunix.com> from "Mark Mayo" at May 21, 98 11:09:48 pm

> The somewhat odd thing is the extraordinary number of
> sockets left open in TIME_WAIT and FIN_WAIT_2. I roughly understand
> what they mean, but we're talking about 3000 entries here (about 400-500
> of which are FIN_WAIT_2, the rest are TIME_WAIT).. So I have ~3000
> sockets in TIME_WAIT/FIN and only about 100 ESTABLISHED.
> 
> Is this normal?? It doesn't seem like it to me. If not, what would be
> causing it, and what should I look at tuning on the Slowaris box??

This is a client bug, specifically with Windows WinSock clients, which
do not call "shutdown(2)" in the following way:

	shutdown( s, 1);

The '1' should be a '2', but many WinSock implementations fail to work
correctly if it isn't a '1'.
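
For reference, here is a minimal sketch of the close sequence a client
would run, assuming a POSIX sockets API rather than WinSock. The name
graceful_close() is hypothetical; the '1' above is spelled SHUT_WR here.
It sends the FIN, drains until the peer's FIN shows up, then closes:

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: half-close the socket, wait for the peer to
 * finish, then release the descriptor.
 * Returns 0 on an orderly close, -1 on error.
 */
int graceful_close(int s)
{
    char buf[512];
    ssize_t n;

    /* SHUT_WR (value 1): send our FIN, keep the read side open. */
    if (shutdown(s, SHUT_WR) == -1) {
        close(s);
        return -1;
    }

    /* Discard incoming data until read() returns 0 (peer's FIN). */
    while ((n = read(s, buf, sizeof buf)) > 0)
        ;

    /* Always close; report failure if either step failed. */
    if (close(s) == -1 || n == -1)
        return -1;
    return 0;
}
```

The sketch does not retry on EINTR and has no timeout on the drain loop;
real client code would want both.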

You should talk to Paul Vixie about this.

The fix is to be bug-compatible with Windows NT as a server: when you
are in FIN_WAIT_2 state, back up and resend the FIN.

The problem is the lack of an ACK needed for a state transition in
the Windows TCP/IP implementation.

The root cause is badly written client code that assumes that the
system does resource tracking of sockets, such that when the client
program exits, a resource-track cleanup occurs, and sockets are shut
down correctly.

Alternately, you could blame it on Microsoft for writing an OS that
doesn't do resource tracking.  But you could blame Apple for the same
thing.

In any case, Paul has hacked NetBSD to do the right thing.

I've played with hacks to FreeBSD for the same thing.  Basically, it
times out and backs up, remembering that it backed up, and if it gets
an abort when the FIN is resent (ie: the machine has rebooted or the
host is unreachable because it has disconnected from its ISP), it
needs to rush forward to completion.
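
As a rough illustration only, the back-up-and-resend logic can be
modelled as a toy state machine. This is not the NetBSD or FreeBSD
code; every name here (struct conn, conn_step(), the EV_* events) is
hypothetical, and closing on a second timeout is my assumption about
what the "remembered" flag would be used for:

```c
#include <stdbool.h>

enum conn_state { FIN_WAIT_1, FIN_WAIT_2, CLOSED };
enum conn_ev    { EV_TIMEOUT, EV_ACK, EV_ABORT };

struct conn {
    enum conn_state state;
    bool backed_up;   /* have we already backed up once? */
    int  fins_sent;   /* how many FINs have gone out */
};

/* Advance the toy connection by one event. */
void conn_step(struct conn *c, enum conn_ev ev)
{
    if (c->state == CLOSED)
        return;

    switch (ev) {
    case EV_ABORT:
        /* Peer rebooted or is unreachable: rush to completion. */
        c->state = CLOSED;
        break;
    case EV_ACK:
        /* Our FIN was acknowledged. */
        if (c->state == FIN_WAIT_1)
            c->state = FIN_WAIT_2;
        break;
    case EV_TIMEOUT:
        if (c->state != FIN_WAIT_2)
            break;    /* ordinary retransmission is not modelled */
        if (!c->backed_up) {
            /* Back up one state and resend the FIN, once. */
            c->backed_up = true;
            c->state = FIN_WAIT_1;
            c->fins_sent++;
        } else {
            /* Assumption: a second timeout just gives up. */
            c->state = CLOSED;
        }
        break;
    }
}
```

Starting from FIN_WAIT_2, a timeout moves the connection back to
FIN_WAIT_1 with a second FIN counted, and an abort from there closes it
immediately.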

Obviously, it would have to be controlled via sysctl(2), since it
violates the RFCs all to hell.

This is a standard Windows "Denial of service to non-Windows OS's"
attack.  8-|.


FreeBSD pretends it has solved the problem with a timeout (Solaris
has a similar "fix"), but the latency is too high.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message


