Date:      Thu, 21 Dec 2006 22:35:58 -0500 (EST)
From:      Daniel Eischen <deischen@freebsd.org>
To:        John-Mark Gurney <gurney_j@resnet.uoregon.edu>
Cc:        Julian Elischer <jelischer@ironport.com>, Robert Watson <rwatson@freebsd.org>, David Xu <davidxu@freebsd.org>, freebsd-arch@freebsd.org
Subject:   Re: close() of active socket does not work on FreeBSD 6
Message-ID:  <Pine.GSO.4.64.0612212227410.2250@sea.ntplx.net>
In-Reply-To: <20061222020101.GC4982@funkthat.com>
References:  <32874.1165905843@critter.freebsd.dk> <20061220153126.G85384@fledge.watson.org> <Pine.GSO.4.64.0612201308220.23942@sea.ntplx.net> <200612210820.09955.davidxu@freebsd.org> <4589E7D2.9010608@ironport.com> <20061221152115.U83974@fledge.watson.org> <20061222020101.GC4982@funkthat.com>

On Thu, 21 Dec 2006, John-Mark Gurney wrote:

> Robert Watson wrote this message on Thu, Dec 21, 2006 at 15:22 +0000:
>>> I think you are only interested in threads that are sleeping.. so you allow
>>> a sleeping thread to save a pointer to the fd (or whatever) on which it is
>>> sleeping, along with the sleep address.
>>>
>>> items that are not sleeping are either already returning, or are going to
>>> sleep, in which case they can check at that time.
>>
>> Hence my question about select and poll: should they throw an exception
>> state when a file descriptor is closed out from under them?  They often
>> sleep on hundreds or thousands of file descriptors, and not just one.
>
> IMO, your program is buggy if you close the file descriptor before
> everything is out of the kernel wrt the fd...  It means that your close
> statement isn't waiting for things to be cleanly shut down, and that
> you still have dangling reference counts to the parts of the code that
> are in the kernel...
>
> I used to expect something similar w/ a kqueue-based event-driven
> web server, and found that I had bugs due to assuming that I could
> close it whenever I want...  What happens if you close the fd between
> the time select returns and you process it?  What happens if the fd
> gets closed, and another thread (or an earlier fd that accepts
> connections) reuses that fd?  And then your state machine isn't ready
> to get an event since it isn't supposed to get one yet...
>
> The kernel isn't buggy wrt closing a fd when another thread is using
> it, it's the program that's buggy...
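
The descriptor-reuse hazard described above can be seen even in a
single thread.  What follows is only a minimal sketch in C, assuming a
POSIX system; fd_number_reused() is a name made up for this
illustration and is not from anyone's code in this thread.  It relies
on the POSIX rule that pipe() and dup() hand out the lowest-numbered
free descriptors.

```c
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): returns 1 if a freshly
 * allocated descriptor gets the same number as one that was just
 * closed, 0 if it does not, and -1 on error.
 */
int fd_number_reused(void)
{
    int p[2];

    if (pipe(p) != 0)
        return -1;

    int stale = p[0];   /* number that other code may still remember */
    if (close(stale) != 0) {
        close(p[1]);
        return -1;
    }

    /* A new descriptor takes the lowest free slot: the one just closed. */
    int fresh = dup(p[1]);
    if (fresh < 0) {
        close(p[1]);
        return -1;
    }

    int reused = (fresh == stale);

    close(fresh);
    close(p[1]);
    return reused;
}
```

Any code still holding the old number would now be talking to an
unrelated object, which is the state-machine confusion the quoted
text warns about.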

I agree also, but a call that hangs and never returns is hard to
detect.  The best thing to do is to tell the programmer that he is
doing something stupid, and returning an error is the way that is
typically done.  Solaris seems to have jumped through
some hoops to achieve this behavior, so I doubt it is without
merit.  OTOH, I'm not going to argue that it is one of the
more important things we should be worried about ;-)
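
For comparison, here is a small sketch of the kind of error report
that already exists in the simpler case.  It assumes a POSIX poll()
and uses a made-up helper name, closed_fd_reports_pollnval().  It
only covers a descriptor that was already closed when poll() was
entered; it says nothing about a descriptor closed while another
thread is asleep on it, which is the case being debated here.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): returns 1 if poll()
 * flags an already-closed descriptor with POLLNVAL, 0 if it does
 * not, and -1 on error.
 */
int closed_fd_reports_pollnval(void)
{
    int p[2];

    if (pipe(p) != 0)
        return -1;

    /* Close the read end, then poll the now-invalid number. */
    if (close(p[0]) != 0) {
        close(p[1]);
        return -1;
    }

    struct pollfd pfd;
    pfd.fd = p[0];
    pfd.events = POLLIN;
    pfd.revents = 0;

    int n = poll(&pfd, 1, 0);   /* timeout 0: never blocks */

    close(p[1]);

    if (n < 0)
        return -1;
    return (pfd.revents & POLLNVAL) ? 1 : 0;
}
```

The caller gets an immediate, checkable indication instead of a
hang, which is the sort of feedback argued for above.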

-- 
DE


