Date:      Tue, 6 Dec 2005 22:34:43 +0300 (MSK)
From:      Igor Sysoev <is@rambler-co.ru>
To:        John-Mark Gurney <gurney_j@resnet.uoregon.edu>
Cc:        freebsd-net@freebsd.org
Subject:   Re: strange timeout error returned by kevent() in 6.0
Message-ID:  <20051206222847.Y73245@is.park.rambler.ru>
In-Reply-To: <20051206183648.GG55657@funkthat.com>
References:  <20050901140051.G11484@is.park.rambler.ru> <20050901182115.F11484@is.park.rambler.ru> <20051206183648.GG55657@funkthat.com>

On Tue, 6 Dec 2005, John-Mark Gurney wrote:

> Igor Sysoev wrote this message on Thu, Sep 01, 2005 at 18:26 +0400:
>> On Thu, 1 Sep 2005, Igor Sysoev wrote:
>>
>>> I found strange timeout errors returned by kevent() in 6.0 using
>>> my http server named nginx.  Instances of nginx run on three machines:
>>> two 4.10-RELEASE and one 6.0-BETA3.  All machines serve the same
>>> content (simple cluster) and each handles about 200 requests/second.
>>>
>>> On 6.0 sometimes (2 or 3 times per hour) in the daytime kevent()
>>> returns EV_EOF in flags and ETIMEDOUT in fflags, nevertheless:
>>>
>>> 1) nginx does not set any kernel timeout for sockets;
>>> 2) the total request time for such failed requests is small, 30 or so
>>> seconds.
>>
>> I have changed code to ignore the ETIMEDOUT error returned by kevent()
>> and found that subsequent sendfile() returned the ENOTCONN.
>>
>> By the way, why may sendfile() return ENOTCONN?
>> I saw this error code on 4.x too.
>
> The reason that you are seeing ETIMEDOUT/ENOTCONN is that the connection
> probably ETIMEDOUT (aka timed out)... and so is ENOTCONN (no longer
> connected).. can you also do a read or a write to the socket successfully?

At least recv() returns ETIMEDOUT. I could not test write() right now.
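
For reference, a minimal sketch of two checks that do not consume data
from the socket: reading the pending error with getsockopt(SO_ERROR), and
asking getpeername() whether the socket still has a peer. The helper names
pending_socket_error() and socket_has_peer() are hypothetical and are not
taken from nginx. Note that reading SO_ERROR clears the pending error.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Return the error pending on the socket (0 if none), or -1 if
 * getsockopt() itself fails.  Reading SO_ERROR clears the pending error.
 */
int
pending_socket_error(int fd)
{
    int        err = 0;
    socklen_t  len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
        return -1;

    return err;
}

/*
 * Return 1 if the socket has a peer, 0 if getpeername() reports
 * ENOTCONN, and -1 on any other failure.
 */
int
socket_has_peer(int fd)
{
    struct sockaddr_storage  ss;
    socklen_t                len = sizeof(ss);

    if (getpeername(fd, (struct sockaddr *) &ss, &len) == 0)
        return 1;

    return (errno == ENOTCONN) ? 0 : -1;
}
```

A caller could log pending_socket_error(fd) when kevent() reports EV_EOF
and compare it with the value seen in fflags.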

> and sendfile(3) says:
> ERRORS
> 	[...]
>
>     [ENOTCONN]         The s argument points to an unconnected socket.
>
> and a glance at tcp(4) says:
> ERRORS
> 	[...]
>
>     [ETIMEDOUT]        when a connection was dropped due to excessive
>                        retransmissions;
>
> There's the answers...

Yes, it seems that ETIMEDOUT is a retransmission failure. I have seen it
in an experiment.
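
A minimal sketch of how an event loop could tell these cases apart instead
of ignoring the error and going on to sendfile(). kqueue(2) documents that
for a socket read filter EV_EOF is set in flags and the socket error, if
any, is returned in fflags. The enum and the classify_eof() name are
hypothetical, and the fallback EV_EOF definition is there only so that the
sketch compiles on systems without <sys/event.h>.

```c
#include <errno.h>

#if defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__) \
    || defined(__DragonFly__) || defined(__APPLE__)
#include <sys/types.h>
#include <sys/event.h>
#else
#define EV_EOF  0x8000   /* FreeBSD's value; assumption for non-kqueue builds */
#endif

enum eof_kind {
    EOF_NONE,          /* EV_EOF is not set */
    EOF_CLEAN,         /* peer closed, no socket error */
    EOF_TIMED_OUT,     /* dropped after excessive retransmissions */
    EOF_OTHER_ERROR    /* any other socket error, e.g. ECONNRESET */
};

/* Classify the flags and fflags fields of a struct kevent for a socket. */
enum eof_kind
classify_eof(unsigned int flags, unsigned int fflags)
{
    if (!(flags & EV_EOF))
        return EOF_NONE;

    if (fflags == 0)
        return EOF_CLEAN;

    if (fflags == ETIMEDOUT)
        return EOF_TIMED_OUT;

    return EOF_OTHER_ERROR;
}
```

With a struct kevent ev filled in by kevent(), the call would be
classify_eof(ev.flags, ev.fflags). For EOF_TIMED_OUT the connection is
already gone, so a later sendfile() on that socket can only be expected to
fail.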

The strange thing is that I did not see this error on 4.10, only on 6.0
and recently on 4.11. Maybe I will upgrade a cluster machine from 4.10
to 4.11 to see whether anything changes.


Igor Sysoev
http://sysoev.ru/en/


