Date: Tue, 6 Dec 2005 22:34:43 +0300 (MSK)
From: Igor Sysoev <is@rambler-co.ru>
To: John-Mark Gurney <gurney_j@resnet.uoregon.edu>
Cc: freebsd-net@freebsd.org
Subject: Re: strange timeout error returned by kevent() in 6.0
Message-ID: <20051206222847.Y73245@is.park.rambler.ru>
In-Reply-To: <20051206183648.GG55657@funkthat.com>
References: <20050901140051.G11484@is.park.rambler.ru> <20050901182115.F11484@is.park.rambler.ru> <20051206183648.GG55657@funkthat.com>
On Tue, 6 Dec 2005, John-Mark Gurney wrote:

> Igor Sysoev wrote this message on Thu, Sep 01, 2005 at 18:26 +0400:
>> On Thu, 1 Sep 2005, Igor Sysoev wrote:
>>
>>> I found strange timeout errors returned by kevent() in 6.0 using
>>> my http server named nginx. The nginx's run on three machines:
>>> two 4.10-RELEASE and one 6.0-BETA3. All machines serve the same
>>> content (simple cluster) and each handles about 200 requests/second.
>>>
>>> On 6.0 sometimes (2 or 3 times per hour) in the daytime kevent()
>>> returns EV_EOF in flags and ETIMEDOUT in fflags, nevertheless:
>>>
>>> 1) nginx does not set any kernel timeout for sockets;
>>> 2) the total request time for such failed requests is small, 30 and so
>>> seconds.
>>
>> I have changed code to ignore the ETIMEDOUT error returned by kevent()
>> and found that subsequent sendfile() returned the ENOTCONN.
>>
>> By the way, why sendfile() may return ENOTCONN ?
>> I saw this error code on 4.x too.
>
> The reason that you are seeing ETIMEDOUT/ENOTCONN is that the connection
> probably ETIMEDOUT (aka timed out)... and so is ENOTCONN (no longer
> connected).. can you also do a read or a write to the socket successfully?

At least recv() returns ETIMEDOUT. I could not test write() right now.

> and sendfile(3) says:
> ERRORS
> [...]
>
> [ENOTCONN] The s argument points to an unconnected socket.
>
> and a glance at tcp(4) says:
> ERRORS
> [...]
>
> [ETIMEDOUT] when a connection was dropped due to excessive
> retransmissions;
>
> There's the answers...

Yes, it seems that ETIMEDOUT is a retransmission failure. I have seen it
in an experiment. The strange thing is that I did not see this error
on 4.10, only on 6.0 and recently on 4.11. Maybe I will upgrade a cluster
machine from 4.10 to 4.11 to see whether anything changes.

Igor Sysoev
http://sysoev.ru/en/
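
The two error codes discussed in this thread can be told apart in code.
Below is a minimal C sketch, not nginx's actual handling. The helper names
conn_error_reason() and probe_unconnected_socket() are hypothetical and
exist only for illustration. The first maps an error value (for example
one taken from kevent()'s fflags when EV_EOF is set, or from errno after
recv()) to the explanations quoted from the manual pages above. The second
provokes ENOTCONN on purpose by asking a never-connected TCP socket for
its peer address.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/*
 * Hypothetical helper: describe a socket error value such as the one
 * reported in fflags alongside EV_EOF, or left in errno by recv().
 */
static const char *
conn_error_reason(int err)
{
    switch (err) {
    case 0:
        return "no error";
    case ETIMEDOUT:
        /* tcp(4): connection dropped due to excessive retransmissions */
        return "dropped after excessive retransmissions";
    case ENOTCONN:
        /* the socket is not (or is no longer) connected */
        return "socket is not connected";
    default:
        return strerror(err);
    }
}

/*
 * Hypothetical helper: create a TCP socket, never connect it, and ask
 * for its peer address. Returns the resulting errno value, or 0 if the
 * call unexpectedly succeeds.
 */
static int
probe_unconnected_socket(void)
{
    struct sockaddr_in peer;
    socklen_t len = sizeof(peer);
    int fd, err = 0;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return errno;

    if (getpeername(fd, (struct sockaddr *)&peer, &len) == -1)
        err = errno;    /* save before close() can overwrite errno */

    close(fd);
    return err;
}
```

A caller could log conn_error_reason(probe_unconnected_socket()) to see the
ENOTCONN case, and pass the value found in fflags to the same function when
kevent() reports EV_EOF.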