Date: Thu, 28 Aug 2008 06:10:59 -0700
From: Jeremy Chadwick <koitsu@FreeBSD.org>
To: Steven Hartland <killing@multiplay.co.uk>
Cc: freebsd-hackers@freebsd.org
Subject: Re: lighttpd failing to accept new connections ( connection reset )
Message-ID: <20080828131059.GA46853@icarus.home.lan>
In-Reply-To: <A4FCC80B7CC742C393346F1FFE7AA18F@multiplay.co.uk>
References: <A4FCC80B7CC742C393346F1FFE7AA18F@multiplay.co.uk>
On Thu, Aug 28, 2008 at 01:13:57PM +0100, Steven Hartland wrote:
> We're using lighttpd here for a new project and we're having issues
> where by it simply stops processing after a 1-2 days.
>
> Having looked at it in some detail this morning it seems that
> the kernel is resetting the connection without notifying the
> lighttpd process there is a new connection attempt. I assume
> that the listen queue is full but why kevent is not notifying
> lighttpd that there are outstanding events is beyond me.
>
>
> The following is a truss of the process which is currently in
> this state:-
> kevent(6,0x0,0,{},11096,{1.000000000}) = 0 (0x0)
> gettimeofday({1219920575.149428},0x0) = 0 (0x0)
> kevent(6,0x0,0,{},11096,{1.000000000}) = 0 (0x0)
> gettimeofday({1219920576.150443},0x0) = 0 (0x0)
>
> ktrace of the operation as well:-
> 28363 lighttpd RET kevent 0
> 28363 lighttpd CALL gettimeofday(0x7fffffffeb20,0)
> 28363 lighttpd RET gettimeofday 0
> 28363 lighttpd CALL kevent(0x6,0,0,0x800e66000,0x2b58,0x7fffffffeb20)
> 28363 lighttpd GIO fd 6 wrote 0 bytes
> ""
> 28363 lighttpd GIO fd 6 read 0 bytes
> ""
> 28363 lighttpd RET kevent 0
> 28363 lighttpd CALL gettimeofday(0x7fffffffeb20,0)
> 28363 lighttpd RET gettimeofday 0
> 28363 lighttpd CALL kevent(0x6,0,0,0x800e66000,0x2b58,0x7fffffffeb20)
> 28363 lighttpd GIO fd 6 wrote 0 bytes
> ""
> 28363 lighttpd GIO fd 6 read 0 bytes
> ""
> 28363 lighttpd RET kevent 0
> 28363 lighttpd CALL gettimeofday(0x7fffffffeb20,0)
> 28363 lighttpd RET gettimeofday 0
> 28363 lighttpd CALL kevent(0x6,0,0,0x800e66000,0x2b58,0x7fffffffeb20)
> 28363 lighttpd GIO fd 6 wrote 0 bytes
> ""
> 28363 lighttpd GIO fd 6 read 0 bytes
> ""
> 28363 lighttpd RET kevent 0
> 28363 lighttpd CALL gettimeofday(0x7fffffffeb20,0)
> 28363 lighttpd RET gettimeofday 0
> 28363 lighttpd CALL kevent(0x6,0,0,0x800e66000,0x2b58,0x7fffffffeb20)
> 28363 lighttpd GIO fd 6 wrote 0 bytes
> ""
> 28363 lighttpd GIO fd 6 read 0 bytes
> ""
> 28363 lighttpd RET kevent 0
> 28363 lighttpd CALL gettimeofday(0x7fffffffeb20,0)
> 28363 lighttpd RET gettimeofday 0
> 28363 lighttpd CALL kevent(0x6,0,0,0x800e66000,0x2b58,0x7fffffffeb20)
> 28363 lighttpd GIO fd 6 wrote 0 bytes
> ""
> 28363 lighttpd GIO fd 6 read 0 bytes
> ""
> 28363 lighttpd RET kevent 0
> 28363 lighttpd CALL gettimeofday(0x7fffffffeb20,0)
> 28363 lighttpd RET gettimeofday 0
> 28363 lighttpd CALL kevent(0x6,0,0,0x800e66000,0x2b58,0x7fffffffeb20)
> 28363 lighttpd GIO fd 6 wrote 0 bytes
> ""
> 28363 lighttpd GIO fd 6 read 0 bytes
> ""
> 28363 lighttpd RET kevent 0
> 28363 lighttpd CALL gettimeofday(0x7fffffffeb20,0)
> 28363 lighttpd RET gettimeofday 0
> 28363 lighttpd CALL kevent(0x6,0,0,0x800e66000,0x2b58,0x7fffffffeb20)
>
>
> tcpdump shows:-
> 12:10:29.475255 IP (tos 0x10, ttl 64, id 9536, offset 0, flags [DF],
> proto: TCP (6), length: 64) client.61224 > server.80: S, cksum 0x6d22
> (incorrect (-> 0xedfa), 291994449:291994449(0) win 65535 <mss
> 1460,nop,wscale 1,nop,nop,timestamp 3661727139 0,sackOK,eol>
> 12:10:29.481396 IP (tos 0x0, ttl 61, id 25503, offset 0, flags [DF],
> proto: TCP (6), length: 60) server.80 > client.61224: S, cksum 0xbf22
> (correct), 3444532576:3444532576(0) ack 291994450 win 65535 <mss
> 1460,nop,wscale 9,sackOK,timestamp 3136311843 3661727139>
> 12:10:29.481419 IP (tos 0x10, ttl 64, id 9538, offset 0, flags [DF],
> proto: TCP (6), length: 52) client.61224 > server.80: ., cksum 0x6d16
> (incorrect (-> 0x6bd2), 1:1(0) ack 1 win 33304 <nop,nop,timestamp
> 3661727145 3136311843>
> 12:10:29.487519 IP (tos 0x10, ttl 61, id 25504, offset 0, flags [DF],
> proto: TCP (6), length: 40) server.80 > client.61224: R, cksum 0x20c7
> (correct), 3444532577:3444532577(0) win 0
>
> This may have been raised before back 2003 as bug kern/57380
> but it was closed after no response from the reporter.
>
> Another possible issues related to this is:-
> http://trac.lighttpd.net/trac/ticket/1734
>
>
> I've currently got one of the production machines offline
> with this error ( hence the important flag ) in the hope
> that someone can suggest a test which will shed more light
> on the issue before I restart it.

Can you change the polling method in lighttpd to use poll or select
instead of kqueue? This would help in determining if the problem is
with the daemon itself or the kevent system.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
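
For reference, the polling method suggested above is selected in the
lighttpd configuration file through the server.event-handler directive.
The fragment below is a minimal sketch, assuming a lighttpd 1.4-style
configuration; the thread does not show the actual config file, so the
"previous" value in the comment is an assumption based on the kevent()
calls visible in the truss output.

```
# Assumed previous setting (not shown in the thread):
#   server.event-handler = "freebsd-kqueue"

# Switch to poll() so that kqueue/kevent is taken out of the picture.
server.event-handler = "poll"

# Alternative: select(). Note that select() is limited to FD_SETSIZE
# descriptors, which may be too low for a busy production server.
# server.event-handler = "select"
```

lighttpd has to be restarted for the change to take effect. If the stall
no longer appears under poll, that would point towards the kevent side;
if it still appears, the daemon itself becomes the more likely suspect.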