Date: Fri, 26 Jul 2019 17:57:21 +0100
From: Paul Macdonald <paul@ifdnrg.com>
To: freebsd-questions@freebsd.org
Subject: Re: Help:: Listen queue overflow killing servers
Message-ID: <2798d3f3-9689-111c-e061-1f6f66d78e03@ifdnrg.com>
In-Reply-To: <2b10f991-bc95-ae31-18e2-95ae943ac527@holgerdanske.com>
References: <3a62375a-432c-3533-a7bc-e5573c26fa9c@ifdnrg.com> <2b10f991-bc95-ae31-18e2-95ae943ac527@holgerdanske.com>

On 26/07/2019 17:11, David Christensen wrote:
> On 7/26/19 4:58 AM, Paul Macdonald via freebsd-questions wrote:
>> Over the past few months i've seen several boxes (4 or 5) become
>> unresponsive as a result of a Listen queue overflow state.
>
>> All are on ZFS and are std apache/php/mysql servers with nothing too
>> exotic.
>
>> /var/log/messages shows typically;
>>
>> kernel: sonewconn: pcb 0xfffff813395e3d58: Listen queue
>> overflow: 193 already in queue awaiting acceptance (83 occurrences)
>>
>> netstat -Lan shows
>>
>> tcp4 193/0/128 x.x.x.x.443
>> tcp4 193/0/128 x.x.x.x.80
>
>
> What Apache/ PHP/ MySQL applications? Did you write them? If not,
> who did? Is everything up to date? Have you filed bug reports?
>
>
> Do the applications have logging or debugging capabilities? Have you
> enabled them? What do they say? Where is the blockage? Deadlock?
>
>

These were on servers with multiple vhosts, often running WordPress, but
in one instance not (that one had custom software we wrote in-house, but
that's been in production for 19 years without this issue!).

I suspect it's too low level for application-level debugging. All I know
so far is:

- servers become unresponsive, with Listen queue overflow messages in
  /var/log/messages
- unable to quit jails or even shut down; tcpdrop doesn't work
  (everything in CLOSE_WAIT)
- on the occasion today (and I can't be 100% sure, but I suspect
  always), all the apache processes were in disk wait state, but this
  was on a big new box with a very tiny site (on NVMe)

All servers are on FreeBSD 12 with ZFS, and Apache runs inside a jail
(ezjail).

Multiple load patterns, but 2 out of the 5ish issues don't make much
sense, as there would have been very little load.
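
The three numbers in each "netstat -Lan" row quoted above are
qlen/incqlen/maxqlen: unaccepted connections, unaccepted incomplete
connections, and the maximum queue length. What follows is a minimal
Python sketch, not a tested monitoring tool, that picks out listeners whose
queue has reached its limit. The names ListenQueue, parse_listen_queues and
overflowing are hypothetical, and the "*.22" row in the sample is invented
for contrast; the other two rows come from the message.

```python
import re
from typing import List, NamedTuple


class ListenQueue(NamedTuple):
    """One row of `netstat -Lan` output (hypothetical helper type)."""
    proto: str
    qlen: int      # unaccepted connections waiting in the queue
    incqlen: int   # unaccepted incomplete connections
    maxqlen: int   # configured maximum queue length
    local: str     # local address and port as printed by netstat


# A data row looks like: "tcp4  193/0/128  x.x.x.x.443".
# Header lines contain no N/N/N field, so they do not match.
_ROW = re.compile(r"^(\S+)\s+(\d+)/(\d+)/(\d+)\s+(\S+)\s*$")


def parse_listen_queues(text: str) -> List[ListenQueue]:
    """Parse the data rows of `netstat -Lan` output, skipping headers."""
    queues = []
    for line in text.splitlines():
        match = _ROW.match(line.strip())
        if match:
            proto, qlen, incqlen, maxqlen, local = match.groups()
            queues.append(
                ListenQueue(proto, int(qlen), int(incqlen), int(maxqlen), local)
            )
    return queues


def overflowing(queues: List[ListenQueue]) -> List[ListenQueue]:
    """Return the listeners whose queue length has reached the maximum."""
    return [q for q in queues if q.qlen >= q.maxqlen]


# The two 193/0/128 rows are from the message; the *.22 row is made up.
SAMPLE = """\
Current listen queue sizes (qlen/incqlen/maxqlen)
Proto Listen         Local Address
tcp4  193/0/128      x.x.x.x.443
tcp4  193/0/128      x.x.x.x.80
tcp4  0/0/128        *.22
"""

if __name__ == "__main__":
    for q in overflowing(parse_listen_queues(SAMPLE)):
        print(f"{q.local}: {q.qlen} queued, limit {q.maxqlen}")
```

On a live host the same functions could be fed the captured output of
"netstat -Lan" instead of SAMPLE. Running this from cron or a monitoring
agent is only a suggestion; it would show when the queues start filling,
not why Apache stops accepting.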

Non-reproducible: I have sieged a couple of the affected boxes with no
effect (and logs on a couple of boxes show no interesting traffic, just
normal).

- siege -c 255 -r 2 (pretty stressful)

The target server does now show something in the netstat queues
(0-100/512), but Apache stays out of disk wait. The siege is
(un)successful, as the target copes fine. I have run it multiple times
with no problem, and have now generated about 100,000 more lines in the
Apache log than I saw after the server went down today (a 16C/32T +
128GB + NVMe machine went down earlier with 6600 hits).

I've just hit it with 255 concurrent users over a period of 20 mins, and
it doesn't blink, so it doesn't look like it's load (and that would have
shown up in the logs anyway).

> David
> _______________________________________________
> freebsd-questions@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-questions
> To unsubscribe, send any mail to
> "freebsd-questions-unsubscribe@freebsd.org"
>

-- 
-------------------------
Paul Macdonald
IFDNRG Ltd
Web and video hosting
-------------------------
t: 0131 5548070
m: 07970339546
e: paul@ifdnrg.com
w: http://www.ifdnrg.com
-------------------------
IFDNRG
40 Maritime Street
Edinburgh
EH6 6SA
----------------------------------------------------
Virtual Servers from £50.00pm
High specification Dedicated Servers from £150.00pm
----------------------------------------------------