From owner-freebsd-net@FreeBSD.ORG Sat Nov 10 09:13:20 2007 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B1EA816A468 for ; Sat, 10 Nov 2007 09:13:20 +0000 (UTC) (envelope-from mattjreimer@gmail.com) Received: from py-out-1112.google.com (py-out-1112.google.com [64.233.166.182]) by mx1.freebsd.org (Postfix) with ESMTP id 68DE913C4B5 for ; Sat, 10 Nov 2007 09:13:20 +0000 (UTC) (envelope-from mattjreimer@gmail.com) Received: by py-out-1112.google.com with SMTP id u77so138491pyb for ; Sat, 10 Nov 2007 01:13:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; bh=R4LiWfTo+A1f8ADYr6khO9N9wR2Fz9xV9vu0F7G4Hzs=; b=ravl5lta4rm7Y7SAbNY8C15p+LP/UGzxO+l7Q27HXCSEj9ZncgyT9mQsK5osHLxjlRbxFwMjWF99quAUTRPxfZKXvNt9f0+AsVuIRoEhZ4l04OEWsfCINtXMwaSd4l80l4VnysP40tEXZhgekosxfdFGkDWhrOaMOMFCR3RUUXo= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=bWjzJOWgy6fYxPOe3QyvxbIIaTOomWpwDVXvn1YWr4xq6YQM0Aogye9UOw0TJAH979FoaTpYtp3SXPnzRJ7vK2ss+AkzU0w+ckfKG3gmy3n5Tp+yMVyFnmWhxgZQTQV7a/0rTzyGZSgDXrFmrOnipZ7cfB9K+Zq2hDP28QIIpsU= Received: by 10.35.69.11 with SMTP id w11mr3147948pyk.1194685991663; Sat, 10 Nov 2007 01:13:11 -0800 (PST) Received: by 10.35.67.16 with HTTP; Sat, 10 Nov 2007 01:13:11 -0800 (PST) Message-ID: Date: Sat, 10 Nov 2007 01:13:11 -0800 From: "Matt Reimer" To: "Mike Silbersack" In-Reply-To: <20071110020333.I46803@odysseus.silby.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <20071109213846.O46803@odysseus.silby.com> <20071110020333.I46803@odysseus.silby.com> Cc: net@freebsd.org Subject: Re: Should syncache.count ever be negative? X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 10 Nov 2007 09:13:20 -0000 On Nov 10, 2007 12:13 AM, Mike Silbersack wrote: > > On Fri, 9 Nov 2007, Matt Reimer wrote: > > > I first noticed this problem running ab; then to simplify I used > > netrate/http[d]. What's strange is that it seems fine over the local > > network (~15800 requests/sec), but it slowed down dramatically (~150 > > req/sec) when tested from another network 20 ms away. Running systat > > -tcp and nload I saw that there was an almost complete stall with only > > a handful of packets being sent (probably my ssh packets) for a few > > seconds or sometimes even up to 60 seconds or so. > > I think most benchmarking tools end up stalling if all of their threads > stall, that may be why the rate falls off after the misbehavior you > describe below begins. Ok. FWIW, I'm seeing the same behavior with tools/netrate/http as I am with ab. > > Nov 9 19:02:34 wordpress1 kernel: TCP: [207.210.67.2]:64851 to > > [66.230.193.105]:80; syncache_socket: Socket create failed due to > > limits or memory shortage > > Nov 9 19:02:34 wordpress1 kernel: TCP: [207.210.67.2]:64851 to > > [66.230.193.105]:80 tcpflags 0x10; tcp_input: Listen socket: > > Socket allocation failed due to limits or memory shortage, sending RST > > Turns out you'll generally get both of those error messages together, from > my reading of the code. > > Since you eliminated memory shortage in the socket zone, the next thing to > check is the length of the listen queues. If the listen queue is backing > up because the application isn't accepting fast enough, the errors above > should happen. "netstat -Lan" should show you what's going on there. > Upping the specified listen queue length in your webserver _may_ be all > that is necessary. Try fiddling with that and watching how much they're > filling up during testing. I ran "netstat -Lan" every second while running this test and the output never changed from the following, whether before or after the stall: Current listen queue sizes (qlen/incqlen/maxqlen) Proto Listen Local Address tcp4 0/0/128 66.230.193.105.80 tcp4 0/0/10 127.0.0.1.25 tcp4 0/0/128 *.22 tcp4 0/0/128 *.199 > The fact that you see the same port repeatedly may indicate that the > syncache isn't destroying the syncache entries when you get the socket > creation failure. Take a look at "netstat -n" and look for SYN_RECEIVED > entries - if they're sticking around for more than a few seconds, this is > probably what's happening. (This entire paragraph is speculation, but > worth investigating.) During the stall the sockets are all in TIME_WAIT. More relevant info: kern.ipc.maxsockets: 12328 kern.ipc.numopensockets: 46 net.inet.ip.portrange.randomtime: 45 net.inet.ip.portrange.randomcps: 10 net.inet.ip.portrange.randomized: 1 net.inet.ip.portrange.reservedlow: 0 net.inet.ip.portrange.reservedhigh: 1023 net.inet.ip.portrange.hilast: 65535 net.inet.ip.portrange.hifirst: 49152 net.inet.ip.portrange.last: 65535 net.inet.ip.portrange.first: 30000 net.inet.ip.portrange.lowlast: 600 net.inet.ip.portrange.lowfirst: 1023 net.inet.tcp.finwait2_timeout: 60000 net.inet.tcp.fast_finwait2_recycle: 0 [root@wordpress1 /sys/dev]# netstat -m 513/5382/5895 mbufs in use (current/cache/total) 511/3341/3852/25600 mbuf clusters in use (current/cache/total/max) 1/1663 mbuf+clusters out of packet secondary zone in use (current/cache) 0/488/488/0 4k (page size) jumbo clusters in use (current/cache/total/max) 0/0/0/0 9k jumbo clusters in use (current/cache/total/max) 0/0/0/0 16k jumbo clusters in use (current/cache/total/max) 1150K/9979K/11129K bytes allocated to network (current/cache/total) 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters) 0/0/0 requests for jumbo clusters denied (4k/9k/16k) 0/0/0 sfbufs in use (current/peak/max) 0 requests for sfbufs denied 0 requests for sfbufs delayed 17 requests for I/O initiated by sendfile 0 calls to protocol drain routines > > I don't know if it's relevant, but accf_http is loaded on wordpress1. > > That may be relevant - accepting filtering changes how the listen queues > are used. Try going back to non-accept filtering for now. It still stalls. This time I noticed that tcptw shows 0 free: socket: 696, 12330, 14, 126, 10749, 0 unpcb: 248, 12330, 5, 70, 75, 0 ipq: 56, 819, 0, 0, 0, 0 udpcb: 280, 12334, 2, 40, 184, 0 inpcb: 280, 12334, 2485, 105, 10489, 0 tcpcb: 688, 12330, 7, 73, 10489, 0 tcptw: 88, 2478, 2478, 0, 2478, 7231 syncache: 112, 15378, 1, 65, 9713, 0 hostcache: 136, 15372, 0, 0, 0, 0 tcpreass: 40, 1680, 0, 0, 0, 0 sackhole: 32, 0, 0, 0, 0, 0 But even while tcptw shows 0 free, I can still blast 15800 req/s from another RELENG_7 box to this one during the stall. So I don't know if that means anything. > FWIW, my crazy theory of the moment is this: We have some bug that > happens when the listen queues overflow in 7.0, and your test is strenuous > enough to hit the listen queue overflow condition, leading to total > collapse. I'll have to cobble together a test program to see what happens > in the listen queue overflow case. When I use ab I'm telling it to use a max of 100 simultaneous connections (ab -c 100 -n 50000 http://66.230.193.105/). Wouldn't that be well under the limit? > Thanks for the quick feedback, Thank *you*. Matt