Date: Sun, 4 Oct 2009 17:15:18 +0200
From: Luigi Rizzo
To: rihad
Cc: freebsd-net@freebsd.org
Subject: Re: dummynet dropping too many packets
Message-ID: <20091004151518.GB42877@onelab2.iet.unipi.it>
In-Reply-To: <4AC8B6E3.2070101@mail.ru>

On Sun, Oct 04, 2009 at 07:53:23PM +0500, rihad wrote:
> Luigi Rizzo wrote:
> >On Sun, Oct 04, 2009 at 06:47:23PM +0500, rihad wrote:
> >>Hi, we have around 500-600 mbit/s traffic flowing through a 7.1R Dell
> >>PowerEdge w/ 2 GigE bce cards. There are currently around 4 thousand ISP
> >>users online, limited by dummynet pipes of various speeds. According to
> >>netstat -s output, around 500-1000 packets are being dropped every second
> >>(this accounts for wasting around 7-12 mbit/s worth of traffic according
> >>to systat -ifstat):
> >
> >what kind of packets are you seeing as dropped ?
> >please give the output of 'netstat -s | grep drop'
> >
> The output packets, like I said:
> # netstat -s | grep drop
>         2 connections closed (including 2 drops)
>         0 embryonic connections dropped
>         2 connections dropped by rexmit timeout
>         0 connections dropped by persist timeout
>         0 Connections (fin_wait_2) dropped because of timeout
>         0 connections dropped by keepalive
>         0 dropped
>         2 dropped due to no socket
>         0 dropped due to full socket buffers
>         0 fragments dropped (dup or out of space)
>         2 fragments dropped after timeout
>         7538 output packets dropped due to no bufs, etc.
>
> The statistics are zeroed every 15 seconds in another window as I'm
> investigating the issue, but the rate is around 500-1000 lost packets
> every second at the current ~530 mbit/s load.
>
> >At those speeds you might be hitting various limits with your
> >config (e.g. 50k nmbclusters is probably way too small for
>
> I bet it isn't:
> 1967/5009/6976/50000 mbuf clusters in use (current/cache/total/max)
>
> >4k users -- means you have an average of 10-15 buffers per user;
> >the queue size of 350 kbytes = 2.8 Mbits means about 2.8 seconds of
> >buffering at 1 Mbit/s, which is quite high, besides the fact that in
> >order to scale to 4k users you would need over 1GB of kernel memory
> >just for the buffers).
> Aha. Can you be more specific about the kernel memory stuff? Which
> setting needs tweaking?
>
> I have another similar box with 2 em GigE interfaces working @220-230
> mbit/s and virtually no out-of-bufs dropped packets, unlike the bce box
> @500-600 mbit. It too has 350 KByte dummynet queue sizes.
> And it too has adequate mbuf load:
> 3071/10427/13498/25600 mbuf clusters in use (current/cache/total/max)

I think a quick way to tell if the problem is in dummynet/ipfw or
elsewhere would be to reconfigure the pipes (for short times, e.g.
1-2 minutes while you test things) as

# first, try to remove the shaping to see if the drops
# are still present or not
ipfw pipe XX delete; ipfw pipe XX config    # no buffering

# second, do more traffic aggregation to see if the number of
# pipes influences the drops. These are several different
# configs to be tried.
ipfw pipe XX delete; ipfw pipe XX config bw 500Mbits/s
ipfw pipe XX delete; ipfw pipe XX config bw 50Mbits/s mask src-ip 0xffffff00
ipfw pipe XX delete; ipfw pipe XX config bw 5Mbits/s mask src-ip 0xfffffff0

and see if things change. If losses persist even after removing
dummynet, then of course it is a device problem.

Also note that dummynet introduces some burstiness in the output,
which might saturate the output queue in the card (no idea what is
used by bce). This particular phenomenon could be reduced by raising
HZ to 2000 or 4000.

cheers
luigi
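
P.S. To put numbers on the tests above without zeroing the statistics
by hand every 15 seconds, you can just sample the "output packets
dropped due to no bufs" counter before and after each pipe change.
A minimal sh sketch (untested; pipe number 10 and the 60-second
window are only placeholders for your own values):

#!/bin/sh
# sample the ip-level output-drop counter, reconfigure the pipe,
# wait a bit, sample again and print the difference
PIPE=10                        # placeholder -- use your real pipe number
before=`netstat -s -p ip | awk '/output packets dropped/ {print $1}'`
ipfw pipe $PIPE delete
ipfw pipe $PIPE config         # no bw limit, no buffering
sleep 60                       # measurement window
after=`netstat -s -p ip | awk '/output packets dropped/ {print $1}'`
echo "drops in 60s without shaping: `expr $after - $before`"

If the difference is still in the hundreds per second even with the
empty pipe config, then as said above the card/driver side is the
first suspect rather than dummynet.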
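
P.P.S. On 7.x you do not need a kernel rebuild to try the higher HZ
mentioned above: kern.hz is a loader tunable, so something like the
following in /boot/loader.conf (2000 is just the value suggested
above; it takes effect at the next reboot) should do:

# finer-grained timer ticks so dummynet drains its queues more often
# and the output is less bursty
kern.hz="2000"

The price is a little more timer overhead, which at these packet
rates should be negligible.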