From: rihad
Date: Wed, 07 Oct 2009 17:42:48 +0500
To: Robert Watson
Cc: freebsd-net@freebsd.org, Eugene Grosbein, Luigi Rizzo, Julian Elischer
Subject: Re: dummynet dropping too many packets
Message-ID: <4ACC8CC8.8050403@mail.ru>

Robert Watson wrote:
> Suggestions like increasing timer resolution are intended to spread out
> the injection of packets by dummynet to attempt to reduce the peaks of
> burstiness that occur when multiple queues inject packets in a burst
> that exceeds the queue depth supported by combined hardware descriptor
> rings and software transmit queue.
>

Raising HZ from 1000 to 2000 has helped. There are now 200-300 global
drops/s, as opposed to 300-1000 with HZ=1000. Or maybe changing
net.isr.direct from 1 to 0 helped. Or maybe changing hash_size from 64
to 256. Or maybe...

> The two solutions, then are (a) to increase the timer resolution
> significantly so that packets are injected in smaller bursts

But isn't it bad that raising HZ can actually make things worse? From
/sys/conf/NOTES:

# The granularity of operation is controlled by the kernel option HZ whose
# default value (1000 on most architectures) means a granularity of 1ms
# (1s/HZ).  Historically, the default was 100, but finer granularity is
# required for DUMMYNET and other systems on modern hardware.  There are
# reasonable arguments that HZ should, in fact, be 100 still; consider,
# that reducing the granularity too much might cause excessive overhead in
# clock interrupt processing, potentially causing ticks to be missed and thus
# actually reducing the accuracy of operation.

> and (b) increase the queue capacities.  The hardware queue limits likely can't
> be raised w/o new hardware, but the ifnet transmit queue sizes can be
> increased.

Can someone please say how to increase the "ifnet transmit queue sizes"?

> Timer resolution going up is almost certainly not a bad idea in your
> configuration, although does require a reboot as you have observed.
>

OK, I'll try HZ=4000, but some required services (flowtools, radius,
mysql, a perl app) are also running on this machine.

> On a side note: one other possible interpretation of that statistic is
> that you're seeing fragmentation problems.  Usually in forwarding
> scenarios this is unlikely.  However, it wouldn't hurt to make sure you
> have LRO turned off on the network interfaces you're using, assuming
> it's supported by the driver.
>

I don't think fragments are the problem.
The numbers are too small ;-)

$ netstat -s|fgrep fragment
        5318 fragments received
        147 fragments dropped (dup or out of space)
        5157 fragments dropped after timeout
        4088 output datagrams fragmented
        8180 fragments created
        0 datagrams that can't be fragmented

There's no such option as LRO shown, so I guess it's off:

options=1bb
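
To put rough numbers on the HZ discussion above, here is a small sketch of the arithmetic. It assumes that dummynet hands out transmit credit once per clock tick, so a pipe limited to B bit/s may release about B/HZ bits each tick. The per-pipe bandwidth and the number of busy pipes are hypothetical values picked for illustration; they are not taken from this thread.

```python
# Illustrative arithmetic only. PIPE_BANDWIDTH_BPS and ACTIVE_PIPES are
# hypothetical values, not measurements from the machine discussed above.

def tick_ms(hz):
    """Length of one clock tick in milliseconds (1s/HZ, as in NOTES)."""
    return 1000.0 / hz

def bytes_per_tick(bandwidth_bps, hz):
    """Bytes one pipe may release per tick, assuming credit accrues once per tick."""
    return bandwidth_bps / hz / 8.0

PIPE_BANDWIDTH_BPS = 512_000   # hypothetical per-pipe limit, bits per second
ACTIVE_PIPES = 3000            # hypothetical number of simultaneously busy pipes

if __name__ == "__main__":
    for hz in (100, 1000, 2000, 4000):
        per_pipe = bytes_per_tick(PIPE_BANDWIDTH_BPS, hz)
        # Upper bound: every busy pipe releases its full credit on the same tick.
        aggregate = per_pipe * ACTIVE_PIPES
        print(f"HZ={hz:5d}  tick={tick_ms(hz):6.2f} ms  "
              f"per-pipe={per_pipe:7.1f} B/tick  "
              f"aggregate={aggregate / 1024:8.1f} KiB/tick")
```

Under these assumptions, doubling HZ halves the amount of data that can be pushed at the interface queue in a single tick, which matches the "smaller bursts" argument. It says nothing about the clock interrupt overhead that NOTES warns about.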