Date:      Wed, 07 Oct 2009 17:42:48 +0500
From:      rihad <rihad@mail.ru>
To:        Robert Watson <rwatson@FreeBSD.org>
Cc:        freebsd-net@freebsd.org, Eugene Grosbein <eugen@kuzbass.ru>, Luigi Rizzo <rizzo@iet.unipi.it>, Julian Elischer <julian@elischer.org>
Subject:   Re: dummynet dropping too many packets
Message-ID:  <4ACC8CC8.8050403@mail.ru>
In-Reply-To: <alpine.BSF.2.00.0910071312420.58146@fledge.watson.org>
References:  <4AC9E29B.6080908@mail.ru> <20091005123230.GA64167@onelab2.iet.unipi.it> <4AC9EFDF.4080302@mail.ru> <4ACA2CC6.70201@elischer.org> <4ACAFF2A.1000206@mail.ru> <4ACB0C22.4000008@mail.ru> <20091006100726.GA26426@svzserv.kemerovo.su> <4ACB42D2.2070909@mail.ru> <20091006142152.GA42350@svzserv.kemerovo.su> <4ACB6223.1000709@mail.ru> <20091006161240.GA49940@svzserv.kemerovo.su> <alpine.BSF.2.00.0910061804340.50283@fledge.watson.org> <4ACC5563.602@mail.ru> <4ACC56A6.1030808@mail.ru> <alpine.BSF.2.00.0910070957430.58146@fledge.watson.org> <4ACC5DEC.1010006@mail.ru> <alpine.BSF.2.00.0910071036280.58146@fledge.watson.org> <4ACC65A0.7030900@mail.ru> <alpine.BSF.2.00.0910071312420.58146@fledge.watson.org>

Robert Watson wrote:
> Suggestions like increasing timer resolution are intended to spread out 
> the injection of packets by dummynet to attempt to reduce the peaks of 
> burstiness that occur when multiple queues inject packets in a burst 
> that exceeds the queue depth supported by combined hardware descriptor 
> rings and software transmit queue.
> 
Raising HZ from 1000 to 2000 has helped. There are now 200-300 global 
drops/s, as opposed to 300-1000 with HZ=1000. Or maybe changing 
net.isr.direct from 1 to 0 helped. Or maybe raising hash_size from 64 
to 256 did. Or maybe...
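
To get a feel for why a higher HZ should shrink the bursts, here is a rough 
back-of-the-envelope sketch. It assumes every active pipe releases its 
per-tick share at the same tick, and the pipe count, pipe rate and packet 
size are made-up illustrative figures, not measurements from this box. Real 
dummynet sends whole packets, so this only models the average.

```python
# Rough estimate of how many packets dummynet injects per timer tick.
# All figures are hypothetical; substitute your own pipe count and rates.

def packets_per_tick(pipe_bps: float, hz: int, packet_bytes: int = 1500) -> float:
    """Average packets one pipe of pipe_bps releases in a tick of 1/hz seconds."""
    bits_per_tick = pipe_bps / hz
    return bits_per_tick / (packet_bytes * 8)


def burst_per_tick(active_pipes: int, pipe_bps: float, hz: int,
                   packet_bytes: int = 1500) -> float:
    """Packets injected in one tick if every active pipe fires together."""
    return active_pipes * packets_per_tick(pipe_bps, hz, packet_bytes)


if __name__ == "__main__":
    # Assumed workload: 500 active pipes at 1 Mbit/s each, 1500-byte packets.
    for hz in (1000, 2000, 4000):
        burst = burst_per_tick(active_pipes=500, pipe_bps=1_000_000, hz=hz)
        print(f"HZ={hz}: ~{burst:.1f} packets per tick")
```

Under these assumptions, doubling HZ halves the per-tick burst; whether that 
fits in the descriptor ring plus the software queue depends on the real 
queue depths.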

> The two solutions, then are (a) to increase the timer resolution 
> significantly so that packets are injected in smaller bursts

But isn't there a risk that this actually makes things worse?  From 
/sys/conf/NOTES:

# The granularity of operation is controlled by the kernel option HZ whose
# default value (1000 on most architectures) means a granularity of 1ms
# (1s/HZ).  Historically, the default was 100, but finer granularity is
# required for DUMMYNET and other systems on modern hardware.  There are
# reasonable arguments that HZ should, in fact, be 100 still; consider,
# that reducing the granularity too much might cause excessive overhead in
# clock interrupt processing, potentially causing ticks to be missed and thus
# actually reducing the accuracy of operation.


> and (b) increase the queue capacities.  The hardware queue limits likely can't 
> be raised w/o new hardware, but the ifnet transmit queue sizes can be 
> increased.

Can someone please say how to increase the "ifnet transmit queue sizes"?

> Timer resolution going up is almost certainly not a bad idea in your 
> configuration, although does require a reboot as you have observed.
> 
OK, I'll try HZ=4000, but this box also runs some required services 
(flowtools, radius, mysql, a perl app).

> On a side note: one other possible interpretation of that statistic is 
> that you're seeing fragmentation problems.  Usually in forwarding 
> scenarios this is unlikely.  However, it wouldn't hurt to make sure you 
> have LRO turned off on the network interfaces you're using, assuming 
> it's supported by the driver.
> 
I don't think fragments are the problem. The numbers are too small ;-)
$ netstat -s|fgrep fragment
         5318 fragments received
         147 fragments dropped (dup or out of space)
         5157 fragments dropped after timeout
         4088 output datagrams fragmented
         8180 fragments created
         0 datagrams that can't be fragmented

There's no such option as LRO shown, so I guess it's off:
options=1bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4>
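
For checking this on several interfaces at once, a minimal sketch that 
parses an ifconfig "options=" line and looks for LRO among the enabled 
capability names. The function name is made up for illustration, and it 
only reports what is listed as enabled; it says nothing about whether the 
driver supports LRO at all.

```python
import re
from typing import Set, Tuple


def parse_if_options(line: str) -> Tuple[int, Set[str]]:
    """Return (hex mask, set of capability names) from an 'options=...<...>' line."""
    m = re.search(r"options=([0-9a-fA-F]+)<([^>]*)>", line)
    if m is None:
        raise ValueError("no options=...<...> field found")
    mask = int(m.group(1), 16)
    names = {name for name in m.group(2).split(",") if name}
    return mask, names


if __name__ == "__main__":
    # The options line quoted above in this message.
    line = ("options=1bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,"
            "JUMBO_MTU,VLAN_HWCSUM,TSO4>")
    mask, names = parse_if_options(line)
    print("mask:", hex(mask))
    print("LRO enabled:", "LRO" in names)
```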



