Date:      Sat, 17 Nov 2007 10:10:53 +0300
From:      Igor Sysoev <is@rambler-co.ru>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        freebsd-net@FreeBSD.org
Subject:   Re: bge loader tunables
Message-ID:  <20071117071053.GA18091@rambler-co.ru>
In-Reply-To: <20071117065908.T65479@delplex.bde.org>
References:  <20071116154019.GE93422@rambler-co.ru> <20071117065908.T65479@delplex.bde.org>

On Sat, Nov 17, 2007 at 08:30:58AM +1100, Bruce Evans wrote:

> On Fri, 16 Nov 2007, Igor Sysoev wrote:
> 
> >The attached patch creates the following bge loader tunables:
> 
> I plan to commit old work to do this using sysctls.  Tunables are
> harder to use and aren't needed since changes to the defaults aren't
> needed for booting.  I also implemented dynamic tuning for rx coal
> parameters so that the sysctls are mostly not needed.  Ask for patches
> if you want to test this extensively.

Yes, I can test your patches on 6.2 and 7.0.
Currently bge sets the coalescing parameters at attach time.
Do the sysctls allow changing them on the fly?
How does rx dynamic tuning work?
Can it be turned off?

> >hw.bge.rxd=512
> >
> >Number of standard receive descriptors allocated by the driver.
> >The default value is 256. The maximum value is 512.
> 
> I always use 512 for this.  The corresponding value for jumbo buffers
> is hard-coded (JSLOTS exists to tune the value at config time, like
> SSLOTS does for this, but is no longer used).  Only machines with a
> small amount of memory should care about the wastage from always
> allocating the max number of descriptors.

I agree: the default jumbo rx ring takes 256*9216=2.3M, while even the
maximum standard rx ring would take only 512*2048=1M; nevertheless, the
standard ring is limited to 256*2048=512K.

> >hw.bge.rx_int_delay=500
> >
> >This value delays the generation of receive interrupts in microseconds.
> >The default value is 150 microseconds.
> 
> This is a good default.  I normally use 100 (goes with dynamic tuning to
> limit the rx interrupt rate to 10 kHz).
> 
> >hw.bge.tx_int_delay=500
> >
> >This value delays the generation of transmit interrupts in microseconds.
> >The default value is 150 microseconds.
> 
> I use 1 second.  Infinity works right, except it wastes mbufs when the
> tx is idle for a long time.

1 second seems fine for me: I use sendfile() and a lot of mbuf clusters:
kern.ipc.nmbclusters=196608

> >hw.bge.rx_coal_desc=64
> >
> >This value delays the generation of receive interrupts until the
> >specified number of packets has been received. The default value is 10.
> 
> 64 is a good default.  10 is a bad default (it optimizes too much for
> latency at a cost of efficiency to be good). I use 1 when optimizing
> for latency.  Dynamic tuning sets this to a value suitable for limiting
> the rx interrupt rate to a specified frequency (10 kHz is a good limit).
> 
> >hw.bge.tx_coal_desc=128
> >
> >This value delays the generation of transmit interrupts until the
> >specified number of packets has been transmitted. The default value is 10.
> 
> 128 is a good default.  I use 384.  There are few latency issues here, so
> the default of 10 mainly costs efficiency.

Doesn't 384 delay tx when there is a shortage of free tx descriptors?


-- 
Igor Sysoev
http://sysoev.ru/en/


