Date:      Wed, 07 Oct 2009 15:27:35 +0500
From:      rihad <rihad@mail.ru>
To:        Robert Watson <rwatson@FreeBSD.org>
Cc:        freebsd-net@freebsd.org, Eugene Grosbein <eugen@kuzbass.ru>, Luigi Rizzo <rizzo@iet.unipi.it>, Julian Elischer <julian@elischer.org>
Subject:   Re: dummynet dropping too many packets
Message-ID:  <4ACC6D17.7000405@mail.ru>
In-Reply-To: <4ACC5DEC.1010006@mail.ru>
References:  <4AC9E29B.6080908@mail.ru> <20091005123230.GA64167@onelab2.iet.unipi.it> <4AC9EFDF.4080302@mail.ru> <4ACA2CC6.70201@elischer.org> <4ACAFF2A.1000206@mail.ru> <4ACB0C22.4000008@mail.ru> <20091006100726.GA26426@svzserv.kemerovo.su> <4ACB42D2.2070909@mail.ru> <20091006142152.GA42350@svzserv.kemerovo.su> <4ACB6223.1000709@mail.ru> <20091006161240.GA49940@svzserv.kemerovo.su> <alpine.BSF.2.00.0910061804340.50283@fledge.watson.org> <4ACC5563.602@mail.ru> <4ACC56A6.1030808@mail.ru> <alpine.BSF.2.00.0910070957430.58146@fledge.watson.org> <4ACC5DEC.1010006@mail.ru>

rihad wrote:

> Now the probability of drops (as monitored by netstat -s's "output 
> packets dropped due to no bufs, etc.") is definitely a function of 
> traffic load and the number of items in a ipfw table. I've just 
> decreased the size of the two tables from ~2600 to ~1800 each and the 
> drops instantly went away, even though the traffic passing through the 
> box didn't decrease, it even increased a bit due to now shaping fewer 
> clients (luckily "ipfw pipe tablearg" passes packets failing a table 
> lookup untouched).
>
With ~2100 users in each of the two tables, drops have started coming in
again, but at a very slow rate: about 30-150 in a single burst every
10-20 minutes.



Run every 10 seconds:
$ while :; do netstat -s 2>/dev/null | fgrep -w "output packets dropped"; sleep 10; done
         30900 output packets dropped due to no bufs, etc.
	... 250-300 lines skipped
         30923 output packets dropped due to no bufs, etc.
	... 50-100 lines skipped
         30953 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31165 output packets dropped due to no bufs, etc.
         31444 output packets dropped due to no bufs, etc.
         31444 output packets dropped due to no bufs, etc.
         31444 output packets dropped due to no bufs, etc.
         31549 output packets dropped due to no bufs, etc.
         31549 output packets dropped due to no bufs, etc.
         31549 output packets dropped due to no bufs, etc.
         31549 output packets dropped due to no bufs, etc.
         31549 output packets dropped due to no bufs, etc.
         31549 output packets dropped due to no bufs, etc.
         31678 output packets dropped due to no bufs, etc.
         31678 output packets dropped due to no bufs, etc.
         31678 output packets dropped due to no bufs, etc.
         31678 output packets dropped due to no bufs, etc.
         31678 output packets dropped due to no bufs, etc.
         31678 output packets dropped due to no bufs, etc.
         31678 output packets dropped due to no bufs, etc.
         31678 output packets dropped due to no bufs, etc.

So the larger the number of users (it grows by about 1-2 every 10
seconds as users log in and out), the shorter the pause between the bursts.
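
The counter samples above are easier to read when only the bursts are
shown. The following is a minimal sketch, not something taken from the
thread: drop_bursts is a hypothetical helper name, and it assumes that
each sample yields exactly one "output packets dropped" line, as in the
output above. It prints a line only when the counter grows, together
with the size of the jump.

```shell
#!/bin/sh
# Hypothetical helper (not part of netstat or ipfw): read lines of the form
# "N output packets dropped ..." on stdin and report only the samples where
# the counter increased, along with the size of the increase.
drop_bursts() {
    awk '
        /output packets dropped/ {
            n = $1 + 0                      # first field is the cumulative counter
            if (seen && n > prev)           # skip the first sample: no baseline yet
                printf "sample %d: +%d (total %d)\n", NR, n - prev, n
            prev = n
            seen = 1
        }'
}

# Live use, with the same 10-second sampling loop as above:
# while :; do netstat -s 2>/dev/null | fgrep -w "output packets dropped"; sleep 10; done | drop_bursts
```

Since one sample is taken every 10 seconds, the difference between two
reported sample numbers multiplied by 10 gives the approximate pause
between bursts in seconds.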

net.isr.direct=0
top -SH:
last pid:  2528;  load averages:  0.69,  0.89,  0.96    up 1+02:15:20  15:26:01
165 processes: 12 running, 137 sleeping, 16 waiting
CPU:  9.5% user,  0.0% nice,  3.8% system,  6.9% interrupt, 79.9% idle
Mem: 1726M Active, 1453M Inact, 433M Wired, 178M Cache, 214M Buf, 145M Free
Swap: 2048M Total, 12K Used, 2048M Free

   PID USERNAME   PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
    12 root       171 ki31     0K    16K RUN    6  24.8H 100.00% idle: cpu6
    11 root       171 ki31     0K    16K CPU7   7  24.7H 98.29% idle: cpu7
    13 root       171 ki31     0K    16K CPU5   5  24.3H 98.19% idle: cpu5
    14 root       171 ki31     0K    16K CPU4   4  23.9H 95.41% idle: cpu4
    15 root       171 ki31     0K    16K CPU3   3  23.3H 93.55% idle: cpu3
    16 root       171 ki31     0K    16K CPU2   2  23.4H 87.06% idle: cpu2
    18 root       171 ki31     0K    16K CPU0   0  21.1H 86.72% idle: cpu0
    29 root       -68    -     0K    16K CPU1   1 537:45 47.61% irq256: bce0
    17 root       171 ki31     0K    16K RUN    1 948:22 43.12% idle: cpu1
    19 root       -32    -     0K    16K WAIT   4  53:10  4.25% swi4: clock sio
    31 root       -68    -     0K    16K WAIT   2  58:44  3.86% irq257: bce1
   465 root       -68    -     0K    16K CPU3   3  59:02  1.51% dummynet
    21 root       -44    -     0K    16K WAIT   0  34:58  0.00% swi1: net
     3 root        -8    -     0K    16K -      0   8:15  0.00% g_up


Dummynet's WCPU is mostly 0-4%, but might jump to 6-12% sometimes, 
depending on which fraction of the second you look at it.


