Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 04 Aug 2008 10:35:19 -0300
From:      Patrick Tracanelli <eksffa@freebsdbrasil.com.br>
To:        Mike Makonnen <mtm@wubethiopia.com>
Cc:        freebsd-net@freebsd.org
Subject:   Re: Application layer classifier for ipfw
Message-ID:  <48970597.3030407@freebsdbrasil.com.br>
In-Reply-To: <48945A79.50300@wubethiopia.com>
References:  <48918DB5.7020201@wubethiopia.com>	<4891CD13.20600@freebsdbrasil.com.br>	<48922E9D.1020507@elischer.org>	<4893328C.2040105@freebsdbrasil.com.br> <4894447C.3000800@wubethiopia.com> <48945A79.50300@wubethiopia.com>

next in thread | previous in thread | raw e-mail | index | archive | help
Mike Makonnen escreveu:
> Mike Makonnen wrote:
>> Patrick Tracanelli wrote:
>>>
>>> To let you know of my current (real world) tests:
>>>
>>> - Wireless Internet Provider 1:
>>>     - 4Mbit/s of Internet Traffic
>>>     - Classifying default protocols + soulseek + ssh
>>>     - Classifying 100Mbit/s of dump over ssh
>>>
>>> Results in:
>>>     No latency added, very low CPU usage, no packets dropping.
>>>
>>> - Wireless ISP 2:
>>>     - 21 Mbit/s of Internet Traffic
>>>     - Classifying default protocols + soulseek + ssh
>>>
>>> Results in:
>>>     No tcp or udp traffic at all; everything that gets diverted never 
>>> comes out of the divert socket, and ipfw-classifyd logs
>>>
>>> Aug  1 12:07:35 ourofino last message repeated 58 times
>>> Aug  1 12:17:54 ourofino ipfw-classifyd: Loaded Protocol: bittorrent 
>>> (rule 50000)
>>> Aug  1 12:17:54 ourofino ipfw-classifyd: Loaded Protocol: edonkey 
>>> (rule 50000)
>>> Aug  1 12:17:54 ourofino ipfw-classifyd: Loaded Protocol: fasttrack 
>>> (rule 50000)
>>> Aug  1 12:17:54 ourofino ipfw-classifyd: Loaded Protocol: gnutella 
>>> (rule 1000)
>>> Aug  1 12:17:54 ourofino ipfw-classifyd: Loaded Protocol: soulseek 
>>> (rule 50000)
>>> Aug  1 12:17:54 ourofino ipfw-classifyd: Loaded Protocol: ssh   (rule 
>>> 50000)
>>> Aug  1 12:18:28 ourofino ipfw-classifyd: unable to write to divert 
>>> socket: Operation not permitted
>>> Aug  1 12:18:50 ourofino last message repeated 90 times
>>
>> Hmmm... this part means that the call to sendto(2) to write the packet 
>> back into network stack failed.  This explains why you are not seein g 
>> any traffic comming back out of the divert socket, but I don't see why 
>> it would suddenly fail with a permission error. Could this be a kernel 
>> bug?
>>> Aug  1 12:18:51 ourofino ipfw-classifyd: packet dropped: input queue 
>>> full
>>> Aug  1 12:19:11 ourofino last message repeated 94 times
>>>
>>> Raised queue len a lot (up to 40960), when the application starts it 
>>> uses up to 25% CPU and a second after that, CPU usage gets lower the 
>>> 0.1%.
>>
>> This looks like a deadlock. If it weren't able to process packets fast 
>> enough the cpu usage should be high even as it's spewing "packet 
>> dropped" messages. Can you send me some more information like memory 
>> usage and the firewall script you are using? How much of the 21Mbits/s 
>> of traffic is P2P? If you reduce the number of protocols you are 
>> trying to match against does the behavior change? Using netstat -w1 
>> -I<interface> can you tell me how many packets per second we're 
>> talking about for 4Mbits/s and 21Mbit/s? Also, the timestamps from the 
>> log file seem to show that the daemon is running for approx. 34 sec. 
>> before the first "unable to write to write to divert socket" message. 
>> Is it passing traffic during this time? Thanks.
>>
>> I've uploaded a newer version. Can you try that also please. It includes:
>>   o SIGHUP forces it to re-read its configuration file
>>   o rc.d script
>>   o minor optimization (calls pthread_cond_signal with the mutex 
>> unlocked)
>>   o code cleanup
>>
>> Also, for your convenience I have attached a patch against the earlier 
>> version that removes a debugging printf that should remove spammage to 
>> your log files (the current version has it removed already).
>>
> 
> Ooops, a few minutes after I sent this email I found a couple of bugs 
> (one major, and one minor). They were in the original tarball as well as 
> the newer one I uploaded earlier today. I've uploaded a fixed version of 
> the code. Can you try that instead please.
> 
> Also, to help track down performance issues I've modified the Makefile 
> to build a profiled version of the application so you can use gprof(1) 
> to figure out where any problems lie.
> 
> Cheers.
> 

On gcc  3.4.6, -Wno-pointer-sign compiler switch is unknown and 
therefore compiling fails:

cc -O2 -fno-strict-aliasing -pipe  -O2 -pipe -pg -g -Wsystem-headers 
-Werror -Wall -Wno-format-y2k -W  -Wno-unused-parameter 
-Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Wreturn-type 
  -Wcast-qual -Wwrite-strings -Wswitch -Wshadow -Wcast-align 
-Wunused-parameter -Wchar-subscripts -Winline  -Wnested-externs 
-Wredundant-decls -Wno-pointer-sign -c classifyd.c
cc1: error: unrecognized command line option "-Wno-pointer-sign"
*** Error code 1

Stop in /usr/home/setup/ipfw-classifyd.

If removed, does compile fine.

On amd64 does not compile:

cc -O2 -fno-strict-aliasing -pipe  -O2 -pipe -pg -g -Wsystem-headers 
-Werror -Wall -Wno-
format-y2k -W  -Wno-unused-parameter -Wstrict-prototypes 
-Wmissing-prototypes -Wpointer
-arith -Wreturn-type  -Wcast-qual -Wwrite-strings -Wswitch -Wshadow 
-Wcast-align  -Wunus
ed-parameter -Wchar-subscripts -Winline  -Wnested-externs 
-Wredundant-decls -Wno-pointer
-sign -fPIC -pipe -c classifyd.c
cc1: warnings being treated as errors
classifyd.c: In function 'write_pthread':
classifyd.c:813: warning: format '%d' expects type 'int', but argument 4 
has type 'size_
t'
*** Error code 1

The rules Im running:

00001   1379    320702 fwd 192.168.100.11 ip from any to { 
201.91.17.200/29 or dst-ip 189.44.216.168/29 } out via fxp0
00002      0         0 fwd 192.168.100.11 ip from any to { 
201.91.17.200/29 or dst-ip 189.44.216.168/29 } out via fxp2
00003  39137   7760042 fwd 189.52.140.1 ip from 189.52.140.10 to any out 
via fxp0
00004  34462  10801535 fwd 189.52.140.3 ip from 189.52.140.11 to any out 
via fxp0
00005  68780  29374414 fwd 189.52.140.1 ip from 189.52.140.18 to any out 
via fxp0
00006  12154   2003678 fwd 189.52.140.3 ip from 189.52.140.19 to any out 
via fxp0
00500   4741    770166 divert 7777 tcp from any to any via fxp0
00501    172     38599 divert 7777 udp from any to any via fxp0
49999 823215 453482966 allow ip from any to any
65535      0         0 allow ip from any to any

Memory usage while running classifyd:

Mem: 67M Active, 27M Inact, 44M Wired, 24K Cache, 111M Buf, 859M Free
Swap: 1024M Total, 1024M Free

classifyd's CPU usage:

5198 root        3  20    0  6132K  5420K kserel   1:14  24.52% 
ipfw-classifyd

System load averga for 5 minutes is 0.2 while not running classifyd and 
0.8 while running.

Logs with the latest (downloaded a few minutes ago) code:

Aug  4 10:15:29 ourofino ipfw-classifyd: Loaded Protocol: bittorrent 
(rule 50000)
Aug  4 10:15:29 ourofino ipfw-classifyd: Loaded Protocol: edonkey (rule 
50000)
Aug  4 10:15:29 ourofino ipfw-classifyd: Loaded Protocol: fasttrack 
(rule 50000)
Aug  4 10:15:29 ourofino ipfw-classifyd: Loaded Protocol: gnutella (rule 
50000)
Aug  4 10:15:29 ourofino ipfw-classifyd: Loaded Protocol: soulseek (rule 
50000)
Aug  4 10:15:29 ourofino ipfw-classifyd: Loaded Protocol: ssh (rule 50000)
Aug  4 10:15:36 ourofino ipfw-classifyd: packet dropped: input queue full
Aug  4 10:16:07 ourofino last message repeated 1594 times
Aug  4 10:16:19 ourofino last message repeated 766 times

I miss the debug lines! ;D

# netstat -w1 -I fxp0
             input         (fxp0)           output
    packets  errs      bytes    packets  errs      bytes colls
        535     0     286009        515     0     165965     0
        416     0     347448        456     0     153585     0
        444     0     309008        449     0     185087     0
        493     0     190090        474     0     202637     0
        547     0     197882        562     0     233751     0
        409     0     288318        461     0     249421     0
        434     0     173193        464     0     261512     0
        437     0     246516        450     0     258908     0
        458     0     200348        481     0     157800     0
        423     0     250006        427     0      97723     0
        367     0     261323        349     0      96405     0
        592     0     178843        556     0      91032     0
        450     0     248174        432     0     112650     0
        390     0     244258        349     0     106532     0
        442     0     303159        392     0     125664     0
        407     0     237804        350     0      84367     0
        460     0     207586        456     0      86633     0
        364     0     200758        366     0      99798     0
        313     0     241031        325     0      85746     0
        463     0     248284        470     0     113385     0

With above protocols, wont matter if I run with the default queue len, 
with -q 4096 or -q 40960. Its always never enough and I get the "input 
queue full" logs.

However, if I remove add only 1 or 2 protocols (tried bittorrend only 
and later, edonkey only), things do work fine, for a few seconds, but than:

Aug  4 10:22:58 ourofino ipfw-classifyd: Loaded Protocol: bittorrent 
(rule 50000)
Aug  4 10:24:32 ourofino ipfw-classifyd: Loaded Protocol: bittorrent 
(rule 50000)
Aug  4 10:28:02 ourofino ipfw-classifyd: unable to write to divert 
socket: Invalid argument
Aug  4 10:28:33 ourofino last message repeated 20 times
Aug  4 10:29:03 ourofino last message repeated 18 times
Aug  4 10:29:05 ourofino kernel: pid 5654 (ipfw-classifyd), uid 0: 
exited on signal 6 (core dumped)
Aug  4 10:30:06 ourofino ipfw-classifyd: Loaded Protocol: bittorrent 
(rule 50000)
Aug  4 10:30:08 ourofino ipfw-classifyd: packet dropped: input queue full
Aug  4 10:30:09 ourofino last message repeated 186 times
Aug  4 10:30:17 ourofino ipfw-classifyd: unable to write to divert 
socket: Invalid argument
Aug  4 10:30:27 ourofino last message repeated 2 times
Aug  4 10:32:43 ourofino ipfw-classifyd: Loaded Protocol: edonkey (rule 
50000)
Aug  4 10:33:04 ourofino ipfw-classifyd: unable to write to divert 
socket: Invalid argument
Aug  4 10:33:22 ourofino last message repeated 11 times
Aug  4 10:33:23 ourofino kernel: pid 5901 (ipfw-classifyd), uid 0: 
exited on signal 6 (core dumped)

I get the " unable to write to divert socket: Invalid argument" 
constantly and when classifyd dies (core dump with signal 6) I get the 
following debug output:

Assertion failed: (datalen >= 0), function classify_pthread, file 
classifyd.c, line 646.

What other useful output I can send you?

-- 
Patrick Tracanelli



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?48970597.3030407>