Date: Tue, 05 Aug 2008 16:50:29 -0300 From: Patrick Tracanelli <eksffa@freebsdbrasil.com.br> To: Mike Makonnen <mtm@wubethiopia.com> Cc: freebsd-net@freebsd.org Subject: Re: Application layer classifier for ipfw Message-ID: <4898AF05.5020505@freebsdbrasil.com.br> In-Reply-To: <48970597.3030407@freebsdbrasil.com.br> References: <48918DB5.7020201@wubethiopia.com> <4891CD13.20600@freebsdbrasil.com.br> <48922E9D.1020507@elischer.org> <4893328C.2040105@freebsdbrasil.com.br> <4894447C.3000800@wubethiopia.com> <48945A79.50300@wubethiopia.com> <48970597.3030407@freebsdbrasil.com.br>
next in thread | previous in thread | raw e-mail | index | archive | help
Patrick Tracanelli escreveu: > Mike Makonnen escreveu: >> Mike Makonnen wrote: >>> Patrick Tracanelli wrote: >>>> >>>> To let you know of my current (real world) tests: >>>> >>>> - Wireless Internet Provider 1: >>>> - 4Mbit/s of Internet Traffic >>>> - Classifying default protocols + soulseek + ssh >>>> - Classifying 100Mbit/s of dump over ssh >>>> >>>> Results in: >>>> No latency added, very low CPU usage, no packets dropping. >>>> >>>> - Wireless ISP 2: >>>> - 21 Mbit/s of Internet Traffic >>>> - Classifying default protocols + soulseek + ssh >>>> >>>> Results in: >>>> No tcp or udp traffic at all; everything that gets diverted >>>> never comes out of the divert socket, and ipfw-classifyd logs >>>> >>>> Aug 1 12:07:35 ourofino last message repeated 58 times >>>> Aug 1 12:17:54 ourofino ipfw-classifyd: Loaded Protocol: bittorrent >>>> (rule 50000) >>>> Aug 1 12:17:54 ourofino ipfw-classifyd: Loaded Protocol: edonkey >>>> (rule 50000) >>>> Aug 1 12:17:54 ourofino ipfw-classifyd: Loaded Protocol: fasttrack >>>> (rule 50000) >>>> Aug 1 12:17:54 ourofino ipfw-classifyd: Loaded Protocol: gnutella >>>> (rule 1000) >>>> Aug 1 12:17:54 ourofino ipfw-classifyd: Loaded Protocol: soulseek >>>> (rule 50000) >>>> Aug 1 12:17:54 ourofino ipfw-classifyd: Loaded Protocol: ssh >>>> (rule 50000) >>>> Aug 1 12:18:28 ourofino ipfw-classifyd: unable to write to divert >>>> socket: Operation not permitted >>>> Aug 1 12:18:50 ourofino last message repeated 90 times >>> >>> Hmmm... this part means that the call to sendto(2) to write the >>> packet back into network stack failed. This explains why you are not >>> seein g any traffic comming back out of the divert socket, but I >>> don't see why it would suddenly fail with a permission error. Could >>> this be a kernel bug? >>>> Aug 1 12:18:51 ourofino ipfw-classifyd: packet dropped: input queue >>>> full >>>> Aug 1 12:19:11 ourofino last message repeated 94 times >>>> >>>> Raised queue len a lot (up to 40960), when the application starts it >>>> uses up to 25% CPU and a second after that, CPU usage gets lower the >>>> 0.1%. >>> >>> This looks like a deadlock. If it weren't able to process packets >>> fast enough the cpu usage should be high even as it's spewing "packet >>> dropped" messages. Can you send me some more information like memory >>> usage and the firewall script you are using? How much of the >>> 21Mbits/s of traffic is P2P? If you reduce the number of protocols >>> you are trying to match against does the behavior change? Using >>> netstat -w1 -I<interface> can you tell me how many packets per second >>> we're talking about for 4Mbits/s and 21Mbit/s? Also, the timestamps >>> from the log file seem to show that the daemon is running for approx. >>> 34 sec. before the first "unable to write to write to divert socket" >>> message. Is it passing traffic during this time? Thanks. >>> >>> I've uploaded a newer version. Can you try that also please. It >>> includes: >>> o SIGHUP forces it to re-read its configuration file >>> o rc.d script >>> o minor optimization (calls pthread_cond_signal with the mutex >>> unlocked) >>> o code cleanup >>> >>> Also, for your convenience I have attached a patch against the >>> earlier version that removes a debugging printf that should remove >>> spammage to your log files (the current version has it removed already). >>> >> >> Ooops, a few minutes after I sent this email I found a couple of bugs >> (one major, and one minor). They were in the original tarball as well >> as the newer one I uploaded earlier today. I've uploaded a fixed >> version of the code. Can you try that instead please. >> >> Also, to help track down performance issues I've modified the Makefile >> to build a profiled version of the application so you can use gprof(1) >> to figure out where any problems lie. >> >> Cheers. >> > > On gcc 3.4.6, -Wno-pointer-sign compiler switch is unknown and > therefore compiling fails: > > cc -O2 -fno-strict-aliasing -pipe -O2 -pipe -pg -g -Wsystem-headers > -Werror -Wall -Wno-format-y2k -W -Wno-unused-parameter > -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wreturn-type > -Wcast-qual -Wwrite-strings -Wswitch -Wshadow -Wcast-align > -Wunused-parameter -Wchar-subscripts -Winline -Wnested-externs > -Wredundant-decls -Wno-pointer-sign -c classifyd.c > cc1: error: unrecognized command line option "-Wno-pointer-sign" > *** Error code 1 > > Stop in /usr/home/setup/ipfw-classifyd. > > If removed, does compile fine. > > On amd64 does not compile: > > cc -O2 -fno-strict-aliasing -pipe -O2 -pipe -pg -g -Wsystem-headers > -Werror -Wall -Wno- > format-y2k -W -Wno-unused-parameter -Wstrict-prototypes > -Wmissing-prototypes -Wpointer > -arith -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch -Wshadow > -Wcast-align -Wunus > ed-parameter -Wchar-subscripts -Winline -Wnested-externs > -Wredundant-decls -Wno-pointer > -sign -fPIC -pipe -c classifyd.c > cc1: warnings being treated as errors > classifyd.c: In function 'write_pthread': > classifyd.c:813: warning: format '%d' expects type 'int', but argument 4 > has type 'size_ > t' > *** Error code 1 > > The rules Im running: > > 00001 1379 320702 fwd 192.168.100.11 ip from any to { > 201.91.17.200/29 or dst-ip 189.44.216.168/29 } out via fxp0 > 00002 0 0 fwd 192.168.100.11 ip from any to { > 201.91.17.200/29 or dst-ip 189.44.216.168/29 } out via fxp2 > 00003 39137 7760042 fwd 189.52.140.1 ip from 189.52.140.10 to any out > via fxp0 > 00004 34462 10801535 fwd 189.52.140.3 ip from 189.52.140.11 to any out > via fxp0 > 00005 68780 29374414 fwd 189.52.140.1 ip from 189.52.140.18 to any out > via fxp0 > 00006 12154 2003678 fwd 189.52.140.3 ip from 189.52.140.19 to any out > via fxp0 > 00500 4741 770166 divert 7777 tcp from any to any via fxp0 > 00501 172 38599 divert 7777 udp from any to any via fxp0 > 49999 823215 453482966 allow ip from any to any > 65535 0 0 allow ip from any to any > > Memory usage while running classifyd: > > Mem: 67M Active, 27M Inact, 44M Wired, 24K Cache, 111M Buf, 859M Free > Swap: 1024M Total, 1024M Free > > classifyd's CPU usage: > > 5198 root 3 20 0 6132K 5420K kserel 1:14 24.52% > ipfw-classifyd > > System load averga for 5 minutes is 0.2 while not running classifyd and > 0.8 while running. > > Logs with the latest (downloaded a few minutes ago) code: > > Aug 4 10:15:29 ourofino ipfw-classifyd: Loaded Protocol: bittorrent > (rule 50000) > Aug 4 10:15:29 ourofino ipfw-classifyd: Loaded Protocol: edonkey (rule > 50000) > Aug 4 10:15:29 ourofino ipfw-classifyd: Loaded Protocol: fasttrack > (rule 50000) > Aug 4 10:15:29 ourofino ipfw-classifyd: Loaded Protocol: gnutella (rule > 50000) > Aug 4 10:15:29 ourofino ipfw-classifyd: Loaded Protocol: soulseek (rule > 50000) > Aug 4 10:15:29 ourofino ipfw-classifyd: Loaded Protocol: ssh (rule 50000) > Aug 4 10:15:36 ourofino ipfw-classifyd: packet dropped: input queue full > Aug 4 10:16:07 ourofino last message repeated 1594 times > Aug 4 10:16:19 ourofino last message repeated 766 times > > I miss the debug lines! ;D > > # netstat -w1 -I fxp0 > input (fxp0) output > packets errs bytes packets errs bytes colls > 535 0 286009 515 0 165965 0 > 416 0 347448 456 0 153585 0 > 444 0 309008 449 0 185087 0 > 493 0 190090 474 0 202637 0 > 547 0 197882 562 0 233751 0 > 409 0 288318 461 0 249421 0 > 434 0 173193 464 0 261512 0 > 437 0 246516 450 0 258908 0 > 458 0 200348 481 0 157800 0 > 423 0 250006 427 0 97723 0 > 367 0 261323 349 0 96405 0 > 592 0 178843 556 0 91032 0 > 450 0 248174 432 0 112650 0 > 390 0 244258 349 0 106532 0 > 442 0 303159 392 0 125664 0 > 407 0 237804 350 0 84367 0 > 460 0 207586 456 0 86633 0 > 364 0 200758 366 0 99798 0 > 313 0 241031 325 0 85746 0 > 463 0 248284 470 0 113385 0 > > With above protocols, wont matter if I run with the default queue len, > with -q 4096 or -q 40960. Its always never enough and I get the "input > queue full" logs. > > However, if I remove add only 1 or 2 protocols (tried bittorrend only > and later, edonkey only), things do work fine, for a few seconds, but than: > > Aug 4 10:22:58 ourofino ipfw-classifyd: Loaded Protocol: bittorrent > (rule 50000) > Aug 4 10:24:32 ourofino ipfw-classifyd: Loaded Protocol: bittorrent > (rule 50000) > Aug 4 10:28:02 ourofino ipfw-classifyd: unable to write to divert > socket: Invalid argument > Aug 4 10:28:33 ourofino last message repeated 20 times > Aug 4 10:29:03 ourofino last message repeated 18 times > Aug 4 10:29:05 ourofino kernel: pid 5654 (ipfw-classifyd), uid 0: > exited on signal 6 (core dumped) > Aug 4 10:30:06 ourofino ipfw-classifyd: Loaded Protocol: bittorrent > (rule 50000) > Aug 4 10:30:08 ourofino ipfw-classifyd: packet dropped: input queue full > Aug 4 10:30:09 ourofino last message repeated 186 times > Aug 4 10:30:17 ourofino ipfw-classifyd: unable to write to divert > socket: Invalid argument > Aug 4 10:30:27 ourofino last message repeated 2 times > Aug 4 10:32:43 ourofino ipfw-classifyd: Loaded Protocol: edonkey (rule > 50000) > Aug 4 10:33:04 ourofino ipfw-classifyd: unable to write to divert > socket: Invalid argument > Aug 4 10:33:22 ourofino last message repeated 11 times > Aug 4 10:33:23 ourofino kernel: pid 5901 (ipfw-classifyd), uid 0: > exited on signal 6 (core dumped) > > I get the " unable to write to divert socket: Invalid argument" > constantly and when classifyd dies (core dump with signal 6) I get the > following debug output: > > Assertion failed: (datalen >= 0), function classify_pthread, file > classifyd.c, line 646. > > What other useful output I can send you? > I tried figuring out myself some extra clues, but still stucked where I began. -- Patrick Tracanelli FreeBSD Brasil LTDA. Tel.: (31) 3516-0800 316601@sip.freebsdbrasil.com.br http://www.freebsdbrasil.com.br "Long live Hanin Elias, Kim Deal!"
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4898AF05.5020505>