From: Paul
Date: Sun, 29 Jun 2008 11:26:25 -0400
To: Ingo Flaschberger
Cc: FreeBSD Net, andrew@modulus.org
Subject: Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]
Message-ID: <4867A9A1.9070507@gtcomm.net>
References: <4867420D.7090406@gtcomm.net>

Polling makes no difference. It uses the CPUs in a slightly different
way, but the pps rate is similar. I tried different HZ settings, and I
edited kern_poll so I could have a burst max of 8000. Polling doesn't do
anything any more. The only thing I noticed it does is lower the latency
on packets when the CPU is idle (i.e. not many pps going through).

Hardware system is dual Opteron 2212 with an Intel PCI Express dual-port
server NIC.

em0: Flow control watermarks high = 30720 low = 29220
em0: tx_int_delay = 66, tx_abs_int_delay = 66
em0: rx_int_delay = 64, rx_abs_int_delay = 98
em0: fifo workaround = 0, fifo_reset_count = 0

            input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls
    384700 32541   23851406        457     0      24876     0
    385543 25403   23903670        459     0      24910     0
    383906 24245   23802188        435     0      23738     0
    383656 25182   23786676        427     0      23128     0

em0: tx_int_delay = 66, tx_abs_int_delay = 66
em0: rx_int_delay = 0, rx_abs_int_delay = 98
em0: fifo workaround = 0, fifo_reset_count = 0

            input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls
    393787 11217   24414800        461     0      25012     0
    390227 16909   24194076        439     0      23776     0
    389938 15321   24176158        433     0      23506     0
    388685 19562   24098474        449     0      24370     0
    392908 11242   24360300        465     0      25234     0
    387329 19426   24014402        440     0      23938     0

Ingo Flaschberger wrote:
> Dear Paul,
>
> tried interface polling?
>
> what hardware system? how are the nic's connected?
>
> Kind regards,
>     ingo flaschberger
>
> geschaeftsleitung
> ---------------------------
> netstorage-crossip-flat:fee
> powered by
> crossip communications gmbh
> ---------------------------
> sebastian kneipp gasse 1
> a-1020 wien
> fix: +43-1-726 15 22-217
> fax: +43-1-726 15 22-111
> ---------------------------
> On Sun, 29 Jun 2008, Paul wrote:
>
>> This is just a question but who can get more than 400k pps forwarding
>> performance ?
>> I have tested fbsd 6/7/8 so far with many different configs. (all
>> using intel pci-ex nic and SMP)
>> fbsd 7-stable/8(current) seem to be the fastest and always hit this
>> ceiling of 400k pps. Soon as it hits that I get errors galore.
>> Received no buffers, missed packets, rx overruns.. It's because 'em0
>> taskq' is 90% cpu or so..
>> Now, while this is happening I have two CPU's 100% idle, and the
>> other two CPUs are about 60%/20% ..
>> So why in the world can't it use more cpus?
>> Simple test setup:
>> packet generator on em0
>> destination out em1
>> have to have ip forwarding and fastforwarding on (fastforward
>> definitely makes a big difference, another 100kpps or so, without it
>> can barely hit 300k)
>> Packets are TCP, randomized sources, randomized ports for src and
>> dst, single destination ip.
>> I even tried the yandex driver in FBSD6 but it could barely even get
>> 200k pps and it had a lot of weird issues, and fbsd6 couldn't hit
>> 400k pps by itself.
>> I am not using polling, that seems to make no difference, i tried
>> that too.
>> So question. What can I do for more performance (SMP)? Are there any
>> good kernel options?
>> If I disable ip forwarding i can do 750kpps with no errors because
>> it's not going anywhere..em0 taskq cpu usage is less than half of
>> what it is when it's forwarding. so obviously the issue is somewhere
>> in the forwarding path and fastforwarding greatly helps!! see below.
>>
>> forwarding off:
>>             input          (em0)            output
>>    packets   errs      bytes    packets  errs      bytes colls
>>     757223      0   46947830          1     0        226     0
>>     753551      0   46720166          1     0        178     0
>>     756359      0   46894262          1     0        178     0
>>     757570      0   46969344          1     0        178     0
>>     753724      0   46730830          1     0        178     0
>>     745372      0   46213130          1     0        178     0
>>
>>
>> (I had to slow down the packet generation to about 420-430kpps)
>> forwarding on:
>>             input          (em0)            output
>>    packets   errs      bytes    packets  errs      bytes colls
>>     285918 151029   17726936        460     0      25410     0
>>     284929 146151   17665602        417     0      22642     0
>>     284253 147000   17623690        442     0      23884     0
>>     285438 147765   17697160        448     0      24316     0
>>     286582 147171   17768088        456     0      24748     0
>>     287194 147088   17806032        422     0      22912     0
>>     285812 141713   17720348        440     0      23884     0
>>     284958 137579   17667412        457     0      25104     0
>>
>> fastforwarding on:
>>
>>             input          (em0)            output
>>    packets   errs      bytes    packets  errs      bytes colls
>>     399795  22790   24787310        459     0      25130     0
>>     397425  25254   24640354        434     0      23560     0
>>     403223  26937   24999830        431     0      23452     0
>>     396587  21431   24588398        467     0      25288     0
>>     400970  25776   24860144        459     0      24910     0
>>     397819  23657   24664782        432     0      23452     0
>>     406222  27418   25185768        432     0      23506     0
>>     406718  12407   25216520        461     0      25018     0
>>
>>   PID USERNAME  PRI NICE   SIZE    RES STATE  C   TIME    WCPU COMMAND
>>    11 root      171 ki31     0K    64K CPU1   1  29:24 100.00% {idle: cpu1}
>>    11 root      171 ki31     0K    64K RUN    0  28:46 100.00% {idle: cpu0}
>>    11 root      171 ki31     0K    64K CPU3   3  24:32  84.62% {idle: cpu3}
>>     0 root      -68    0     0K   128K CPU2   2  12:59  84.13% {em0 taskq}
>>     0 root      -68    0     0K   128K -      3   2:12  19.92% {em1 taskq}
>>    11 root      171 ki31     0K    64K RUN    2  19:46  19.63% {idle: cpu2}
>>
>>
>>
>> Well if anything.. at least it's a good show of the difference
>> fastforwarding makes!! :)
>> I have
>> options         NO_ADAPTIVE_MUTEXES     ## Improve routing performance?
>> options         STOP_NMI                # Stop CPUS using NMI instead of IPI
>> no IPV6
>> no firewall loaded
>> no netgraph
>> HZ is 4000
>> em driver is 4096 on receive buffers
>> using VLAN devices (em1 output)
>> Tested on Xeon and Opteron processor
>> Don't have exact results.
>> Above results are dual opteron 2212 with freebsd current
>> FreeBSD 8.0-CURRENT FreeBSD 8.0-CURRENT #0: Sat Jun 28 23:37:39 CDT 2008
>>
>> Well I'm curious of the results of others..
>>
>> Thanks for reading!! :)
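
A rough way to compare the quoted netstat runs is to turn the per-second rows into a loss percentage and an average packet size. The sketch below is Python and is not part of the original mail. It assumes the "errs" column counts packets dropped on top of those in "packets", so the offered load is packets + errs (this matches the stated 420-430 kpps generator rate for the fastforwarding run, but it is still an assumption). The names SAMPLES and summarize are made up for illustration; the numbers are the first two rows of each quoted table.

```python
# Input-side samples copied from the mail: (packets, errs, bytes) per second.
# Only the first two rows of each table are used to keep the sketch short.
SAMPLES = {
    "forwarding off": [
        (757223, 0, 46947830),
        (753551, 0, 46720166),
    ],
    "forwarding on": [
        (285918, 151029, 17726936),
        (284929, 146151, 17665602),
    ],
    "fastforwarding on": [
        (399795, 22790, 24787310),
        (397425, 25254, 24640354),
    ],
}


def summarize(rows):
    """Return average pps, loss percentage and bytes per packet for one run."""
    packets = sum(r[0] for r in rows)
    errs = sum(r[1] for r in rows)
    nbytes = sum(r[2] for r in rows)
    # Assumption: errs are dropped packets not included in the packets column.
    offered = packets + errs
    return {
        "avg_pps": packets / len(rows),
        "loss_pct": 100.0 * errs / offered if offered else 0.0,
        "bytes_per_packet": nbytes / packets if packets else 0.0,
    }


if __name__ == "__main__":
    for label, rows in SAMPLES.items():
        s = summarize(rows)
        print(
            f"{label:18s} {s['avg_pps']:9.0f} pps  "
            f"{s['loss_pct']:5.1f}% loss  "
            f"{s['bytes_per_packet']:5.1f} bytes/pkt"
        )
```

Run as a script, this prints one summary line per run, which makes the gap between plain forwarding and fastforwarding easier to see than the raw counters do.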