Date: Fri, 27 Dec 2013 15:33:19 +0400
From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
To: "Denis V. Klimkov" <falcon@tcm.by>
Cc: freebsd-net@freebsd.org
Subject: Re: ipfw verrevpath performance broken in 9.2
Message-ID: <52BD657F.1010405@FreeBSD.org>
In-Reply-To: <27299961.20131227141638@tcm.by>
References: <21356442.20131227093416@tcm.by> <52BD5598.9020100@FreeBSD.org> <27299961.20131227141638@tcm.by>
On 27.12.2013 15:16, Denis V. Klimkov wrote:
> Hello Alexander,
>
> Friday, December 27, 2013, 1:25:28 PM, you wrote:
>
>>> Recently upgraded router system from 9.0-RELEASE to 9.2-STABLE and
>>> got 100% CPU utilisation on all cores with interrupts under the same
>>> load that had about 25-30% CPU utilisation before. Of course that lead
> AVC> Looks interesting.
> AVC> Are you sure all other configs/data load are the same?
>
> Yes, everything was the same. Later changed NIC from 4 igbs to 1 ix.
>
> AVC> I'm particularly interested in changes in: number of NIC queues, their
> AVC> bindings and firewall ruleset.
>
> igb0: <Intel(R) PRO/1000 Network Connection version - 2.3.10> port 0x3020-0x303f mem 0xc6b20000-0xc6b3ffff,0xc6b44000-0xc6b47fff irq 40 at device 0.0 on pci1
> igb0: Using MSIX interrupts with 5 vectors
> igb0: Ethernet address: 00:15:17:b9:ef:dc
> igb0: Bound queue 0 to cpu 0
> igb0: Bound queue 1 to cpu 1
> igb0: Bound queue 2 to cpu 2
> igb0: Bound queue 3 to cpu 3
> igb1: <Intel(R) PRO/1000 Network Connection version - 2.3.10> port 0x3000-0x301f mem 0xc6b00000-0xc6b1ffff,0xc6b40000-0xc6b43fff irq 28 at device 0.1 on pci1
> igb1: Using MSIX interrupts with 5 vectors
> igb1: Ethernet address: 00:15:17:b9:ef:dd
> igb1: Bound queue 0 to cpu 4
> igb1: Bound queue 1 to cpu 5
> igb1: Bound queue 2 to cpu 6
> igb1: Bound queue 3 to cpu 7
> pcib2: <ACPI PCI-PCI bridge> irq 24 at device 3.0 on pci0
> pci2: <ACPI PCI bus> on pcib2
> pcib3: <ACPI PCI-PCI bridge> irq 26 at device 5.0 on pci0
> pci3: <ACPI PCI bus> on pcib3
> igb2: <Intel(R) PRO/1000 Network Connection version - 2.3.10> port 0x2020-0x203f mem 0xc6420000-0xc643ffff,0xc6000000-0xc63fffff,0xc64c4000-0xc64c7fff irq 26 at device 0.0 on pci3
> igb2: Using MSIX interrupts with 5 vectors
> igb2: Ethernet address: 00:1b:21:4a:69:78
> igb2: Bound queue 0 to cpu 8
> igb2: Bound queue 1 to cpu 9
> igb2: Bound queue 2 to cpu 10
> igb2: Bound queue 3 to cpu 11
> igb3: <Intel(R) PRO/1000 Network Connection version - 2.3.10> port 0x2000-0x201f mem 0xc6400000-0xc641ffff,0xc5c00000-0xc5ffffff,0xc64c0000-0xc64c3fff irq 25 at device 0.1 on pci3
> igb3: Using MSIX interrupts with 5 vectors
> igb3: Ethernet address: 00:1b:21:4a:69:79
> igb3: Bound queue 0 to cpu 12
> igb3: Bound queue 1 to cpu 13
> igb3: Bound queue 2 to cpu 14
> igb3: Bound queue 3 to cpu 15
>
> 09000 546827 20995102 deny ip from any to 224.0.0.0/8
> 09900 251418446 34849277439 fwd 127.0.0.1,3333 tcp from table(100) to not table(9) dst-port 80
> 09901 251226827 74150859375 allow tcp from any 80 to table(100) out
> 09999 324676485 22931487657 deny ip from not table(9) to table(100)
> 09999 93075888 5276322115 deny ip from table(100) to not table(9)
> 10000 234714177213 241730704799083 allow ip from table(5) to any
> 10005 245356169 18235355072 deny ip from any to any dst-port 135,137-139,445 out
> 10006 2929342953 182985124889 deny ip from table(104) to any
> 10020 688240709 620932403164 divert 8668 ip from any to 1.1.1.1
> 10400 682416642 620798165276 allow ip from any to any diverted
>
> 10770 73183544 9041870946 deny ip from table(2) to any out via vlan18
> 10772 11698 802274 deny ip from table(3) to any out via vlan4
> 10773 8807403 463870927 deny ip from any to table(2) out iptos reliability
> 10774 4923414 300617694 deny ip from any to table(3) out iptos reliability
> 10775 99485 4397077 deny ip from any to table(3) out iptos throughput

It is probably worth doing something like putting all verrevpath rules before the upper "out" rules, writing an explicit 'skipto X ip from any to any out', and adding something like 'allow ip from any to any in' after all verrevpath rules, so the ruleset is split into separate in/out parts.
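Roughly, a sketch of that split (untested; the rule numbers are placeholders, only two of the vlans are shown, and the existing fwd/divert/pipe rules would still have to be fitted into the correct half):

  # outbound traffic jumps straight to the out-only section starting at 10000
  ipfw add 5000 skipto 10000 ip from any to any out
  # the per-vlan source checks now only ever see inbound packets
  ipfw add 5010 deny ip from any to any not verrevpath in via vlan6
  ipfw add 5020 deny ip from any to any not verrevpath in via vlan7
  # ... one rule per vlan, as in the current ruleset ...
  # inbound traffic that passed the checks is finished here
  ipfw add 9990 allow ip from any to any in
  # the current out-only rules (10000 and up) continue unchanged from here

Inbound packets then stop at the final allow instead of walking all the out-only rules, and outbound packets never touch the per-vlan verrevpath checks.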
> 11010 3659429 430047150 deny ip from any to any not verrevpath in via vlan6
> 11020 719931 58619220 deny ip from any to any not verrevpath in via vlan7
> 11025 68141 5144481 deny ip from any to any not verrevpath in via vlan8
> 11030 202144 6785732 deny ip from any to any not verrevpath in via vlan9
> 11040 171291 56196945 deny ip from any to any not verrevpath in via vlan10
> 11045 291914032 39427773226 deny ip from any to any not verrevpath in via vlan11
> 11060 6102962 441745213 deny ip from any to any not verrevpath in via vlan15
> 11070 4832442 1259880158 deny ip from any to any not verrevpath in via vlan16
> 11080 814769 95745079 deny ip from any to any not verrevpath in via vlan17
> 11101 2901098 628552748 deny ip from any to any not verrevpath in via vlan26
> 11102 1264750 146468688 deny ip from any to any not verrevpath in via vlan27
> 11110 902441 294155831 deny ip from any to any not verrevpath in via vlan21
> 11120 628324 31060933 deny ip from any to any not verrevpath in via vlan23
> 11130 1381 83245 deny ip from any to any not verrevpath in via vlan24
> 11138 4258607 3389925416 deny ip from any to any not verrevpath in via vlan31
> 11150 56 2792 deny ip from any to any not verrevpath in via vlan40 (X)
> 15000 3363576 188412499 deny ip from not table(30) to table(31) out
> 19950 64832991 3461330324 deny tcp from table(25) to not table(8) dst-port 25 out
> 19960 693595 34424883 deny ip from table(101) to table(103) out
> 19970 466690 57539243 deny ip from not table(30) to me dst-port 161,162,21,3306
> 20000 35523656903 32569055261754 pipe tablearg ip from any to table(1) out iptos reliability
> 20010 36208900912 9635678183009 pipe tablearg ip from table(6) to any out via vlan18
> 20020 6963415930 5823875049163 pipe tablearg ip from any to table(10) out
> 20030 5370808609 1175572076679 pipe tablearg ip from table(11) to any out
> 60005 3749710 1625777707 deny udp from any to 2.2.2.100 dst-port 5060
> 60005 7940451 2910219814 deny udp from any to 2.2.2.1 dst-port 5060
> 60020 578206 71125954 divert 8668 ip from 192.168.0.0/16 to any out via vlan4
> 60020 120740 17363073 divert 8668 ip from 192.168.0.0/16 to any out via vlan5
> 60020 6485285 2421107818 divert 8668 ip from 192.168.0.0/16 to any out via vlan18
> 60020 22096 1876197 divert 8668 ip from 192.168.0.0/16 to any out via vlan11
> 60600 529456103 183816441399 allow ip from any to any diverted
> 62110 2482047796 207871928397 deny ip from not table(32) to any out via vlan18
> 62120 34184526 40243097237 allow ip from 3.3.3.0/24 to 3.3.3.0/24 via vlan4
> 62130 19323045 1282467423 deny ip from not table(32) to any out via vlan4
> 62140 21168902 1790816969 deny ip from any to not table(32) in via vlan4
> 64000 8160465887601 5338926261446363 allow ip from any to any
> 65000 1165747 214509370 allow ip from any to any
> 65535 5625 3645710 deny ip from any to any
>
> AVC> Can you share your traffic rate (e.g. netstat -i -w1), cpu info and NIC
> AVC> info?
>
> Now it's:
> # netstat -i -w1
>            input        (Total)           output
>   packets  errs idrops      bytes   packets  errs      bytes colls
>    312136     0      0  216478043    312375     0  216359751     0
>    311760     0      0  217559784    311654     0  217792531     0
>    295196     0      0  203318550    295319     0  211926680     0
>    300204     0      0  206880841    300219     0  206348483     0
>    297019     0      0  203171215    296930     0  207103301     0
>    308142     0      0  211553806    308294     0  207969407     0
>    320911     0      0  221584256    320955     0  218811245     0
>
> CPU: Intel(R) Xeon(R) CPU E5520 @ 2.27GHz (2261.30-MHz 686-class CPU)
>
> AVC> What does system load (without verrevpath) look like in comparison with
> AVC> 9.0 (in terms of CPU _and_ packets/sec)?
>
> Sorry, cannot compare it. Old graphs are lost. AFAIR it was up to 30 LA
> in peak times when there was about 400+ kpps in and the same out. I can
> try to add some rules with verrevpath now on the 9.2 system.
>
> Without verrevpath rules, top -ISHP shows:
> last pid: 58440;  load averages: 2.52, 2.52, 2.51  up 1+06:25:38  14:05:02
> 268 processes: 17 running, 177 sleeping, 74 waiting
> CPU 0:   0.0% user,  0.0% nice,  0.0% system, 28.2% interrupt, 71.8% idle
> CPU 1:   0.0% user,  0.0% nice,  0.0% system, 38.0% interrupt, 62.0% idle
> CPU 2:   0.4% user,  0.0% nice,  0.8% system, 29.8% interrupt, 69.0% idle
> CPU 3:   0.0% user,  0.0% nice,  0.4% system, 26.7% interrupt, 72.9% idle
> CPU 4:   0.0% user,  0.0% nice,  0.8% system, 32.5% interrupt, 66.7% idle
> CPU 5:   0.0% user,  0.0% nice,  0.8% system, 31.4% interrupt, 67.8% idle
> CPU 6:   0.0% user,  0.0% nice,  0.0% system, 30.2% interrupt, 69.8% idle
> CPU 7:   0.0% user,  0.0% nice,  0.0% system, 32.2% interrupt, 67.8% idle
> CPU 8:   0.0% user,  0.0% nice,  0.8% system,  0.0% interrupt, 99.2% idle
> CPU 9:   0.8% user,  0.0% nice,  0.0% system,  0.0% interrupt, 99.2% idle
> CPU 10:  0.4% user,  0.0% nice,  1.2% system,  0.0% interrupt, 98.4% idle
> CPU 11:  0.0% user,  0.0% nice,  0.0% system,  0.8% interrupt, 99.2% idle
> CPU 12:  0.4% user,  0.0% nice,  0.0% system,  0.8% interrupt, 98.8% idle
> CPU 13:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
> CPU 14:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 15:  0.0% user,  0.0% nice,  0.8% system,  0.0% interrupt, 99.2% idle
>
> netstat -iw 1
>            input        (Total)           output
>   packets  errs idrops      bytes   packets  errs      bytes colls
>      322k     0      0       219M      322k     0       220M     0
>      324k     0      0       224M      324k     0       222M     0
>      325k     0      0       227M      325k     0       227M     0
>      352k     0      0       247M      352k     0       242M     0
>
> After adding verrevpath rules:
> last pid: 58471;  load averages: 3.19, 2.82, 2.64  up 1+06:30:04  14:09:28
> 270 processes: 21 running, 179 sleeping, 70 waiting
> CPU 0:   0.0% user,  0.0% nice,  0.4% system, 51.4% interrupt, 48.2% idle
> CPU 1:   0.0% user,  0.0% nice,  0.4% system, 44.7% interrupt, 54.9% idle
> CPU 2:   0.0% user,  0.0% nice,  0.8% system, 37.6% interrupt, 61.6% idle
> CPU 3:   0.0% user,  0.0% nice,  0.0% system, 38.8% interrupt, 61.2% idle
> CPU 4:   0.4% user,  0.0% nice,  0.0% system, 38.8% interrupt, 60.8% idle
> CPU 5:   0.0% user,  0.0% nice,  0.4% system, 41.2% interrupt, 58.4% idle
> CPU 6:   0.4% user,  0.0% nice,  0.4% system, 43.9% interrupt, 55.3% idle
> CPU 7:   0.0% user,  0.0% nice,  0.0% system, 41.6% interrupt, 58.4% idle
>
> Looks like these rules do not affect the load now the way they did
> before. But the NIC configuration now differs. There was
> ifconfig_lagg0="laggproto loadbalance laggport igb0 laggport igb1 laggport igb2 laggport igb3"
> and all vlans over lagg0. Now it is one ix0 without lagg and all vlans
> are over ix0.

Lagg (and queue bindings for multiple interfaces) is a _very_ different story.
Are you willing to test some patches related to verrevpath / general performance improvements?

Btw, do you have fastforwarding turned on?

>
> ---
> Denis V. Klimkov
>
>
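(For reference on the fastforwarding question: on FreeBSD 9.x this is the net.inet.ip.fastforwarding sysctl, off by default. A minimal sketch of checking and enabling it, assuming the stock knob and /etc/sysctl.conf for persistence:)

  # show the current setting (0 = off, 1 = on)
  sysctl net.inet.ip.fastforwarding
  # turn it on at runtime
  sysctl net.inet.ip.fastforwarding=1
  # keep it across reboots
  echo 'net.inet.ip.fastforwarding=1' >> /etc/sysctl.conf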