Date: Fri, 14 Feb 2020 12:00:25 -0800
From: BulkMailForRudy <crapsh@monkeybrains.net>
To: freebsd-net@freebsd.org
Subject: Re: Issue with BGP router / high interrupt / Chelsio / FreeBSD 12.1
Message-ID: <9e1c0666-3dea-f946-24d4-e2dea48b30af@monkeybrains.net>
In-Reply-To: <CA%2Bq%2BTcrUyzfNtLGD4Vtc3a0v5MHcxVvED=eK57aGL21LTQzL4w@mail.gmail.com>
References: <1aa78c6e-e640-623c-73d3-473df132eb72@monkeybrains.net> <c921825a-3a9c-cc15-78e6-c7e3776ab12a@monkeybrains.net> <bb6c3997-c369-28c3-9d85-c9cca526e093@monkeybrains.net> <a4c98e33-3aae-f08b-4132-52350a33a56c@monkeybrains.net> <428f3cdf-9035-90a7-14f8-f294c2131682@monkeybrains.net> <CA%2Bq%2BTcrUyzfNtLGD4Vtc3a0v5MHcxVvED=eK57aGL21LTQzL4w@mail.gmail.com>

On 2/14/20 10:00 AM, Olivier Cochard-Labbé wrote:
> On Fri, Feb 14, 2020 at 6:25 PM Rudy <crapsh@monkeybrains.net> wrote:
>
>> On 2/12/20 7:21 PM, Rudy wrote:
>> > I'm having issues with a box that is acting as a BGP router for my
>> > network.  3 Chelsio cards, two T5 and one T6.  It was working great
>> > until I turned up our first port on the T6.  It seems like traffic
>> > passing in from a T5 card and out the T6 causes a really high load (and
>> > high interrupts).
>>
>> Looking better!  I made some changes based on BSDRP which I hadn't known
>> about -- I think ifqmaxlen was the tunable I overlooked.
>>
>> # https://github.com/ocochard/BSDRP/blob/master/BSDRP/Files/boot/loader.conf.local
>> net.link.ifqmaxlen="16384"
>>
> This net.link.ifqmaxlen was set to help in case of lagg usage: I was not
> aware it could improve your use case.

Thanks for the feedback.  Maybe it was a coincidence.  Load has crept back up to 15.

> From your first post, it looks like your setup is 2 packages, 10 cores,
> 20 threads (disabled).
> And you have configured your Chelsio to use 16 queues (hw.cxgbe.Xrx=16):
> It's a good thing to have a power-of-2 number of queues with Chelsio, but
> I'm not sure it's a good idea to spread those queues across the 2 packages.
> So perhaps you should try:
> 1. To reduce to 8 queues and bind them to the local domain
> 2. Or keep 16 queues, but re-enable HyperThreading and bind them to
> the local domain too. (On -head with a recent CPU
> and machdep.hyperthreading_intr_allowed, using hyper-threading improves
> forwarding performance.)
>
> But anyway, even with 16 queues spread over 2 domains, you should have
> better performance:
> https://github.com/ocochard/netbenches/blob/master/Xeon_E5-2650v4_2x12Cores-Chelsio_T520-CR/hw.cxgbe.nXxq/results/fbsd12-stable.r354440.BSDRP.1.96/README.md

OK, I can work on the chelsio_affinity script.

.... hour later ...

OK, tested and updated on github.
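
For readers following along, here is a minimal sketch of the kind of binding being discussed. It is not the chelsio_affinity script itself. It assumes the queue IRQs of one card are contiguous and should be spread round-robin over the CPUs of the card's local domain, and it only prints cpuset(1) commands (`cpuset -l <cpu> -x <irq>`) rather than running them. The function names are hypothetical; the IRQ range 291-322 and CPUs 1-9 are taken from the t6nex0 entries in the listing later in this message.

```python
from itertools import cycle


def plan_irq_affinity(irqs, cpus):
    """Pair each IRQ with a CPU, cycling through the CPU list round-robin."""
    return [(irq, cpu) for irq, cpu in zip(irqs, cycle(cpus))]


def cpuset_commands(plan):
    """Render each (irq, cpu) pair as a FreeBSD cpuset(1) command string."""
    return [f"cpuset -l {cpu} -x {irq}" for irq, cpu in plan]


if __name__ == "__main__":
    # Assumed inputs: t6nex0 queue IRQs 291..322, domain0 CPUs 1..9
    # (cpu0 is left for the err/evt interrupts, as in the listing).
    plan = plan_irq_affinity(range(291, 323), range(1, 10))
    for command in cpuset_commands(plan):
        print(command)
```

Printing the commands instead of executing them makes it easy to review the mapping before applying it on a production router.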

> Note that I never monitor the CPU load during my benches.
> Increasing the hw.cxgbe.holdoff_timer_idx was a good idea: I would expect
> lower interrupt usage too.

I have some standard SNMP monitoring and can correlate the load spinning out of control to ping loss and packet loss.

# vmstat -i | tail -1
Total 12217353774 324329

> Did you monitor the QPI link usage ? (kldload cpuctl && pcm-numa.x)

I haven't.  I'll look into that.  Hoping the numa-domain locking helps.

Currently I have things bound to the right domain, just need to shrink the queue count and reboot!

irq289: t6nex0:err:261 @cpu0(domain0): 0
irq290: t6nex0:evt:263 @cpu0(domain0): 4
irq291: t6nex0:0a0:265 @cpu1(domain0): 0
irq292: t6nex0:0a1:267 @cpu2(domain0): 0
irq293: t6nex0:0a2:269 @cpu3(domain0): 0
irq294: t6nex0:0a3:271 @cpu4(domain0): 0
irq295: t6nex0:0a4:273 @cpu5(domain0): 0
irq296: t6nex0:0a5:275 @cpu6(domain0): 0
irq297: t6nex0:0a6:277 @cpu7(domain0): 0
irq298: t6nex0:0a7:279 @cpu8(domain0): 0
irq299: t6nex0:0a8:281 @cpu9(domain0): 0
irq300: t6nex0:0a9:283 @cpu1(domain0): 0
irq301: t6nex0:0aa:285 @cpu2(domain0): 0
irq302: t6nex0:0ab:287 @cpu3(domain0): 0
irq303: t6nex0:0ac:289 @cpu4(domain0): 0
irq304: t6nex0:0ad:291 @cpu5(domain0): 0
irq305: t6nex0:0ae:293 @cpu6(domain0): 0
irq306: t6nex0:0af:295 @cpu7(domain0): 0
irq307: t6nex0:1a0:297 @cpu8(domain0): 185404641
irq308: t6nex0:1a1:299 @cpu9(domain0): 146802111
irq309: t6nex0:1a2:301 @cpu1(domain0): 133930820
irq310: t6nex0:1a3:303 @cpu2(domain0): 173156318
irq311: t6nex0:1a4:305 @cpu3(domain0): 132151349
irq312: t6nex0:1a5:307 @cpu4(domain0): 149108252
irq313: t6nex0:1a6:309 @cpu5(domain0): 149196634
irq314: t6nex0:1a7:311 @cpu6(domain0): 184211395
irq315: t6nex0:1a8:313 @cpu7(domain0): 151266056
irq316: t6nex0:1a9:315 @cpu8(domain0): 169259534
irq317: t6nex0:1aa:317 @cpu9(domain0): 164117244
irq318: t6nex0:1ab:319 @cpu1(domain0): 157471862
irq319: t6nex0:1ac:321 @cpu2(domain0): 127662140
irq320: t6nex0:1ad:323 @cpu3(domain0): 172750013
irq321: t6nex0:1ae:325 @cpu4(domain0): 173559485
irq322: t6nex0:1af:327 @cpu5(domain0): 227842473
irq323: t5nex0:err:329 @cpu0(domain1): 0
irq324: t5nex0:evt:331 @cpu0(domain1): 8
irq325: t5nex0:0a0:333 @cpu10(domain1): 1340449
irq326: t5nex0:0a1:335 @cpu11(domain1): 1128580
irq327: t5nex0:0a2:337 @cpu12(domain1): 1311599
irq328: t5nex0:0a3:339 @cpu13(domain1): 1157356
irq329: t5nex0:0a4:341 @cpu14(domain1): 1257426
irq330: t5nex0:0a5:343 @cpu15(domain1): 1169697
irq331: t5nex0:0a6:345 @cpu16(domain1): 1089689
irq332: t5nex0:0a7:347 @cpu17(domain1): 1117782
irq333: t5nex0:0a8:349 @cpu18(domain1): 1186770
irq334: t5nex0:0a9:351 @cpu19(domain1): 1147015
irq335: t5nex0:0aa:353 @cpu10(domain1): 1238148
irq336: t5nex0:0ab:355 @cpu11(domain1): 1134259
irq337: t5nex0:0ac:357 @cpu12(domain1): 1262301
irq338: t5nex0:0ad:359 @cpu13(domain1): 1233933
irq339: t5nex0:0ae:361 @cpu14(domain1): 1284298
irq340: t5nex0:0af:363 @cpu15(domain1): 1257873
irq341: t5nex0:1a0:365 @cpu16(domain1): 204307929
irq342: t5nex0:1a1:367 @cpu17(domain1): 221035308
irq343: t5nex0:1a2:369 @cpu18(domain1): 218431173
irq344: t5nex0:1a3:371 @cpu19(domain1): 197270425
irq345: t5nex0:1a4:373 @cpu10(domain1): 181544184
irq346: t5nex0:1a5:375 @cpu11(domain1): 187715982
irq347: t5nex0:1a6:377 @cpu12(domain1): 184945609
irq348: t5nex0:1a7:379 @cpu13(domain1): 161060780
irq349: t5nex0:1a8:381 @cpu14(domain1): 162546561
irq350: t5nex0:1a9:383 @cpu15(domain1): 188539721
irq351: t5nex0:1aa:385 @cpu16(domain1): 153407315
irq352: t5nex0:1ab:387 @cpu17(domain1): 171904505
irq353: t5nex0:1ac:389 @cpu18(domain1): 163256903
irq354: t5nex0:1ad:391 @cpu19(domain1): 162976257
irq355: t5nex0:1ae:393 @cpu10(domain1): 186167299
irq356: t5nex0:1af:395 @cpu11(domain1): 205566989
irq357: t5nex0:2a0:397 @cpu12(domain1): 113070700
irq358: t5nex0:2a1:399 @cpu13(domain1): 172641475
irq359: t5nex0:2a2:401 @cpu14(domain1): 121577604
irq360: t5nex0:2a3:403 @cpu15(domain1): 109659638
irq361: t5nex0:2a4:405 @cpu16(domain1): 112705459
irq362: t5nex0:2a5:407 @cpu17(domain1): 127206944
irq363: t5nex0:2a6:409 @cpu18(domain1): 109712072
irq364: t5nex0:2a7:411 @cpu19(domain1): 108579249
irq365: t5nex0:2a8:413 @cpu10(domain1): 121687614
irq366: t5nex0:2a9:415 @cpu11(domain1): 100657878
irq367: t5nex0:2aa:417 @cpu12(domain1): 99212108
irq368: t5nex0:2ab:419 @cpu13(domain1): 107358669
irq369: t5nex0:2ac:421 @cpu14(domain1): 114883419
irq370: t5nex0:2ad:423 @cpu15(domain1): 104580916
irq371: t5nex0:2ae:425 @cpu16(domain1): 107601764
irq372: t5nex0:2af:427 @cpu17(domain1): 116284819
irq373: t5nex0:3a0:429 @cpu18(domain1): 341626
irq374: t5nex0:3a1:431 @cpu19(domain1): 254931
irq375: t5nex0:3a2:433 @cpu10(domain1): 273165
irq376: t5nex0:3a3:435 @cpu11(domain1): 254925
irq377: t5nex0:3a4:437 @cpu12(domain1): 254915
irq378: t5nex0:3a5:439 @cpu13(domain1): 254917
irq379: t5nex0:3a6:441 @cpu14(domain1): 254942
irq380: t5nex0:3a7:443 @cpu15(domain1): 254943
irq381: t5nex0:3a8:445 @cpu16(domain1): 254928
irq382: t5nex0:3a9:447 @cpu17(domain1): 254936
irq383: t5nex0:3aa:449 @cpu18(domain1): 254941
irq384: t5nex0:3ab:451 @cpu19(domain1): 254927
irq385: t5nex0:3ac:453 @cpu10(domain1): 255604
irq386: t5nex0:3ad:455 @cpu11(domain1): 254923
irq387: t5nex0:3ae:457 @cpu12(domain1): 254937
irq388: t5nex0:3af:459 @cpu13(domain1): 254931
irq389: t5nex1:err:461 @cpu0(domain1): 0
irq390: t5nex1:evt:463 @cpu0(domain1): 5
irq391: t5nex1:0a0:465 @cpu14(domain1): 0
irq392: t5nex1:0a1:467 @cpu15(domain1): 0
irq393: t5nex1:0a2:469 @cpu16(domain1): 0
irq394: t5nex1:0a3:471 @cpu17(domain1): 0
irq395: t5nex1:0a4:473 @cpu18(domain1): 0
irq396: t5nex1:0a5:475 @cpu19(domain1): 0
irq397: t5nex1:0a6:477 @cpu10(domain1): 0
irq398: t5nex1:0a7:479 @cpu11(domain1): 0
irq399: t5nex1:0a8:481 @cpu12(domain1): 0
irq400: t5nex1:0a9:483 @cpu13(domain1): 0
irq401: t5nex1:0aa:485 @cpu14(domain1): 0
irq402: t5nex1:0ab:487 @cpu15(domain1): 0
irq403: t5nex1:0ac:489 @cpu16(domain1): 0
irq404: t5nex1:0ad:491 @cpu17(domain1): 0
irq405: t5nex1:0ae:493 @cpu18(domain1): 0
irq406: t5nex1:0af:495 @cpu19(domain1): 0
irq407: t5nex1:1a0:497 @cpu10(domain1): 0
irq408: t5nex1:1a1:499 @cpu11(domain1): 0
irq409: t5nex1:1a2:501 @cpu12(domain1): 0
irq410: t5nex1:1a3:503 @cpu13(domain1): 0
irq411: t5nex1:1a4:505 @cpu14(domain1): 0
irq412: t5nex1:1a5:507 @cpu15(domain1): 0
irq413: t5nex1:1a6:509 @cpu16(domain1): 0
irq414: t5nex1:1a7:511 @cpu17(domain1): 0
irq415: t5nex1:1a8:513 @cpu18(domain1): 0
irq416: t5nex1:1a9:515 @cpu19(domain1): 0
irq417: t5nex1:1aa:517 @cpu10(domain1): 0
irq418: t5nex1:1ab:519 @cpu11(domain1): 0
irq419: t5nex1:1ac:521 @cpu12(domain1): 0
irq420: t5nex1:1ad:523 @cpu13(domain1): 0
irq421: t5nex1:1ae:525 @cpu14(domain1): 0
irq422: t5nex1:1af:527 @cpu15(domain1): 0
irq423: t5nex1:2a0:529 @cpu16(domain1): 159872451
irq424: t5nex1:2a1:531 @cpu17(domain1): 154946549
irq425: t5nex1:2a2:533 @cpu18(domain1): 163392585
irq426: t5nex1:2a3:535 @cpu19(domain1): 248248091
irq427: t5nex1:2a4:537 @cpu10(domain1): 151825795
irq428: t5nex1:2a5:539 @cpu11(domain1): 211623937
irq429: t5nex1:2a6:541 @cpu12(domain1): 146996842
irq430: t5nex1:2a7:543 @cpu13(domain1): 149654776
irq431: t5nex1:2a8:545 @cpu14(domain1): 159051009
irq432: t5nex1:2a9:547 @cpu15(domain1): 147511578
irq433: t5nex1:2aa:549 @cpu16(domain1): 151366677
irq434: t5nex1:2ab:551 @cpu17(domain1): 166419088
irq435: t5nex1:2ac:553 @cpu18(domain1): 155997667
irq436: t5nex1:2ad:555 @cpu19(domain1): 153777002
irq437: t5nex1:2ae:557 @cpu10(domain1): 148026677
irq438: t5nex1:2af:559 @cpu11(domain1): 146783174
irq439: t5nex1:3a0:561 @cpu12(domain1): 156624537
irq440: t5nex1:3a1:563 @cpu13(domain1): 173749953
irq441: t5nex1:3a2:565 @cpu14(domain1): 177033995
irq442: t5nex1:3a3:567 @cpu15(domain1): 173715859
irq443: t5nex1:3a4:569 @cpu16(domain1): 174333864
irq444: t5nex1:3a5:571 @cpu17(domain1): 157006064
irq445: t5nex1:3a6:573 @cpu18(domain1): 160822294
irq446: t5nex1:3a7:575 @cpu19(domain1): 153622866
irq447: t5nex1:3a8:577 @cpu10(domain1): 158965692
irq448: t5nex1:3a9:579 @cpu11(domain1): 153345040
irq449: t5nex1:3aa:581 @cpu12(domain1): 166902519
irq450: t5nex1:3ab:583 @cpu13(domain1): 159972013
irq451: t5nex1:3ac:585 @cpu14(domain1): 171917959
irq452: t5nex1:3ad:587 @cpu15(domain1): 166200690
irq453: t5nex1:3ae:589 @cpu16(domain1): 152933459
irq454: t5nex1:3af:591 @cpu17(domain1): 144512181
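
A listing this long is hard to eyeball, so here is a small sketch that totals the interrupt counters per CPU. It assumes one entry per line in the `irqN: name @cpuN(domainN): count` form used above; the regular expression and the function name are illustrative, not part of any FreeBSD tool.

```python
import re
from collections import Counter

# One entry per line, e.g. "irq307: t6nex0:1a0:297 @cpu8(domain0): 185404641"
ENTRY_RE = re.compile(r"^irq(\d+):\s+(\S+)\s+@cpu(\d+)\(domain(\d+)\):\s+(\d+)$")


def totals_by_cpu(text):
    """Sum the interrupt counts of all matching entries, keyed by CPU number."""
    totals = Counter()
    for line in text.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match:  # silently skip anything that is not a listing entry
            totals[int(match.group(3))] += int(match.group(5))
    return totals


if __name__ == "__main__":
    # Three entries copied from the listing above.
    sample = """\
irq307: t6nex0:1a0:297 @cpu8(domain0): 185404641
irq316: t6nex0:1a9:315 @cpu8(domain0): 169259534
irq308: t6nex0:1a1:299 @cpu9(domain0): 146802111
"""
    for cpu, count in sorted(totals_by_cpu(sample).items()):
        print(f"cpu{cpu}: {count}")
```

Feeding it the full listing would show how evenly the round-robin binding spreads the load across the CPUs of each domain.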