Date: Mon, 26 Dec 2011 20:52:11 +0200
From: Коньков Евгений <kes-kes@yandex.ru>
To: Коньков Евгений <kes-kes@yandex.ru>
Cc: freebsd-questions@freebsd.org, Daniel Staal <DStaal@usa.net>,
    wishmaster <artemrts@ukr.net>
Subject: Re[5]: high load system do not take all CPU time
Message-ID: <1765647434.20111226205211@yandex.ru>
In-Reply-To: <621310179.20111225181017@yandex.ru>
References: <3A4BDC1D114ED73D51E019BA@mac-pro.magehandbook.com>
    <1374625746.20111217102942@yandex.ru>
    <926001243.20111218194712@yandex.ru>
    <70251.1324270448.8859926110310105088@ffe16.ukr.net>
    <621310179.20111225181017@yandex.ru>
Hello, Коньков.

You wrote on 25 December 2011 at 18:10:17:

КЕ> Hello, wishmaster.

КЕ> You wrote on 19 December 2011 at 6:54:08:

w>>  --- Original message ---
w>> From: "Коньков Евгений" <kes-kes@yandex.ru>
w>> To: "Daniel Staal" <DStaal@usa.net>
w>> Date: 18 December 2011, 19:47:40
w>> Subject: Re[2]: high load system do not take all CPU time
w>>
>>> Hello, Daniel.
>>>
>>> You wrote on 18 December 2011 at 17:52:00:
>>>
>>> DS> --As of December 17, 2011 10:29:42 AM +0200, Коньков Евгений
>>> DS> is alleged to have said:
>>>
>>> >> How can I debug why the system does not use the free CPU resources?
>>> >>
>>> >> In these pictures you can see that the CPU cannot exceed 400 ticks:
>>> >> http://piccy.info/view3/2368839/c9022754d5fcd64aff04482dd360b5b2/
>>> >> http://piccy.info/view3/2368837/a12aeed98681ed10f1a22f5b5edc5abc/
>>> >> http://piccy.info/view3/2368836/da6a67703af80eb0ab8088ab8421385c/
>>> >>
>>> >> In these pictures you can see that the problems begin with traffic on re0,
>>> >> when the CPU load rises to its "maximum":
>>> >> http://piccy.info/view3/2368834/512139edc56eea736881affcda490eca/
>>> >> http://piccy.info/view3/2368827/d27aead22eff69fd1ec2b6aa15e2cea3/
>>> >>
>>> >> But there is still 25% CPU idle at that moment.
>>>
>>> DS> <snip>
>>>
>>> >> # top -SIHP
>>> >> last pid: 93050;  load averages: 1.45, 1.41, 1.29   up 9+16:32:06  10:28:43
>>> >> 237 processes: 5 running, 210 sleeping, 2 stopped, 20 waiting
>>> >> CPU 0:  0.8% user,  0.0% nice,  8.7% system, 17.7% interrupt, 72.8% idle
>>> >> CPU 1:  0.0% user,  0.0% nice,  9.1% system, 20.1% interrupt, 70.9% idle
>>> >> CPU 2:  0.4% user,  0.0% nice,  9.4% system, 19.7% interrupt, 70.5% idle
>>> >> CPU 3:  1.2% user,  0.0% nice,  6.3% system, 22.4% interrupt, 70.1% idle
>>> >> Mem: 843M Active, 2476M Inact, 347M Wired, 150M Cache, 112M Buf, 80M Free
>>> >> Swap: 4096M Total, 15M Used, 4080M Free
>>>
>>> DS> --As for the rest, it is mine.
>>>
>>> DS> You are I/O bound; most of your time is spent in interrupts.  The CPU is
>>> DS> dealing with things as fast as it can get them, but it has to wait for the
>>> DS> disk and/or network card to get them to it.  The CPU is not your problem;
>>> DS> if you need more performance, you need to tune the I/O.  (And possibly get
>>> DS> better I/O cards, if available.)
>>>
>>> DS> Daniel T. Staal
>>>
>>> Can I find out the interrupt limit, or calculate it before that limit is
>>> reached?
>>>
>>> The interrupt source is the onboard card:
>>> # vmstat -i
>>> interrupt                          total       rate
>>> irq14: ata0                       349756         78
>>> irq16: ehci0                        7427          1
>>> irq23: ehci1                       12150          2
>>> cpu0:timer                      18268704       4122
>>> irq256: re0                     85001260      19178
>>> cpu1:timer                      18262192       4120
>>> cpu2:timer                      18217064       4110
>>> cpu3:timer                      18210509       4108
>>> Total                          158329062      35724
>>>
>>> Do you have any good I/O tuning links to read?
>>>
>>> --
>>> Best regards,
>>>  Коньков                          mailto:kes-kes@yandex.ru
w>>
w>>  Your problem is the poor performance of the LAN card. The guy from
w>>  Calomel Org told you about this. He advised you to change to an Intel
w>>  network card.

КЕ> Look at time 17:20:
КЕ> http://piccy.info/view3/2404329/dd9f28f8ac74d3d2f698ff14c305fe31/

КЕ> At that point freeradius starts to work slowly, because no CPU time (or too
КЕ> little) is allocated to it, and mpd5 starts to drop users because it gets no
КЕ> response from radius. Sadly, I do not know what the idle figures in 'top'
КЕ> were at that moment.

КЕ> Does SNMP return the right values for CPU usage?
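On the SNMP question: one way to check whether snmpd reports the same CPU
numbers as the kernel is to compare the raw tick counters from both sides over
the same interval. A sketch (this assumes net-snmp's snmpget is installed and
that snmpd exports the UCD-SNMP-MIB counters; "public" is only a placeholder
community string):

# sysctl kern.cp_times
  (per-CPU user/nice/system/interrupt/idle ticks straight from the kernel)
# snmpget -v2c -c public localhost \
    UCD-SNMP-MIB::ssCpuRawUser.0 UCD-SNMP-MIB::ssCpuRawSystem.0 \
    UCD-SNMP-MIB::ssCpuRawInterrupt.0 UCD-SNMP-MIB::ssCpuRawIdle.0
  (the same counters, aggregated, as snmpd sees them)

Taking both twice a few seconds apart, the deltas should move in the same
proportions as the percentages top shows. Here is a fresh snapshot taken while
the box is overloaded: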
last pid: 14445;  load averages:  6.88,  5.69,  5.33    up 0+12:11:35  20:37:57
244 processes: 12 running, 211 sleeping, 3 stopped, 15 waiting, 3 lock
CPU 0:  4.7% user,  0.0% nice, 13.3% system, 46.7% interrupt, 35.3% idle
CPU 1:  2.0% user,  0.0% nice,  9.8% system, 69.4% interrupt, 18.8% idle
CPU 2:  2.7% user,  0.0% nice,  8.2% system, 74.5% interrupt, 14.5% idle
CPU 3:  1.2% user,  0.0% nice,  9.4% system, 78.0% interrupt, 11.4% idle
Mem: 800M Active, 2708M Inact, 237M Wired, 60M Cache, 112M Buf, 93M Free
Swap: 4096M Total, 25M Used, 4071M Free

  PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   12 root       -72    -     0K   160K CPU1    1 159:49 100.00% {swi1: netisr 3}
   12 root       -72    -     0K   160K *per-i  2 101:25  84.57% {swi1: netisr 1}
   12 root       -72    -     0K   160K *per-i  3  60:10  40.72% {swi1: netisr 2}
   12 root       -72    -     0K   160K *per-i  2  41:54  39.26% {swi1: netisr 0}
   11 root       155 ki31     0K    32K RUN     0 533:06  24.46% {idle: cpu0}
 3639 root        36    0 10460K  3824K CPU3    3   7:43  22.17% zebra
   12 root       -92    -     0K   160K CPU0    0  93:56  14.94% {irq256: re0}
   11 root       155 ki31     0K    32K RUN     1 563:29  14.16% {idle: cpu1}
   11 root       155 ki31     0K    32K RUN     2 551:46  12.79% {idle: cpu2}
   11 root       155 ki31     0K    32K RUN     3 558:54  11.52% {idle: cpu3}
   13 root       -16    -     0K    32K sleep   3  16:56   4.93% {ng_queue2}
   13 root       -16    -     0K    32K RUN     2  16:56   4.69% {ng_queue0}
   13 root       -16    -     0K    32K RUN     0  16:56   4.54% {ng_queue1}
   13 root       -16    -     0K    32K RUN     1  16:59   4.44% {ng_queue3}
 6818 root        22    0 15392K  4836K select  2  25:16   4.10% snmpd
49448 freeradius  29    0 27748K 16984K select  3   2:37   2.59% {initial thread}
16118 firebird    20  -10   233M   145M usem    2   0:06   0.83% {fb_smp_server}
14282 cacti       21    0 12000K  3084K select  3   0:00   0.68% snmpwalk
16118 firebird    20  -10   233M   145M usem    0   0:03   0.54% {fb_smp_server}
 5572 root        21    0   136M 78284K wait    1   5:23   0.49% {mpd5}
14507 root        20    0  9536K  1148K nanslp  0   0:51   0.15% monitord
14441 root        25    0 11596K  4048K CPU0    0   0:00   0.00% perl5.14.1
14443 cacti       21    0 11476K  2920K piperd  0   0:00   0.00% perl5.14.1
14444 root        22    0  9728K  1744K select  0   0:00   0.00% sudo
14445 root        21    0  9672K  1240K kqread  0   0:00   0.00% ping

# vmstat -i
interrupt                          total       rate
irq14: ata0                      1577446         35
irq16: ehci0                       66968          1
irq23: ehci1                       94012          2
cpu0:timer                     180767557       4122
irq256: re0                    683483519      15587
cpu1:timer                     180031511       4105
cpu3:timer                     175311179       3998
cpu2:timer                     179460055       4092
Total                         1400792247      31947

# systat -vmstat  (snapshot, Dec 26 20:38; 1 user, load 6.02 5.59 5.31 -- key readings)
CPU:  8.1% Sys, 68.4% Intr, 5.9% User, 0.0% Nice, 17.6% Idle
Interrupts: 37428 total/s -- re0 (irq256) 21013, cpu0:timer 4120,
            cpu1:timer 4100, cpu2:timer 4094, cpu3:timer 4083,
            ata0 (irq14) 15, ehci0 (irq16) 1, ehci1 (irq23) 2
Name-cache: 36571 calls, 36546 hits (100%)
Disks: ad0 15 tps, 26.81 KB/t, 0.39 MB/s, 1% busy; da0 and pass0 idle
Mem (KB): 826884 act, 2714240 inact, 241412 wire, 97284 cache,
          111708 free, 114976 buf

# netstat -w 1 -I re0
            input          (re0)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
     52329     0      0   40219676      58513     0   40189497     0
     50207     0      0   37985881      57340     0   38438634     0

http://piccy.info/view3/2409691/69d31186d8943a53c31ec193c8dfe79d/
http://piccy.info/view3/2409746/efb444ffe892592fbd6f025fd14535c4/

As you can see, before the overload happened the server was passing through
more traffic.
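Given that it is the {swi1: netisr} threads that eat the CPU here, it may also
be worth looking at how the netisr queues are configured and whether they drop
packets. A sketch of what could be checked (this assumes a FreeBSD 8/9-era
kernel; the exact sysctl and tunable names can differ between releases):

# netstat -Q
  (netisr configuration plus per-protocol queue statistics, including drops)
# sysctl net.isr
  (number of netisr threads, CPU binding, dispatch policy, queue limits)

If the queues show drops, loader tunables along these lines could be
experimented with in /boot/loader.conf (they take effect after a reboot; the
values below are only examples, not recommendations):

net.isr.maxthreads=4        # one netisr thread per core
net.isr.bindthreads=1       # pin each netisr thread to its own CPU
net.isr.defaultqlimit=4096  # larger per-protocol input queue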
Programs are working very slowly at this moment!

During the day there can be even more interrupts on re0 than there are now, and
the server works fine, so I think there is some problem with the scheduler (a
quick way to check that is sketched below).

--
Best regards,
 Коньков                          mailto:kes-kes@yandex.ru
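P.S. A sketch of how to check which scheduler the kernel runs and where the
interrupt threads are allowed to run (pid 12 is the kernel intr process shown
in the top output above):

# sysctl kern.sched.name
  (ULE or 4BSD)
# procstat -t 12
  (threads of the intr process, with the CPU each one last ran on)
# cpuset -g -p 12
  (CPU affinity mask of the intr process)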