Date: Mon, 26 Dec 2011 21:44:24 +0200
From: Konkov Evgeniy <kes-kes@yandex.ru>
To: Konkov Evgeniy <kes-kes@yandex.ru>
Cc: freebsd-questions@freebsd.org, Daniel Staal <DStaal@usa.net>, wishmaster <artemrts@ukr.net>
Subject: Re[6]: high load system do not take all CPU time
Message-ID: <9110154891.20111226214424@yandex.ru>
In-Reply-To: <1765647434.20111226205211@yandex.ru>
References: <3A4BDC1D114ED73D51E019BA@mac-pro.magehandbook.com>
            <1374625746.20111217102942@yandex.ru>
            <926001243.20111218194712@yandex.ru>
            <70251.1324270448.8859926110310105088@ffe16.ukr.net>
            <621310179.20111225181017@yandex.ru>
            <1765647434.20111226205211@yandex.ru>
Hello, Konkov.

You wrote on 26 December 2011 at 20:52:11:

KE> Hello, Konkov.
KE> You wrote on 25 December 2011 at 18:10:17:
KE>> Hello, wishmaster.
KE>> You wrote on 19 December 2011 at 6:54:08:
w>>> --- Original message ---
w>>> From: "Konkov Evgeniy" <kes-kes@yandex.ru>
w>>> To: "Daniel Staal" <DStaal@usa.net>
w>>> Date: 18 December 2011, 19:47:40
w>>> Subject: Re[2]: high load system do not take all CPU time
w>>>
>>>> Hello, Daniel.
>>>>
>>>> You wrote on 18 December 2011 at 17:52:00:
>>>>
>>>> DS> --As of December 17, 2011 10:29:42 AM +0200, Konkov Evgeniy
>>>> DS> is alleged to have said:
>>>>
>>>> >> How can I debug why the system does not use free CPU resources?
>>>> >>
>>>> >> In these pictures you can see that the CPU cannot exceed 400 ticks:
>>>> >> http://piccy.info/view3/2368839/c9022754d5fcd64aff04482dd360b5b2/
>>>> >> http://piccy.info/view3/2368837/a12aeed98681ed10f1a22f5b5edc5abc/
>>>> >> http://piccy.info/view3/2368836/da6a67703af80eb0ab8088ab8421385c/
>>>> >>
>>>> >> In these pictures you can see that the problems begin with traffic on re0,
>>>> >> when the CPU load rises to its "maximum":
>>>> >> http://piccy.info/view3/2368834/512139edc56eea736881affcda490eca/
>>>> >> http://piccy.info/view3/2368827/d27aead22eff69fd1ec2b6aa15e2cea3/
>>>> >>
>>>> >> But there is still 25% CPU idle at that moment.
>>>>
>>>> DS> <snip>
>>>>
>>>> >> # top -SIHP
>>>> >> last pid: 93050;  load averages: 1.45, 1.41, 1.29  up 9+16:32:06  10:28:43
>>>> >> 237 processes: 5 running, 210 sleeping, 2 stopped, 20 waiting
>>>> >> CPU 0:  0.8% user, 0.0% nice,  8.7% system, 17.7% interrupt, 72.8% idle
>>>> >> CPU 1:  0.0% user, 0.0% nice,  9.1% system, 20.1% interrupt, 70.9% idle
>>>> >> CPU 2:  0.4% user, 0.0% nice,  9.4% system, 19.7% interrupt, 70.5% idle
>>>> >> CPU 3:  1.2% user, 0.0% nice,  6.3% system, 22.4% interrupt, 70.1% idle
>>>> >> Mem: 843M Active, 2476M Inact, 347M Wired, 150M Cache, 112M Buf, 80M Free
>>>> >> Swap: 4096M Total, 15M Used, 4080M Free
>>>>
>>>> DS> --As for the rest, it is mine.
>>>>
>>>> DS> You are I/O bound; most of your time is spent in interrupts.  The CPU is
>>>> DS> dealing with things as fast as it can get them, but it has to wait for the
>>>> DS> disk and/or network card to get them to it.  The CPU is not your problem;
>>>> DS> if you need more performance, you need to tune the I/O.  (And possibly get
>>>> DS> better I/O cards, if available.)
>>>>
>>>> DS> Daniel T. Staal
>>>>
>>>> Can I find out the interrupt limit, or calculate it before that limit is
>>>> reached?
>>>>
>>>> The interrupt source is the internal card:
>>>> # vmstat -i
>>>> interrupt                          total       rate
>>>> irq14: ata0                       349756         78
>>>> irq16: ehci0                        7427          1
>>>> irq23: ehci1                       12150          2
>>>> cpu0:timer                      18268704       4122
>>>> irq256: re0                     85001260      19178
>>>> cpu1:timer                      18262192       4120
>>>> cpu2:timer                      18217064       4110
>>>> cpu3:timer                      18210509       4108
>>>> Total                          158329062      35724
>>>>
>>>> Do you have any good I/O tuning links to read?
>>>>
>>>> --
>>>> Best regards,
>>>> Konkov                          mailto:kes-kes@yandex.ru
w>>>
w>>> Your problem is the poorly performing LAN card. The guy from
w>>> Calomel.org told you about it. He advised you to switch to an Intel
w>>> network card.
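[Editor's note on the question above: there is no single published "interrupt
limit", but you can watch how fast each source's counter grows by diffing two
`vmstat -i` snapshots taken a known interval apart. A minimal sketch, assuming
the two-column "interrupt / total / rate" layout shown above; the function
names are illustrative, not part of any FreeBSD tool:]

```python
# Hypothetical helper: estimate per-source interrupt rates from two
# `vmstat -i` snapshots taken `interval_s` seconds apart.

def parse_vmstat_i(text):
    """Map interrupt source name -> cumulative total count."""
    totals = {}
    for line in text.strip().splitlines():
        parts = line.split()
        # Skip the header row and the trailing "Total" summary line.
        if len(parts) < 3 or parts[0] in ("interrupt", "Total"):
            continue
        # Source names may contain spaces (e.g. "irq256: re0").
        name, total = " ".join(parts[:-2]), int(parts[-2])
        totals[name] = total
    return totals

def interrupt_rates(sample_a, sample_b, interval_s):
    """Per-source interrupts/second observed between two snapshots."""
    a = parse_vmstat_i(sample_a)
    b = parse_vmstat_i(sample_b)
    return {name: (b[name] - a[name]) / interval_s
            for name in b if name in a}
```

Sampling the live counters this way (rather than trusting the lifetime-average
"rate" column) shows whether a source such as irq256: re0 is still climbing
toward the point where the interrupt load saturates the CPUs.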
KE>> Look at time 17:20:
KE>> http://piccy.info/view3/2404329/dd9f28f8ac74d3d2f698ff14c305fe31/
KE>> At this point freeradius starts to run slowly, because no CPU time (or
KE>> too little) is allocated to it, and mpd5 starts to drop users because
KE>> there is no response from radius. Sadly, I do not know what the idle
KE>> values in 'top' were. Does SNMP return correct values for CPU usage?

KE> last pid: 14445;  load averages: 6.88, 5.69, 5.33  up 0+12:11:35  20:37:57
KE> 244 processes: 12 running, 211 sleeping, 3 stopped, 15 waiting, 3 lock
KE> CPU 0:  4.7% user, 0.0% nice, 13.3% system, 46.7% interrupt, 35.3% idle
KE> CPU 1:  2.0% user, 0.0% nice,  9.8% system, 69.4% interrupt, 18.8% idle
KE> CPU 2:  2.7% user, 0.0% nice,  8.2% system, 74.5% interrupt, 14.5% idle
KE> CPU 3:  1.2% user, 0.0% nice,  9.4% system, 78.0% interrupt, 11.4% idle
KE> Mem: 800M Active, 2708M Inact, 237M Wired, 60M Cache, 112M Buf, 93M Free
KE> Swap: 4096M Total, 25M Used, 4071M Free
KE>
KE>   PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
KE>    12 root       -72    -     0K   160K CPU1    1 159:49 100.00% {swi1: netisr 3}
KE>    12 root       -72    -     0K   160K *per-i  2 101:25  84.57% {swi1: netisr 1}
KE>    12 root       -72    -     0K   160K *per-i  3  60:10  40.72% {swi1: netisr 2}
KE>    12 root       -72    -     0K   160K *per-i  2  41:54  39.26% {swi1: netisr 0}
KE>    11 root       155 ki31     0K    32K RUN     0 533:06  24.46% {idle: cpu0}
KE>  3639 root        36    0 10460K  3824K CPU3    3   7:43  22.17% zebra
KE>    12 root       -92    -     0K   160K CPU0    0  93:56  14.94% {irq256: re0}
KE>    11 root       155 ki31     0K    32K RUN     1 563:29  14.16% {idle: cpu1}
KE>    11 root       155 ki31     0K    32K RUN     2 551:46  12.79% {idle: cpu2}
KE>    11 root       155 ki31     0K    32K RUN     3 558:54  11.52% {idle: cpu3}
KE>    13 root       -16    -     0K    32K sleep   3  16:56   4.93% {ng_queue2}
KE>    13 root       -16    -     0K    32K RUN     2  16:56   4.69% {ng_queue0}
KE>    13 root       -16    -     0K    32K RUN     0  16:56   4.54% {ng_queue1}
KE>    13 root       -16    -     0K    32K RUN     1  16:59   4.44% {ng_queue3}
KE>  6818 root        22    0 15392K  4836K select  2  25:16   4.10% snmpd
KE> 49448 freeradius  29    0 27748K 16984K select  3   2:37   2.59% {initial thread}
KE> 16118 firebird    20  -10   233M   145M usem    2   0:06   0.83% {fb_smp_server}
KE> 14282 cacti       21    0 12000K  3084K select  3   0:00   0.68% snmpwalk
KE> 16118 firebird    20  -10   233M   145M usem    0   0:03   0.54% {fb_smp_server}
KE>  5572 root        21    0   136M 78284K wait    1   5:23   0.49% {mpd5}
KE> 14507 root        20    0  9536K  1148K nanslp  0   0:51   0.15% monitord
KE> 14441 root        25    0 11596K  4048K CPU0    0   0:00   0.00% perl5.14.1
KE> 14443 cacti       21    0 11476K  2920K piperd  0   0:00   0.00% perl5.14.1
KE> 14444 root        22    0  9728K  1744K select  0   0:00   0.00% sudo
KE> 14445 root        21    0  9672K  1240K kqread  0   0:00   0.00% ping
KE>
KE> # vmstat -i
KE> interrupt                          total       rate
KE> irq14: ata0                      1577446         35
KE> irq16: ehci0                       66968          1
KE> irq23: ehci1                       94012          2
KE> cpu0:timer                     180767557       4122
KE> irq256: re0                    683483519      15587
KE> cpu1:timer                     180031511       4105
KE> cpu3:timer                     175311179       3998
KE> cpu2:timer                     179460055       4092
KE> Total                         1400792247      31947
KE>
KE>  1 users    Load  6.02  5.59  5.31      Dec 26 20:38
KE> Mem:KB     REAL          VIRTUAL                     VN PAGER   SWAP PAGER
KE>         Tot   Share      Tot    Share   Free         in   out     in   out
KE> Act 1022276   12900  3562636    39576  208992  count                     4
KE> All 1143548   20380  5806292   100876         pages                     48
KE> Proc:                                                          Interrupts
KE>   r  p  d  s  w   Csw  Trp  Sys  Int  Sof  Flt  1135 cow     37428 total
KE>        186      129k  10k  17k  21k  14k 5857   2348 zfod       15 ata0 14
KE>                                                  184 ozfod       1 ehci0 16
KE>  8.1%Sys  68.4%Intr  5.9%User  0.0%Nice 17.6%Idle 7%ozfod      2 ehci1 23
KE> |    |    |    |    |    |    |    |    |    |      daefr    4120 cpu0:timer
KE> ====++++++++++++++++++++++++++++++++++>>>       2423 prcfr   21013 re0 256
KE>                              208 dtbuf          4425 totfr    4100 cpu1:timer
KE> Namei     Name-cache   Dir-cache  142271 desvn       react    4083 cpu3:timer
KE>    Calls    hits   %   hits   %     3750 numvn       pdwak    4094 cpu2:timer
KE>    36571   36546 100                1998 frevn       pdpgs
KE>                                                      intrn
KE> Disks   ad0   da0 pass0           241412 wire
KE> KB/t  26.81  0.00  0.00           826884 act
KE> tps      15     0     0          2714240 inact
KE> MB/s   0.39  0.00  0.00            97284 cache
KE> %busy     1     0     0           111708 free
KE>                                   114976 buf
KE>
KE> # netstat -w 1 -I re0
KE>             input          (re0)           output
KE>    packets  errs idrops      bytes    packets  errs      bytes colls
KE>      52329     0     0   40219676      58513     0   40189497     0
KE>      50207     0     0   37985881      57340     0   38438634     0
KE>
KE> http://piccy.info/view3/2409691/69d31186d8943a53c31ec193c8dfe79d/
KE> http://piccy.info/view3/2409746/efb444ffe892592fbd6f025fd14535c4/
KE>
KE> Before the overload happened, as you can see, the server passed through
KE> more traffic. Programs at this moment run very sloooowly! During the day
KE> there can be more interrupts on re0 than now, and the server works fine.
KE> I think there are some problems with the scheduler.

And this time there is a *radix state:

last pid: 51533;  load averages: 4.67, 5.24, 5.29  up 0+12:59:43  21:26:05
284 processes: 6 running, 255 sleeping, 3 stopped, 17 waiting, 3 lock
CPU 0:  0.5% user, 0.0% nice, 15.2% system, 27.2% interrupt, 57.1% idle
CPU 1:  0.0% user, 0.0% nice, 20.1% system, 22.3% interrupt, 57.6% idle
CPU 2:  1.6% user, 0.0% nice, 29.3% system, 20.7% interrupt, 48.4% idle
CPU 3:  2.7% user, 0.0% nice, 21.7% system, 16.3% interrupt, 59.2% idle
Mem: 788M Active, 2660M Inact, 239M Wired, 81M Cache, 112M Buf, 129M Free
Swap: 4096M Total, 51M Used, 4045M Free, 1% Inuse

  PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
51239 root       -72    0 10460K  3416K CPU0    0   0:15  66.80% zebra
   11 root       155 ki31     0K    32K CPU3    3 565:03  46.53% {idle: cpu3}
   11 root       155 ki31     0K    32K RUN     1 571:46  45.70% {idle: cpu1}
   11 root       155 ki31     0K    32K RUN     2 558:13  44.73% {idle: cpu2}
   11 root       155 ki31     0K    32K CPU0    0 546:21  43.85% {idle: cpu0}
   12 root       -72    -     0K   160K *radix  1 204:13  42.14% {swi1: netisr 3}
   12 root       -72    -     0K   160K *radix  2 141:57  37.55% {swi1: netisr 1}
   12 root       -72    -     0K   160K *radix  3  61:10  25.15% {swi1: netisr 0}
   12 root       -72    -     0K   160K WAIT    3  78:28  19.92% {swi1: netisr 2}
   12 root       -92    -     0K   160K WAIT    0 100:28   9.13% {irq256: re0}
 6818 root        22    0 15392K  4836K select  1  26:59   2.10% snmpd
   13 root       -16    -     0K    32K sleep   3  19:24   1.56% {ng_queue1}
51531 cacti       36    0 17092K  5944K select  0   0:00   1.51% {initial thread}
   13 root       -16    -     0K    32K sleep   3  19:27   1.46% {ng_queue3}
   13 root       -16    -     0K    32K sleep   3  19:24   1.46% {ng_queue2}
   13 root       -16    -     0K    32K sleep   1  19:25   1.42% {ng_queue0}
51531 cacti       52    0 17092K  5944K usem    0   0:00   1.42% {perl5.14.1}
51510 cacti       46    0 32256K 16304K piperd  3   0:00   1.22% php
51514 cacti       46    0 11476K  2940K piperd  2   0:00   1.22% perl5.14.1
51515 root        46    0  9728K  1748K select  3   0:00   1.22% sudo
51516 root        45    0  9672K  1220K kqread  1   0:00   1.22% ping
51508 cacti       52    0 32256K 16312K piperd  2   0:00   1.03% php
51248 root         4    0 10564K  4980K select  0   0:00   0.44% bgpd
 5572 root        20  -15   136M 64812K select  1   6:10   0.34% {mpd5}
51502 cacti       25    0 32256K 16568K nanslp  0   0:00   0.34% php
51513 cacti       23    0 17772K  4436K piperd  1   0:00   0.34% rrdtool
 5572 root        20  -15   136M 64812K select  2   0:00   0.34% {mpd5}
 5572 root        20  -15   136M 64812K select  1   0:00   0.34% {mpd5}
 5572 root        20  -15   136M 64812K select  1   0:00   0.34% {mpd5}

I have tried to google *radix and *per-i, but I did not find anything. :(

--
Best regards,
Konkov                          mailto:kes-kes@yandex.ru
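[Editor's note: as a rough sense of scale for the `netstat -w 1 -I re0`
sample above, one second of inbound traffic converts to link throughput and
average packet size with plain arithmetic. A minimal sketch, using the first
sample row; the helper name is illustrative, not part of any FreeBSD tool:]

```python
# Back-of-the-envelope check of one second of `netstat -w 1 -I re0` output
# quoted above: 52329 packets / 40219676 bytes inbound.

def mbit_per_s(bytes_per_s):
    # 8 bits per byte; decimal megabits, as link speeds are quoted.
    return bytes_per_s * 8 / 1_000_000

in_bytes, in_pkts = 40_219_676, 52_329
print(round(mbit_per_s(in_bytes), 1))   # inbound throughput, Mbit/s (~322)
print(round(in_bytes / in_pkts))        # average packet size, bytes (~769)
```

Roughly 320 Mbit/s inbound carried in ~52k interrupt-driven packets per
second is consistent with Daniel's diagnosis: the box is I/O bound on the
re0 NIC, not short of raw CPU.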