Date: Mon, 11 Oct 2010 15:27:37 +1100 (EST)
From: Ian Smith <smithi@nimnet.asn.au>
To: kes-kes@yandex.ru
Cc: freebsd-questions@freebsd.org
Subject: Re[3]: How to obtain which interrupts cause system to hang?
Message-ID: <20101011144425.Y2036@sola.nimnet.asn.au>
In-Reply-To: <632460655.20101010192705@yandex.ru>
References: <20101009204915.0360410656F1@hub.freebsd.org>
 <20101010161330.R2036@sola.nimnet.asn.au>
 <1076883893.20101010105041@yandex.ru>
 <20101010194711.Y2036@sola.nimnet.asn.au>
 <632460655.20101010192705@yandex.ru>

On Sun, 10 Oct 2010 19:27:05 +0300, kes-kes@yandex.ru wrote:
> Hi, Ian.

Hi Eugen,

> >> >> 23.1%Sys 50.8%Intr 1.3%User 0.0%Nice 24.8%Idle %ozfod 1999 cpu0: time
> >> >> | | | | | | | | | | | daefr
> >> >> ============+++++++++++++++++++++++++> 6 prcfr
> >>
> >> IS> Yes, system and esp. interrupt time is heavy .. 23k context switches!?
[..]
> >> IS> Disable p4tcc if it's a modern CPU; that usually hurts more than helps.
> >> IS> Disable polling if you're using that .. you haven't provided much info,
> >> IS> like is this with any network load, despite nfe0 showing no interrupts?
>
> >> Polling is ON. Traffice is about 60Mbit/s routed from nfe0 to vlan4 on rl0
> >> when interrupts are happen traffic slow down to 25-30Mbit/s.
>
> IS> Out of my depth. If it's a net problem - maybe not - you may do better
> IS> in freebsd-net@ if you provide enough information (dmesg plus ifconfig,
> IS> vmstat -i etc, normally and while this problem is happening).
[..]
> >> >> How to obtain what nasty happen, which process take 36-50% of CPU
> >> >> resource?
> >>
> >> IS> Try 'top -S'. It's almost certainly system process[es], not shown above.
>
> IS> Does that not show anything? Also, something like 'ps auxww | less'
> IS> should show you what's using all that CPU. I'm out of wild clues.
>
> vpn_shadow# top -S
> last pid: 57879; load averages: 0.12, 0.06, 0.05 up 1+18:37:39 19:19:14

Ok, this was taken when things weren't so busy as the earlier 36-50% ..

> 101 processes: 2 running, 83 sleeping, 16 waiting
> CPU: 0.0% user, 0.0% nice, 14.3% system, 17.3% interrupt, 68.4% idle
> Mem: 319M Active, 799M Inact, 354M Wired, 336K Cache, 213M Buf, 503M Free
> Swap: 4063M Total, 4063M Free
>
>   PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
>    11 root        1 171 ki31     0K    16K RUN     24.9H 86.47% idle: cpu0
>    14 root        1 -44    -     0K    16K WAIT   689:52 10.25% swi1: net
>     2 root        1 -68    -     0K    16K sleep  207:35  4.69% ng_queue0
>    40 root        1 -68    -     0K    16K -      101:37  1.46% dummynet

.. but still if you add up the TIMEs above here it comes to about 41.5
hours, all but about half an hour of your total uptime, most of which is
consumed by the next three below, so swi1 and ng_queue look like what's
using most CPU long-term.

>    47 root        1  20    -     0K    16K syncer   5:29  0.29% syncer
>    12 root        1 -32    -     0K    16K WAIT    14:48  0.00% swi4: clock sio
>    15 root        1 -16    -     0K    16K -        5:39  0.00% yarrow
>   986 root        1  44    0  5692K  1408K select   1:29  0.00% syslogd
>  1054 bind        4   4    0   138M   113M kqread   1:22  0.00% named
>  1162 clamav      1   4    0  4616K  1468K accept   0:59  0.00% smtp-gated

Smells net-related to me, maybe polling, but like I said, I'm out of my
depth. You should have enough info to take to freebsd-net@ anyway.

cheers, Ian

PS: I still think you should take the time to close PR kern/129103 :)
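
For anyone wanting to check the TIME arithmetic above, here is a minimal Python sketch. It assumes the TIME column uses only the two formats visible in the quoted output, 'N.NH' for hours and 'MMM:SS' for minutes and seconds; the helper name top_time_to_hours and the dictionary are purely illustrative and not part of any tool.

```python
def top_time_to_hours(value: str) -> float:
    """Convert a top(1) TIME field to hours.

    Assumption: only the two formats seen in the quoted output are
    handled, 'N.NH' (hours) and 'MMM:SS' (minutes:seconds).
    """
    if value.endswith("H"):
        return float(value[:-1])
    minutes, seconds = value.split(":")
    return (int(minutes) + int(seconds) / 60) / 60


# TIME values copied from the first four rows of the quoted top -S output.
times = {
    "idle: cpu0": "24.9H",
    "swi1: net": "689:52",
    "ng_queue0": "207:35",
    "dummynet": "101:37",
}

total = sum(top_time_to_hours(v) for v in times.values())
busy = total - top_time_to_hours(times["idle: cpu0"])

print(f"total of the four rows: {total:.2f} hours")
print(f"non-idle share of that: {busy:.2f} hours")
```

The non-idle figure is dominated by swi1: net and ng_queue0, which matches the reading above that those two are the long-term CPU consumers.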