Date: Thu, 25 Apr 2002 13:36:15 -0500 (CDT)
From: Joe Greco <jgreco@ns.sol.net>
To: dillon@apollo.backplane.com (Matthew Dillon)
Cc: freebsd-smp@FreeBSD.ORG, freebsd-stable@FreeBSD.ORG
Subject: Re: kernel trap 9 with interrupts disabled
Message-ID: <200204251836.NAA41191@aurora.sol.net>
In-Reply-To: <200204071822.g37IMbt34515@apollo.backplane.com> from "Matthew Dillon" at Apr 07, 2002 11:22:37 AM

>     Hmm.  Maybe adjust the code to panic the machine when this
>     situation occurs, then see if you can get a kernel dump out
>     of it.

Looks like I'll be doing that next.  Any help available from anyone in
looking at that?  I'm not big into reading kernel dumps  :-)

>     As to the load issue... that sounds like a classic priority
>     inversion problem.  Check the 'nice' of all the processes in
>     the system and see if some nice'd-down processes are hogging
>     the cpu.  'ps axlww' in a big window.

Hmmm.  I did just notice something.  I run setiathome everywhere using a
little daemon that punts it down to idprio etc.  I just tried to kill them
and they didn't, and I looked again and it's because they're running at
0.0%, so then I idprio -t -<pid>'d them, and when I did that to the first
one, my login session froze for the better part of a minute.  It remained
pingable but apparently unresponsive.  Then it recovered.  The second one
went as expected.

>     Also look at the user cpu versus system cpu percentage to see
>     where the cpu is going.

Here's top, any hints?
(note: the names have been changed to protect the innocent)

last pid:  3145;  load averages: 13.60, 13.97, 14.05   up 18+14:27:19  13:26:35
63 processes:  15 running, 47 sleeping, 1 stopped
CPU states:  4.5% user,  0.0% nice, 94.8% system,  0.6% interrupt,  0.0% idle
Mem: 142M Active, 656M Inact, 145M Wired, 47M Cache, 112M Buf, 14M Free
Swap: 2048M Total, 56K Used, 2048M Free

  PID USERNAME PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
78128 useruser  63    0 34696K 33896K RUN    0  83:26 31.30% 31.30% nit
78596 useruser  64    0 18716K 17896K RUN    0  79:59 31.10% 31.10% nit
78959 useruser  64    0 15872K 14728K RUN    0  79:30 29.93% 29.93% nit
57493 use       63    0  6412K  5804K RUN    1 601:36 13.43% 13.43% perl
99887 useruser  63    0 14200K 10420K CPU1   1   3:26 13.09% 13.09% perl
99918 use       64    0  1060K   656K RUN    1   2:26 11.33% 11.33% funny
 2059 useruser  63    0  2220K  1424K RUN    1   0:59 11.18% 11.18% grep
  507 use       63    0  1060K   656K RUN    1   1:47  9.52%  9.52% funny
 1363 use       61    0  1060K   632K RUN    0   0:57  8.98%  8.98% funny
 2555 use       63    0  1060K   632K RUN    1   0:34  8.30%  8.30% funny
 3127 use       62    0  1060K   596K RUN    0   0:02  9.38%  6.10% funny
 1028 root       2    0   964K   572K select 1  42:59  2.73%  2.73% syslogd
 3104 use        2    0  1060K   576K sbwait 0   0:01  1.55%  1.22% funny
 2945 root      35    0  1996K  1148K CPU0   1   0:02  0.93%  0.93% top
 3106 use        2    0  1060K   656K sbwait 1   0:01  1.12%  0.88% funny
 3145 use        2    0  1060K   596K sbwait 0   0:00  9.00%  0.44% funny
99230 nobody    37   52 16556K 16424K RUN    1 182.4H  0.00%  0.00% setiathome
21867 nobody    37   52 16556K 16428K RUN    0 171.6H  0.00%  0.00% setiathome
  966 root       2    0  1648K   744K select 0   4:27  0.00%  0.00% ntpd
  945 bind       2    0  3300K  2608K select 0   2:53  0.00%  0.00% named-dns
 1047 root      10    0  1228K   848K nanslp 0   1:22  0.00%  0.00% mon
  893 root      10    0  1004K   652K nanslp 0   1:19  0.00%  0.00% cron
57483 use        2    0   896K   400K sbwait 0   1:09  0.00%  0.00% wont
  895 root       2    0  2224K  1172K select 0   0:50  0.00%  0.00% sshd
57488 use        2    0  4600K  4096K sbwait 1   0:48  0.00%  0.00% perl
57496 use        2    0   896K   504K accept 1   0:42  0.00%  0.00% mrdata
 5828 root       2    0  2308K  1688K select 1   0:24  0.00%  0.00% sshd
  950 nobody     2    0   928K   380K select 0   0:10  0.00%  0.00% identd
  887 daemon     2    0   904K   540K sbwait 0   0:07  0.00%  0.00% rwhod
72401 userus     3    0  2628K  2244K ttyin  1   0:05  0.00%  0.00% zsh
25432 root       2    0  2348K  1728K select 0   0:04  0.00%  0.00% sshd
72398 root       2    0  2308K  1412K select 0   0:03  0.00%  0.00% sshd
 2014 userxx     3    0  2792K  2304K ttyin  0   0:03  0.00%  0.00% lynx
 1676 root       2    0  2316K  1636K select 0   0:03  0.00%  0.00% sshd
25551 useruser   3    0  1484K  1068K ttyin  0   0:02  0.00%  0.00% tcsh
 2778 root      36    0  1996K  1144K STOP   0   0:02  0.00%  0.00% top
 1206 root      28    0  2308K  1632K RUN    0   0:02  0.00%  0.00% sshd
98311 useruser  10    0   640K   280K wait   0   0:01  0.00%  0.00% sh
 2777 root       2  -20  1992K  1152K select 1   0:01  0.00%  0.00% top
 1248 root      18    0  1384K   992K pause  1   0:00  0.00%  0.00% tcsh
 1679 userxx    18    0  2400K  2064K pause  1   0:00  0.00%  0.00% zsh
 1207 jgreco    18    0  1380K   992K pause  1   0:00  0.00%  0.00% tcsh
 5904 useruser   3    0  1452K  1040K ttyin  1   0:00  0.00%  0.00% tcsh
 1008 root       2    0  3320K  2156K select 1   0:00  0.00%  0.00% snmpd
99852 useruser  10    0  1028K   600K wait   0   0:00  0.00%  0.00% bash
98623 mailnull  -6    0  2524K  1780K piperd 0   0:00  0.00%  0.00% sendmail
  998 nobody    10   52   896K   492K wait   1   0:00  0.00%  0.00% setidaemon
 2768 root      10    0   628K   268K wait   1   0:00  0.00%  0.00% sh
98280 useruser  10    0   628K   268K wait   1   0:00  0.00%  0.00% sh
 2762 root      10    0   636K   276K wait   1   0:00  0.00%  0.00% sh
98293 useruser  10    0   640K   280K wait   0   0:00  0.00%  0.00% sh
99850 useruser  10    0   628K   268K wait   1   0:00  0.00%  0.00% sh
 1036 root       3    0   948K   456K ttyin  1   0:00  0.00%  0.00% getty
 1038 root       3    0   948K   456K ttyin  1   0:00  0.00%  0.00% getty
 1039 root      10    0   636K   232K wait   0   0:00  0.00%  0.00% sh

-- 
Joe Greco - sol.net Network Services - Milwaukee, WI - http://www.sol.net
"We call it the 'one bite at the apple' rule. Give me one chance [and] then I
won't contact you again." - Direct Marketing Ass'n position on e-mail spam(CNN)
With 24 million small businesses in the US alone, that's way too many apples.
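
For anyone repeating the user-versus-system check suggested in the quoted advice, here is a minimal Python sketch that pulls the percentages out of a top(1) "CPU states" line and reports which state dominates. It is not part of the original message: the function name parse_cpu_states is hypothetical, and the regular expression assumes the line format shown in the output above, which may differ on other top versions.

```python
import re


def parse_cpu_states(line):
    """Map each state name to its percentage, e.g. {'user': 4.5, ...}.

    Assumes the "NN.N% name" layout of the top(1) line quoted in this
    message; other top versions may format the line differently.
    """
    return {name: float(pct) for pct, name in re.findall(r"([\d.]+)%\s+(\w+)", line)}


# The CPU states line copied from the top output in this message.
line = "CPU states:  4.5% user,  0.0% nice, 94.8% system,  0.6% interrupt,  0.0% idle"

states = parse_cpu_states(line)
dominant = max(states, key=states.get)
print(dominant, states[dominant])
```

On the line from this message the dominant state is system time, which is consistent with the load being spent in the kernel rather than in the listed user processes.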