From owner-freebsd-bugs@FreeBSD.ORG Tue Aug 5 19:20:04 2008 Return-Path: Delivered-To: freebsd-bugs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8241A1065677 for ; Tue, 5 Aug 2008 19:20:04 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 6EDB48FC2C for ; Tue, 5 Aug 2008 19:20:04 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.2/8.14.2) with ESMTP id m75JK4md095407 for ; Tue, 5 Aug 2008 19:20:04 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.2/8.14.1/Submit) id m75JK4Nn095405; Tue, 5 Aug 2008 19:20:04 GMT (envelope-from gnats) Date: Tue, 5 Aug 2008 19:20:04 GMT Message-Id: <200808051920.m75JK4Nn095405@freefall.freebsd.org> To: freebsd-bugs@FreeBSD.org From: Jakub Jasinski Cc: Subject: Re: misc/126230: weird process cpu usage stats and weird load average X-BeenThere: freebsd-bugs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Jakub Jasinski List-Id: Bug reports List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Aug 2008 19:20:04 -0000 The following reply was made to PR misc/126230; it has been noted by GNATS. From: Jakub Jasinski To: Kris Kennaway Cc: freebsd-gnats-submit@FreeBSD.org Subject: Re: misc/126230: weird process cpu usage stats and weird load average Date: Tue, 5 Aug 2008 15:27:51 -0400 The devices detached. But as you suggested, I recompiled the kernel w/o usb. mail2# vmstat -i interrupt total rate irq1: atkbd0 6 0 irq4: sio0 2281 4 irq14: ata0 58 0 irq18: ciss0 18638 37 cpu0: timer 985022 1998 irq257: bce1 804740 1632 cpu1: timer 976993 1981 cpu2: timer 976992 1981 cpu3: timer 976991 1981 Total 4741721 9618 Same thing is happening: (Maybe something is wrong with 'select' or some other socket thing... ???) Here's a shot of top with '22 running' processes (after rsync is activated)--what is sendmail doing eating up 8.89%, apcupsd 15.09%, syslogd 5.19%?: 59 processes: 22 running, 37 sle 4.62, 1.53, up 0+00:04:07 15:08:30 CPU: % user6 running, 50 sleeping, 3 lock Mem: 0.7 Active, 0.0M Inact,38.8M Wired, 5 3.0ache, 214M Bu57.5976M Free 87 31 Swap: 8192M Total, 8192M Free PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COM 590 root 2 4 0 14112K 2516K CPU0 0 0:15 15.09% apc 930 root 1 4 0 17196K 12132K RUN 2 0:12 12.16% rsy 767 root 1 4 0 30108K 5208K select 3 0:09 8.89% sen 684 root 1 4 0 9940K 1260K select 3 0:08 7.67% aut 835 mailnull 2 20 0 4888K 1536K sigwai 2 0:06 5.76% mim 710 root 1 4 0 9940K 1260K select 3 0:06 5.57% aut 711 root 1 4 0 9940K 1260K select 0 0:06 5.47% aut 712 root 1 4 0 9940K 1260K select 0 0:06 5.47% aut 908 admin 1 4 0 32936K 4624K RUN 0 0:06 5.47% ssh 647 root 1 4 0 27284K 10392K select 3 0:06 5.37% per 752 root 1 4 0 20960K 3956K select 0 0:06 5.37% ssh 928 root 1 4 0 17196K 11844K RUN 3 0:06 5.27% rsy 818 mailnull 1 4 0 9252K 2024K select 0 0:06 5.27% mim 713 root 1 4 0 9940K 1260K select 2 0:06 5.27% aut 727 root 1 4 0 3604K 1224K select 0 0:06 5.27% cou 541 root 1 4 0 4684K 1352K *Giant 3 0:06 5.18% sys 702 root 1 4 0 3604K 1220K select 2 0:06 5.18% cou 709 root 1 4 0 9940K 1260K select 3 0:06 5.18% aut 474 root 1 4 0 1616K 628K *Giant 2 0:06 5.08% dev 717 root 1 4 0 3604K 1224K select 2 0:06 4.98% cou 892 admin 1 4 0 32936K 4620K CPU0 1 0:06 4.88% ssh Without rsync: 57 processes: 1 running, 56 sleeping CPU: % user, % nice, % system, % interrupt, % idle Mem: 132M Active, 643M Inact, 247M Wired, 52K Cache, 214M Buf, 6891M Free Swap: 8192M Total, 8192M Free PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COM 590 root 2 4 0 14112K 2516K select 3 0:16 0.00% apc 767 root 1 44 0 30108K 5208K select 0 0:10 0.00% sen 684 root 1 44 0 9940K 1260K select 0 0:09 0.00% aut 711 root 1 44 0 9940K 1260K select 2 0:06 0.00% aut 835 mailnull 2 20 0 4888K 1536K sigwai 2 0:06 0.00% mim 818 mailnull 1 44 0 9252K 2024K select 3 0:06 0.00% mim 713 root 1 44 0 9940K 1260K select 1 0:06 0.00% aut 712 root 1 44 0 9940K 1260K select 0 0:06 0.00% aut 710 root 1 44 0 9940K 1260K select 3 0:06 0.00% aut 908 admin 1 46 0 32936K 4624K select 2 0:06 0.00% ssh 647 root 1 44 0 27284K 10392K select 2 0:06 0.00% per 541 root 1 44 0 4684K 1352K select 1 0:06 0.00% sys 702 root 1 4 0 3604K 1220K select 1 0:06 0.00% cou 892 admin 1 44 0 32936K 4620K select 0 0:06 0.00% ssh 727 root 1 4 0 3604K 1224K select 3 0:06 0.00% cou 709 root 1 44 0 9940K 1260K select 1 0:06 0.00% aut 693 root 1 4 0 3604K 1220K select 2 0:06 0.00% cou 717 root 1 4 0 3604K 1224K select 2 0:06 0.00% cou 752 root 1 4 0 20960K 3956K select 2 0:06 0.00% ssh 474 root 1 4 0 1616K 628K select 1 0:06 0.00% dev 732 mailnull 1 4 0 83628K 74432K accept 3 0:03 0.00% cla On Tue, 05 Aug 2008 20:08:36 +0200 Kris Kennaway wrote: > Jakub Jasinski wrote: > > uhci2: port > > 0x1040-0x105f irq 18 at device 29.2 on pci0 uhci2: [GIANT-LOCKED] > > uhci2: [ITHREAD] > > usb2: on uhci2 > > usb2: USB revision 1.0 > > uhub2: on > > usb2 uhub2: 2 ports with 2 removable, self powered > > ukbd0: on uhub2 > > kbd2 at ukbd0 > > ums0: on uhub2 > > ums0: 5 buttons and Z dir. > > > > Ok, so the keyboard and mouse are there. I detached them, killed > > moused, rerun rsync and I got the same weird top output: > > Are the devices still attached (i.e. did they detach)? What if you > compile a kernel with no USB support at all (obviously you won't be > able to log in with a USB keyboard, but you can still log in > remotely). > > Kris > > > > > > FreeBSD 7.0-RELEASE (CUSTOM) #2: Mon Aug 4 12:44:23 EDT 2008 > > last pid: 1258; load averages: 0.00, 0.00, 0.00 up 0+01:09:53 > > 13:20:17 last pid: 1266; load averages: 1.98, 0.55, 0.20 up 0 > > +01:11:33 13:21:57 58 processes: 1 running, 57 sleeping > > CPU states: 0.9% user, 0.0% nice, 36.0% system, 3.5% interrupt, > > 59.6% idle Mem: 142M Active, 59M Inact, 278M Wired, 48K Cache, 214M > > Buf, 7433M Free Swap: 8192M Total, 8192M Free > > > > PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU > > COMMAND > > 662 root 2 4 0 14112K 2504K select 3 0:15 12.65% apcupsd > > 1266 root 1 54 0 17196K 12112K select 0 0:03 10.50% rsync > > 756 root 1 4 0 9940K 1260K select 2 0:08 6.40% authdaemon > > 1259 root 1 48 0 7656K 2328K CPU1 1 0:02 5.27% top > > 981 root 1 4 0 30108K 5292K select 2 0:06 4.59% sendmail > > 1265 root 1 4 0 17196K 11824K select 3 0:02 4.59% rsync > > 790 root 1 4 0 9940K 1260K select 1 0:06 4.49% authdaemon > > 774 root 1 4 0 3604K 1220K select 2 0:06 4.49% couriertcp > > 791 root 1 4 0 9940K 1260K select 0 0:06 4.49% authdaemon > > > > etc. > > > > ...and why are apcupsd (connected via serial port), syslogd, > > couriertcpd, sendmail, etc. working a the _exact_ same time for the > > same duration at around the same WPCU%? > > > > Just for comparison, this is a summary of top on FreeBSD 6.2 (32 > > bit) on another non-deployed box running pretty much the same > > software (courier imap, clamav, spamassassin, etc.) after rsync is > > activated: 43.48% for one rsync process > > 1.96% for another > > 0.00% for everything else. > > > > On Tue, 5 Aug 2008 16:50:04 GMT > > Kris Kennaway wrote: > > > >> Jakub Jasinski wrote: > >> > I recompiled the kernel with the ULE scheduler and the only > >> > difference now is that every 'working' process has higher WCPU > >> > _after_ rsync is activated. Before rsync is activated all > >> > process have 0% WCPU. > >> > > >> > BTW, thanks for the help. > >> > > >> > [root@mail2 ~]# vmstat -i > >> > interrupt total rate > >> > irq1: atkbd0 78 0 > >> > irq4: sio0 2023 5 > >> > irq14: ata0 58 0 > >> > irq16: uhci0 ehci0 1 0 > >> > irq18: ciss0 uhci2 2618 6 > >> > irq22: uhci4 19 0 > >> > cpu0: timer 751994 1994 > >> > irq257: bce1 2271 6 > >> > cpu1: timer 744491 1974 > >> > cpu2: timer 744494 1974 > >> > cpu3: timer 744513 1974 > >> > Total 2992560 7937 > >> > >> Your system is sharing an interrupt between the ciss and one of > >> the USB controllers. This means that every time there is an > >> interrupt from disk I/O the USB drivers will also wake up to > >> process it. This could be causing things like moused and cups to > >> run. What devices are attached to uhci2? > >> > > > > >