Date: Mon, 7 Jul 2003 23:49:13 -0500
From: Dan Nelson
To: Andy Farkas
Cc: freebsd-current@freebsd.org, freebsd-smp@freebsd.org
Subject: Re: whats going on with the scheduler?
Message-ID: <20030708044912.GF87950@dan.emsphone.com>
In-Reply-To: <20030708135908.I6312-100000@hewey.af.speednet.com.au>
References: <20030708035309.GE87950@dan.emsphone.com> <20030708135908.I6312-100000@hewey.af.speednet.com.au>

In the last episode (Jul 08), Andy Farkas said:
> On Mon, 7 Jul 2003, Dan Nelson wrote:
>
> > > I bet those *Giants have something to do with it...
> >
> > Most likely.  That means they're waiting for some other process to
> > release the big Giant kernel lock.  Paste in top's header so we can
> > see how many processes are locked, and what the system cpu
> > percentage is.
>
> This is what top looks like (up to the 1st 0.00% process) when sitting
> idle* with 3 setiathomes:
>
>  97 processes: 9 running, 71 sleeping, 4 zombie, 12 waiting, 1 lock
> CPU states:  4.0% user, 72.0% nice,  4.6% system,  0.7% interrupt, 18.8% idle
>
>   PID USERNAME   PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
> 42946 setiathome 139   15 16552K 15984K RUN    0  43.8H 98.00% 98.00% setiathome
> 42945 setiathome 139   15 16944K 15732K CPU1   1  43.0H 97.56% 97.56% setiathome
> 42947 setiathome 139   15 15524K 14956K CPU0   2  42.9H 94.14% 94.14% setiathome
>
> Note how the seti procs are getting 94-98% cpu time.
>
> When I do my scp thing, top looks like this:
>
>  98 processes: 8 running, 71 sleeping, 4 zombie, 12 waiting, 3 lock
> CPU states:  1.7% user, 33.7% nice, 20.1% system,  0.6% interrupt, 43.9% idle
>
>   PID USERNAME   PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
> 42946 setiathome 139   15 16552K 15984K CPU3   2  44.0H 68.41% 68.41% setiathome
> 50296 andyf      125    0  3084K  2176K RUN    2   7:55 64.21% 64.21% ssh
>    12 root       -16    0     0K    12K CPU2   2 153.6H 48.78% 48.78% idle: cpu2
>    11 root       -16    0     0K    12K CPU3   3 153.6H 48.63% 48.63% idle: cpu3
>    13 root       -16    0     0K    12K RUN    1 150.2H 48.44% 48.44% idle: cpu1
>    14 root       -16    0     0K    12K RUN    0 144.8H 45.31% 45.31% idle: cpu0
> 42947 setiathome 130   15 15524K 14956K RUN    2  43.1H 28.56% 28.56% setiathome
> 42945 setiathome 125   15 15916K 14700K RUN    0  43.2H 25.05% 25.05% setiathome
>
> Notice how 'nice' has gone to 33.7% and 'idle' to 43.9%, and the seti
> procs have dropped well below 94%.
>
> > A truss of one of the seti processes may be useful too.  setiathome
> > really shouldn't be doing many syscalls at all.
>
> If setiathome is making lots of syscalls, then running the 3 instances
> should already show a problem, no?

Not if it's ssh that's holding Giant for longer than it should.  The
setiathome processes may be calling some really fast syscall 500 times
a second, which causes no problem on its own; the trouble starts when
ssh comes along and calls some other syscall that takes 0.1 ms to
return but holds Giant long enough for all the other processes to back
up behind it.
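As a rough illustration, here's a userland toy (a sketch only: a
pthread mutex stands in for Giant, and the thread counts and timings
are invented to mimic the numbers above, so don't read too much into
the exact figures):

/*
 * Toy model of the effect described above.  This is userland pthreads
 * code, not kernel code.  Three "fast" threads take a shared lock
 * (standing in for Giant) about 500 times a second with a near-zero
 * hold time; one "slow" thread holds it for ~0.1 ms per pass.  Watch
 * how much total time the fast threads end up waiting.
 *
 * Build: cc -o giant_toy giant_toy.c -lpthread
 */
#include <sys/time.h>

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t giant = PTHREAD_MUTEX_INITIALIZER;

static double
now(void)
{
	struct timeval tv;

	gettimeofday(&tv, NULL);
	return (tv.tv_sec + tv.tv_usec / 1e6);
}

/* Stand-in for setiathome: many very cheap lock acquisitions. */
static void *
fast_thread(void *arg)
{
	double t, waited;
	int i;

	waited = 0.0;
	for (i = 0; i < 2500; i++) {
		t = now();
		pthread_mutex_lock(&giant);
		waited += now() - t;
		pthread_mutex_unlock(&giant);	/* near-zero hold time */
		usleep(2000);			/* ~500 acquisitions/sec */
	}
	printf("fast thread %ld waited %.3f sec total for the lock\n",
	    (long)arg, waited);
	return (NULL);
}

/* Stand-in for ssh: holds the lock for at least 0.1 ms per pass. */
static void *
slow_thread(void *arg)
{
	int i;

	for (i = 0; i < 2500; i++) {
		pthread_mutex_lock(&giant);
		usleep(100);			/* hold "Giant" ~0.1 ms */
		pthread_mutex_unlock(&giant);
		usleep(1000);
	}
	return (arg);
}

int
main(void)
{
	pthread_t fast[3], slow;
	long i;

	for (i = 0; i < 3; i++)
		pthread_create(&fast[i], NULL, fast_thread, (void *)i);
	pthread_create(&slow, NULL, slow_thread, NULL);
	for (i = 0; i < 3; i++)
		pthread_join(fast[i], NULL);
	pthread_join(slow, NULL);
	return (0);
}

Each "fast" thread holds the lock for essentially no time at all, yet
once the "slow" thread starts taking it for ~0.1 ms at a stretch, the
fast threads' total wait time balloons.  And a few seconds of
"truss -p 42946" (or any of the other seti pids from your top output)
would show whether those processes really are entering the kernel that
often.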
-- 
	Dan Nelson
	dnelson@allantgroup.com