From owner-freebsd-current@FreeBSD.ORG Tue Jul 15 18:11:07 2008
Date: Tue, 15 Jul 2008 13:11:05 -0500
From: Stephen Montgomery-Smith <stephen@math.missouri.edu>
To: Steve Kargl
Cc: freebsd-current@freebsd.org
Subject: Re: ULE scheduling oddity
In-Reply-To: <20080715175944.GA80901@troutmask.apl.washington.edu>

Steve Kargl wrote:
> It appears that the ULE scheduler is not providing a fair
> slice to running processes.
>
> I have a dual-cpu, quad-core Opteron-based system with
>
> node21:kargl[229] uname -a
> FreeBSD node21.cimu.org 8.0-CURRENT FreeBSD 8.0-CURRENT #3:
> Wed Jun 4 16:22:49 PDT 2008 kargl@node10.cimu.org:src/sys/HPC amd64
>
> If I start exactly 8 processes, each gets 100% WCPU according to
> top.  If I add two additional processes, then I observe
>
> last pid: 3874;  load averages: 9.99, 9.76, 9.43   up 0+19:54:44  10:51:18
> 41 processes:  11 running, 30 sleeping
> CPU: 100% user,  0.0% nice,  0.0% system,  0.0% interrupt,  0.0% idle
> Mem: 5706M Active, 8816K Inact, 169M Wired, 84K Cache, 108M Buf, 25G Free
> Swap: 4096M Total, 4096M Free
>
>   PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME    WCPU COMMAND
>  3836 kargl       1 118    0   577M   572M CPU7   7   6:37 100.00% kzk90
>  3839 kargl       1 118    0   577M   572M CPU2   2   6:36 100.00% kzk90
>  3849 kargl       1 118    0   577M   572M CPU3   3   6:33 100.00% kzk90
>  3852 kargl       1 118    0   577M   572M CPU0   0   6:25 100.00% kzk90
>  3864 kargl       1 118    0   577M   572M RUN    1   6:24 100.00% kzk90
>  3858 kargl       1 112    0   577M   572M RUN    5   4:10  78.47% kzk90
>  3855 kargl       1 110    0   577M   572M CPU5   5   4:29  67.97% kzk90
>  3842 kargl       1 110    0   577M   572M CPU4   4   4:24  66.70% kzk90
>  3846 kargl       1 107    0   577M   572M RUN    6   3:22  53.96% kzk90
>  3861 kargl       1 107    0   577M   572M CPU6   6   3:15  53.37% kzk90

My personal experience is that WCPU is not a very accurate measure of what
is really going on.  It is a weighted average of recent CPU usage, and
according to the top(1) man page you may have to wait up to a minute before
it settles to an accurate value.

What I tend to do is watch the TIME column and see how fast the entries
tick.  You can also run the programs like this:

  time ./kargl

and the times printed at the end tend to be a rather good measure of the
actual percentage of CPU time, although I can see that this might be tricky
to use in your situation.
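For instance (with made-up numbers, and assuming one of the CPU-bound jobs
is run to completion under /usr/bin/time rather than the shell built-in),
the output looks something like

  $ /usr/bin/time ./kzk90
       400.12 real       213.40 user         0.55 sys

and the fraction of a CPU the process actually received is roughly
(user + sys) / real, here (213.40 + 0.55) / 400.12, or about 53% -- much
the same sort of figure top shows for your slower processes.  (The
csh/tcsh built-in time prints such a percentage directly.)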
There is also a -C option to top that displays "raw CPU" time instead of
the weighted value.  I have never tried it, so I cannot speak to how well
it really works.

Stephen
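P.S.  If you want to try it, something like

  top -C

should do it -- if I am reading the man page correctly, this toggles top
out of "weighted CPU" mode, so the percentages shown are raw, unweighted
CPU usage.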