Date: Wed, 16 Jul 2008 14:13:17 -0700
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Barney Cordoba <barney_cordoba@yahoo.com>
Cc: current@freebsd.org
Subject: Re: ULE scheduling oddity
Message-ID: <20080716211317.GA92354@troutmask.apl.washington.edu>
In-Reply-To: <565436.13205.qm@web63915.mail.re1.yahoo.com>
References: <20080715175944.GA80901@troutmask.apl.washington.edu> <565436.13205.qm@web63915.mail.re1.yahoo.com>

On Wed, Jul 16, 2008 at 07:49:03AM -0700, Barney Cordoba wrote:
> --- On Tue, 7/15/08, Steve Kargl <sgk@troutmask.apl.washington.edu> wrote:
> > last pid: 3874; load averages: 9.99, 9.76, 9.43 up 0+19:54:44 10:51:18
> > 41 processes: 11 running, 30 sleeping
> > CPU: 100% user, 0.0% nice, 0.0% system, 0.0% interrupt, 0.0% idle
> > Mem: 5706M Active, 8816K Inact, 169M Wired, 84K Cache, 108M
> > Buf, 25G Free
> > Swap: 4096M Total, 4096M Free
> >
> >   PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME    WCPU COMMAND
> >  3836 kargl       1 118    0   577M   572M CPU7   7   6:37 100.00% kzk90
> >  3839 kargl       1 118    0   577M   572M CPU2   2   6:36 100.00% kzk90
> >  3849 kargl       1 118    0   577M   572M CPU3   3   6:33 100.00% kzk90
> >  3852 kargl       1 118    0   577M   572M CPU0   0   6:25 100.00% kzk90
> >  3864 kargl       1 118    0   577M   572M RUN    1   6:24 100.00% kzk90
> >  3858 kargl       1 112    0   577M   572M RUN    5   4:10  78.47% kzk90
> >  3855 kargl       1 110    0   577M   572M CPU5   5   4:29  67.97% kzk90
> >  3842 kargl       1 110    0   577M   572M CPU4   4   4:24  66.70% kzk90
> >  3846 kargl       1 107    0   577M   572M RUN    6   3:22  53.96% kzk90
> >  3861 kargl       1 107    0   577M   572M CPU6   6   3:15  53.37% kzk90
> >
> > I would have expected to see a more evenly distributed WCPU
> > of around 80% for each process.
>
> I don't see why "equal" distribution is or should be a goal, as that
> does not guarantee optimization.

The above images may be parts of an MPI application. Synchronization
problems simply kill performance. The PIDs with 100% WCPU could be
spinning in a loop waiting for PID 3861 to send a message after
completing a computation. The factor of 2 difference in TIME for
PID 3836 and 3861 was still observed after more than an hour of
accumulated time for 3836. It appears as if the algorithm for cpu
affinity is punishing 3846 and 3861.

> Given that the cache is shared between only 2 cpus, it might very well
> be more efficient to run on 2 CPUs when the 3rd or 4th isn't needed.
>
> It works pretty darn well, IMO. Its not like your little app is the
> only thing going on in the system

Actually, 10 copies of the little app are the only things running
except top(1) and a few sleeping system services (e.g., nfsd and sshd).
Apparently, you missed the "41 processes: 11 running, 30 sleeping"
line above.

-- 
Steve
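
For reference, the arithmetic behind the "around 80%" expectation, and the cost of the observed imbalance for a synchronized job, can be sketched as follows. This is an illustration only. It assumes 8 CPUs (inferred from the CPU0 through CPU7 states in the top(1) output, not stated explicitly in the thread) and assumes every process does the same amount of work between synchronization points, which is the hypothetical MPI scenario described above rather than a confirmed property of kzk90.

```python
# Assumption: 8 CPUs, inferred from the C column (0..7) in the top(1) output.
NCPU = 8
NPROC = 10

# With 10 CPU-bound processes sharing 8 CPUs evenly, each would get 80%.
fair_share = 100.0 * NCPU / NPROC

# WCPU values copied from the quoted top(1) output, keyed by PID.
wcpu = {
    3836: 100.00, 3839: 100.00, 3849: 100.00, 3852: 100.00, 3864: 100.00,
    3858: 78.47, 3855: 67.97, 3842: 66.70, 3846: 53.96, 3861: 53.37,
}

total = sum(wcpu.values())
slowest_pid = min(wcpu, key=wcpu.get)
slowest = wcpu[slowest_pid]

# Assumption: equal work per process per step, with a barrier at the end.
# The step then takes as long as the slowest process, so the slowdown
# relative to a perfectly even split is fair_share / slowest.
slowdown_vs_fair = fair_share / slowest

print(f"fair share per process: {fair_share:.1f}%")
print(f"sum of reported WCPU:   {total:.2f}%")
print(f"slowest process:        PID {slowest_pid} at {slowest:.2f}%")
print(f"slowdown vs even split: {slowdown_vs_fair:.2f}x")
```

Under these assumptions, the processes shown at 100% gain nothing for the job as a whole: a synchronized step completes only when the process with the smallest share finishes.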