Date:      Thu, 22 Dec 2011 11:47:40 -0800
From:      Steve Kargl <sgk@troutmask.apl.washington.edu>
To:        Andriy Gapon <avg@FreeBSD.org>
Cc:        freebsd-stable@FreeBSD.org
Subject:   Re: SCHED_ULE should not be the default
Message-ID:  <20111222194740.GA36796@troutmask.apl.washington.edu>
In-Reply-To: <4EF37E7B.4020505@FreeBSD.org>
References:  <4EE1EAFE.3070408@m5p.com> <CAJ-FndBSOS3hKYqmPnVkoMhPmowBBqy9-%2BeJJEMTdoVjdMTEdw@mail.gmail.com> <20111215215554.GA87606@troutmask.apl.washington.edu> <CAJ-FndD0vFWUnRPxz6CTR5JBaEaY3gh9y7-Dy6Gds69_aRgfpg@mail.gmail.com> <20111222005250.GA23115@troutmask.apl.washington.edu> <20111222103145.GA42457@onelab2.iet.unipi.it> <20111222184531.GA36084@troutmask.apl.washington.edu> <4EF37E7B.4020505@FreeBSD.org>

On Thu, Dec 22, 2011 at 09:01:15PM +0200, Andriy Gapon wrote:
> on 22/12/2011 20:45 Steve Kargl said the following:
> > I've used schedgraph to look at the ktrdump output.  A jpg is
> > available at http://troutmask.apl.washington.edu/~kargl/freebsd/ktr.jpg
> > This shows the ping-pong effect where here 3 processes appear to be
> > using 2 cpus while the remaining 2 processes are pinned to their
> > cpus.
> 
> I'd recommended enabling CPU-specific background colors via the menu in
> schedgraph for a better illustration of your findings.
> 
> NB: I still don't understand the point of purposefully running N+1 CPU-bound
> processes.
> 

The point is that this is a node in an HPC cluster with
multiple users.  Sure, I can start my job on this node
with only N cpu-bound jobs.  But when user John Doe
wants to run his OpenMPI program, should he have to log
into all 12 nodes in the cluster to check whether someone
is already running N cpu-bound jobs on a given node?  4BSD
gives my jobs and John Doe's jobs a fair share of the
available cpus.  ULE does not give a fair share, and
if you read the summary file I put up on the web,
you see that it is fairly non-deterministic when an
OpenMPI run will finish (see the mean absolute deviations
in the table of 'real' times that I posted).
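[Illustrative aside, not part of the original email: the fairness argument above can be made concrete with back-of-the-envelope numbers.  The CPU counts and the 0.5/1.0 split below are assumed for illustration, not taken from the posted tables.  With N+1 cpu-bound jobs on N cpus, a fair scheduler gives every job N/(N+1) of a cpu, so all jobs finish together; if two jobs instead ping-pong on one cpu while the rest each own a cpu, the per-job shares spread out, which is exactly what a large mean absolute deviation in 'real' times reflects.]

```python
# Sketch: per-job CPU shares for 5 cpu-bound jobs on 4 cpus.
ncpu, njobs = 4, 5

# Fair share (4BSD-like): every job gets ncpu/njobs of a cpu.
fair = [ncpu / njobs] * njobs            # 0.8 each

# Ping-pong pattern described above: job0 and job1 share cpu0
# (0.5 each), the other three jobs each own a cpu outright.
ule = [0.5, 0.5, 1.0, 1.0, 1.0]

def mad(shares):
    """Mean absolute deviation from the mean share."""
    m = sum(shares) / len(shares)
    return sum(abs(s - m) for s in shares) / len(shares)

print(mad(fair))  # 0.0  -- identical shares, identical finish times
print(mad(ule))   # 0.24 -- the two ping-ponging jobs finish far later
```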

There is the additional observation in one of my 2008
emails (URLs have been posted) that if you have N+1
cpu-bound jobs with, say, job0 and job1 ping-ponging
on cpu0 (due to ULE's cpu-affinity feature), and I then
kill job2 running on cpu1, neither job0 nor job1
will migrate to cpu1.  So, one now has N cpu-bound
jobs running on N-1 cpus.

Finally, my initial post in this email thread was to
tell O. Hartman to quit beating his head against
a wall with ULE (in an HPC environment) and switch to
4BSD.  This was based on my 2008 observations, and
I've now wasted 2 days gathering additional information
which only reaffirms my recommendation.
 
-- 
Steve
