Date: Thu, 22 Dec 2011 11:47:40 -0800
From: Steve Kargl
To: Andriy Gapon
Cc: freebsd-stable@FreeBSD.org
Subject: Re: SCHED_ULE should not be the default
Message-ID: <20111222194740.GA36796@troutmask.apl.washington.edu>
In-Reply-To: <4EF37E7B.4020505@FreeBSD.org>
References: <4EE1EAFE.3070408@m5p.com>
 <20111215215554.GA87606@troutmask.apl.washington.edu>
 <20111222005250.GA23115@troutmask.apl.washington.edu>
 <20111222103145.GA42457@onelab2.iet.unipi.it>
 <20111222184531.GA36084@troutmask.apl.washington.edu>
 <4EF37E7B.4020505@FreeBSD.org>

On Thu, Dec 22, 2011 at 09:01:15PM +0200, Andriy Gapon wrote:
> on 22/12/2011 20:45 Steve Kargl said the following:
> > I've used schedgraph to look at the ktrdump output.  A jpg is
> > available at http://troutmask.apl.washington.edu/~kargl/freebsd/ktr.jpg
> > This shows the ping-pong effect where here 3 processes appear to be
> > using 2 cpus while the remaining 2 processes are pinned to their
> > cpus.
>
> I'd recommended enabling CPU-specific background colors via the menu in
> schedgraph for a better illustration of your findings.
>
> NB: I still don't understand the point of purposefully running N+1 CPU-bound
> processes.
>

The point is that this is a node in an HPC cluster with multiple
users.  Sure, I can start my job on this node with only N cpu-bound
jobs.  Now, when user John Doe wants to run his OpenMPI program,
should he log in to each of the 12 nodes in the cluster to see if
someone is already running N cpu-bound jobs on a given node?  4BSD
gives my jobs and John Doe's jobs a fair share of the available cpus.
ULE does not give a fair share, and if you read the summary file I put
up on the web, you will see that it is fairly nondeterministic as to
when an OpenMPI run will finish (see the mean absolute deviations in
the table of 'real' times that I posted).

There is the additional observation in one of my 2008 emails (URLs
have been posted) that if you have N+1 cpu-bound jobs with, say, job0
and job1 ping-ponging on cpu0 (due to ULE's cpu-affinity feature) and
I kill job2 running on cpu1, then neither job0 nor job1 will migrate
to cpu1.  So, one now has N cpu-bound jobs running on N-1 cpus.

Finally, my initial post in this email thread was to tell O. Hartman
to quit beating his head against a wall with ULE (in an HPC
environment).  Switch to 4BSD.  This was based on my 2008
observations, and I've now wasted 2 days gathering additional
information which only re-affirms my recommendation.

-- 
Steve
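
For readers who want the fair-share argument in numbers, here is a minimal sketch of the arithmetic. It is an idealized model, not a measurement and not the analysis behind the posted summary file. It assumes the case the schedgraph description suggests (5 cpu-bound jobs on 4 cpus) and assumes that jobs sharing a set of cpus split them evenly. The names cpu_shares and mean_abs_deviation are made up for this illustration.

```python
from fractions import Fraction


def cpu_shares(groups):
    """Return each job's share of one cpu.

    groups is a list of (n_jobs, n_cpus) pairs; the jobs in a group are
    assumed (hypothetically) to split that group's cpus evenly.
    """
    shares = []
    for n_jobs, n_cpus in groups:
        shares.extend([Fraction(n_cpus, n_jobs)] * n_jobs)
    return shares


def mean_abs_deviation(values):
    """Mean absolute deviation of the values about their mean."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


# Even split: all 5 jobs share all 4 cpus.
fair = cpu_shares([(5, 4)])

# Pattern described above: 3 jobs bounce between 2 cpus, while the
# other 2 jobs each keep a cpu to themselves.
pingpong = cpu_shares([(3, 2), (1, 1), (1, 1)])

# One pinned job is killed and nothing migrates: 4 jobs remain, but
# only 3 of the 4 cpus do any work.
after_kill = cpu_shares([(3, 2), (1, 1)])

for label, shares in (("fair", fair), ("ping-pong", pingpong),
                      ("after kill", after_kill)):
    print(label, [str(s) for s in shares],
          "cpus in use:", sum(shares),
          "MAD:", mean_abs_deviation(shares))
```

In this model the even split gives every job 4/5 of a cpu, while the ping-pong pattern gives three jobs 2/3 of a cpu each and two jobs a full cpu, so identical jobs would finish at different times. In the last case the shares sum to 3, which is the "N jobs on N-1 cpus" situation.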