From: Steve Kargl
To: Adrian Chadd
Cc: freebsd-stable@freebsd.org, Andriy Gapon
Date: Fri, 23 Dec 2011 11:11:46 -0800
Subject: Re: SCHED_ULE should not be the default
Message-ID: <20111223191146.GA56232@troutmask.apl.washington.edu>

On Thu, Dec 22, 2011 at 04:23:29PM -0800, Adrian Chadd wrote:
> On 22 December 2011 11:47, Steve Kargl wrote:
>
> > There is the additional observation in one of my 2008
> > emails (URLs have been posted) that if you have N+1
> > cpu-bound jobs with, say, job0 and job1 ping-ponging
> > on cpu0 (due to ULE's cpu-affinity feature) and if I
> > kill job2 running on cpu1, then neither job0 nor job1
> > will migrate to cpu1.  So, one now has N cpu-bound
> > jobs running on N-1 cpus.
>
> .. and this sounds like a pretty serious regression. Have you ever
> filed a PR for it?
>

Ah, so good news!  On the 4-cpu node, which is currently running
a ULE kernel, I cannot reproduce the problem that I saw 3+ years
ago.  When I killed the (N+1)th job, the N remaining jobs were
spread across the N cpus.

One difference between the 2008 tests and today's tests is the
number of available cpus.  In 2008, I ran the tests on a node
with 8 cpus, while today's test used a node with only 4 cpus.
If this behavior is a scaling issue, I can't currently test it.
But today's tests are certainly encouraging.

-- 
Steve
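
The scenario discussed above (N+1 cpu-bound jobs on N cpus, then one job killed) can be illustrated with a toy model. This is only a sketch and not ULE's actual algorithm: it assumes jobs on the same cpu split it evenly, and the names cpu_shares and rebalance are hypothetical helpers invented for the illustration. It compares the "sticky" outcome reported in 2008 with the rebalanced outcome seen in today's test.

```python
from collections import Counter


def cpu_shares(placement):
    """Return the fraction of a cpu each job gets.

    Assumption (not ULE's real policy): jobs placed on the same cpu
    share it equally.
    """
    load = Counter(placement.values())
    return {job: 1.0 / load[cpu] for job, cpu in placement.items()}


def rebalance(placement, ncpus):
    """Hypothetical balancer: move jobs from the busiest cpu to idle cpus.

    Returns a new placement; the input mapping is left unchanged.
    """
    placement = dict(placement)
    while placement:
        load = Counter(placement.values())
        idle = [c for c in range(ncpus) if load[c] == 0]
        busiest = max(load, key=load.get)
        # Stop once no cpu is idle or no cpu holds more than one job.
        if not idle or load[busiest] < 2:
            break
        job = next(j for j, c in sorted(placement.items()) if c == busiest)
        placement[job] = idle[0]
    return placement


if __name__ == "__main__":
    ncpus = 4
    # N+1 = 5 jobs on N = 4 cpus: job0 and job1 share cpu0.
    placement = {"job0": 0, "job1": 0, "job2": 1, "job3": 2, "job4": 3}

    # Kill job2, which was running alone on cpu1.
    del placement["job2"]

    # 2008 observation: nobody migrates, so cpu1 sits idle.
    print("sticky:    ", cpu_shares(placement))

    # Today's observation: the remaining N jobs spread across the N cpus.
    print("rebalanced:", cpu_shares(rebalance(placement, ncpus)))
```

In the sticky case the model leaves job0 and job1 sharing cpu0 while cpu1 is idle, which is the "N jobs on N-1 cpus" situation; after rebalancing every remaining job has a cpu to itself.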