Date: Wed, 6 Jul 2011 11:00:01 -0700
From: Steve Kargl
To: Poul-Henning Kamp
Cc: FreeBSD Current, "Hartmann, O.", arrowdodger <6yearold@gmail.com>, freebsd-questions@freebsd.org
Subject: Re: Heavy I/O blocks FreeBSD box for several seconds
Message-ID: <20110706180001.GA69157@troutmask.apl.washington.edu>
In-Reply-To: <5080.1309971941@critter.freebsd.dk>

On Wed, Jul 06, 2011 at 05:05:41PM +0000, Poul-Henning Kamp wrote:
> In message <20110706170132.GA68775@troutmask.apl.washington.edu>, Steve Kargl writes:
>
> >I periodically ran the same type test in the 2008 post over the
> >last three years.  Nothing has changed.  I even set up an account
> >on one node in my cluster for jeffr to use.  He was too busy to
> >investigate at that time.
>
> Isn't this just the lemming-syncer hurling every dirty block over
> the cliff at the same time ?

I don't know the answer.  Of course, having no experience in
process scheduling, I don't understand the question either ;-)

AFAICT, it is a cpu affinity issue.  If I launch n+1 MPI images
on a system with n cpus/cores, then 2 (and sometimes 3) images
are stuck on a cpu and those 2 (or 3) images ping-pong on that
cpu.  I recall trying to use renice(8) to force some load
balancing, but vaguely remember that it did not help.

> To find out: Run gstat and keep an eye on the leftmost column
>
> The road map for fixing that has been known for years...

I'll keep this in mind the next time I upgrade the cluster.
It's currently running a Feb 10th vintage kernel, and is under
fairly heavy use at the moment.

-- 
Steve
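
The n+1 images on n cpus arithmetic can be sketched roughly as follows. This is a toy model, not the FreeBSD scheduler: it assumes every image is purely CPU-bound, that images sharing a cpu split it evenly, and that the whole MPI job advances at the pace of its slowest image. The function names are made up for illustration.

```python
from fractions import Fraction


def cpu_shares_sticky(n_cpus, n_images):
    """CPU share per image when images are placed round-robin and never migrate.

    Assumption: images that land on the same cpu split it evenly.
    """
    per_cpu = [0] * n_cpus
    placement = []
    for i in range(n_images):
        cpu = i % n_cpus
        placement.append(cpu)
        per_cpu[cpu] += 1
    return [Fraction(1, per_cpu[cpu]) for cpu in placement]


def cpu_shares_balanced(n_cpus, n_images):
    """CPU share per image under idealized, perfectly even load balancing."""
    usable = Fraction(min(n_cpus, n_images))
    return [usable / n_images] * n_images


def slowdown(shares):
    """Job slowdown relative to one image per cpu.

    Assumption: the job is gated by its slowest image.
    """
    return 1 / min(shares)


if __name__ == "__main__":
    n = 4  # hypothetical core count
    sticky = cpu_shares_sticky(n, n + 1)
    balanced = cpu_shares_balanced(n, n + 1)
    print("sticky shares:  ", [str(s) for s in sticky], "slowdown", slowdown(sticky))
    print("balanced shares:", [str(s) for s in balanced], "slowdown", slowdown(balanced))
```

Under these assumptions, two images stuck on one cpu each get half of it, so the job runs at about half speed no matter how large n is, whereas even balancing would cost only a factor of (n+1)/n.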