From owner-freebsd-current@FreeBSD.ORG Thu Jul 7 07:13:39 2011
Message-ID: <4E155C9B.6020700@FreeBSD.org>
Date: Thu, 07 Jul 2011 10:13:31 +0300
From: Andriy Gapon <avg@FreeBSD.org>
To: Steve Kargl, Poul-Henning Kamp
Cc: FreeBSD Current, "Hartmann, O."
In-Reply-To: <20110706180001.GA69157@troutmask.apl.washington.edu>
References: <20110706170132.GA68775@troutmask.apl.washington.edu> <5080.1309971941@critter.freebsd.dk> <20110706180001.GA69157@troutmask.apl.washington.edu>
Subject: Re: Heavy I/O blocks FreeBSD box for several seconds
List-Id: Discussions about the use of FreeBSD-current

on 06/07/2011 21:00 Steve Kargl said the following:
> On Wed, Jul 06, 2011 at 05:05:41PM +0000, Poul-Henning Kamp wrote:
>> In message <20110706170132.GA68775@troutmask.apl.washington.edu>, Steve Kargl writes:
>>
>>> I periodically ran the same type of test in the 2008 post over the
>>> last three years.  Nothing has changed.  I even set up an account
>>> on one node in my cluster for jeffr to use.  He was too busy to
>>> investigate at that time.
>>
>> Isn't this just the lemming-syncer hurling every dirty block over
>> the cliff at the same time?
>
> I don't know the answer.  Of course, having no experience in
> process scheduling, I don't understand the question either ;-)

I think that Poul-Henning was speaking in the vein of the subject line,
where I/O is somehow involved.  I admit I would also love to hear more
details in more technical terms (without lemmings and cliffs) :-)

> AFAICT, it is a cpu affinity issue.  If I launch n+1 MPI images
> on a system with n cpus/cores, then 2 (and sometimes 3) images
> are stuck on a cpu and those 2 (or 3) images ping-pong on that
> cpu.  I recall trying to use renice(8) to force some load
> balancing, but vaguely remember that it did not help.

Your issue seems to be a specific case of a purely CPU-bound load.
It is very relevant to ULE, but perhaps not to this particular thread.

>> To find out: Run gstat and keep an eye on the leftmost column
>>
>> The road map for fixing that has been known for years...

I would love to hear more about this.  A link to a past discussion,
if any, would suffice.

--
Andriy Gapon
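[Editorial sketch] The n+1-images-on-n-cpus ping-pong Steve describes can be illustrated with a toy model: with sticky (affinity-preserving) placement, the pigeonhole principle forces two jobs onto one core, and each of those gets roughly half the throughput of the others. This is a hypothetical illustration of the symptom, not ULE's actual balancing code; `place_jobs` and `throughput` are invented names.

```python
# Toy model of the symptom reported in the thread: n+1 CPU-bound jobs on
# n cores, with a naive sticky placement policy.  Two jobs end up pinned
# to the same core and split its capacity, while every other job gets a
# whole core to itself.  This is NOT ULE's algorithm, just an illustration.

def place_jobs(n_jobs, n_cpus):
    """Assign each job to a CPU round-robin, as a naive affinity policy might."""
    return [job % n_cpus for job in range(n_jobs)]

def throughput(assignment, n_cpus):
    """Each CPU's capacity (1.0) is split evenly among the jobs stuck on it."""
    load = [assignment.count(cpu) for cpu in range(n_cpus)]
    return [1.0 / load[cpu] for cpu in assignment]

if __name__ == "__main__":
    n_cpus = 8
    shares = throughput(place_jobs(n_cpus + 1, n_cpus), n_cpus)
    # Jobs 0 and 8 share CPU 0 and each get 0.5; jobs 1-7 each get 1.0.
    print(shares)
```

Without periodic migration away from the overloaded core, the two co-located jobs stay there indefinitely, which matches the observation that renice(8) did not help: priority changes do not move a thread off its CPU.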
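[Editorial sketch] The leftmost column Poul-Henning points at is gstat(8)'s L(q), the number of I/O requests currently queued for the device. A sustained L(q) spike roughly every 30 seconds (the default syncer interval) would support the lemming-syncer theory. The snippet below filters gstat-style batch output for moments the queue backs up; the sample data and the device name `ada0` are fabricated for illustration.

```python
# gstat(8)'s leftmost column is L(q): the current I/O queue length.
# Scan gstat-style batch output and report rows where the queue is non-empty.
# SAMPLE is fabricated output for illustration only.

SAMPLE = """\
dT: 1.001s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0      0      0      0    0.0      0      0    0.0    0.0  ada0
  128   2100      4     64    1.2   2096  67072   48.9   99.7  ada0
"""

def queued(lines):
    """Yield (device, queue_depth) for rows where L(q) is non-zero."""
    for line in lines:
        fields = line.split()
        if fields and fields[0].isdigit() and int(fields[0]) > 0:
            yield fields[-1], int(fields[0])

if __name__ == "__main__":
    for dev, depth in queued(SAMPLE.splitlines()):
        print(f"{dev}: queue depth {depth}")
```

In practice one would feed this from `gstat -b` output on the affected box and look for queue-depth bursts that line up with the syncer's flush cycle.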