From: Vlad Galu
Date: Thu, 7 Jul 2011 17:23:17 +0200
To: Steve Kargl
Cc: FreeBSD Current, "Hartmann, O.", Nathan Whitehorn, Andriy Gapon
Subject: Re: Heavy I/O blocks FreeBSD box for several seconds
In-Reply-To: <20110707151440.GA75537@troutmask.apl.washington.edu>

On Thu, Jul 7, 2011 at 5:14 PM, Steve Kargl <sgk@troutmask.apl.washington.edu> wrote:

> On Thu, Jul 07, 2011 at 10:27:53AM +0300, Andriy Gapon wrote:
> > on 06/07/2011 21:11 Nathan Whitehorn said the following:
> > > On 07/06/11 13:00, Steve Kargl wrote:
> > >> AFAICT, it is a cpu affinity issue.
> > >> If I launch n+1 MPI images
> > >> on a system with n cpus/cores, then 2 (and sometimes 3) images
> > >> are stuck on a cpu and those 2 (or 3) images ping-pong on that
> > >> cpu. I recall trying to use renice(8) to force some load
> > >> balancing, but vaguely remember that it did not help.
> > >
> > > I've seen exactly this problem with multi-threaded math libraries, as well.
> >
> > Exactly the same? Let's see.
> >
> > > Using parallel GotoBLAS on FreeBSD gives terrible performance because the
> > > threads keep migrating between CPUs, causing frequent cache misses.
> >
> > So Steve reports that if he has Nthr > Ncpu, then some threads are "over-glued"
> > to a particular CPU, which results in sub-optimal scheduling for those threads.
> > I have to guess that Steve would want to see the threads being shuffled between
> > CPUs to produce more even CPU load.
>
> I'm using OpenMPI. These are N > Ncpu processes, not threads, and without
> loss of generality let N = Ncpu + 1. It is a classic master-slave
> situation where 1 process initializes all others. The N - 1 slave processes
> are then independent of each other. After 20 minutes or so of number
> crunching, each slave sends a few 10s of KB of data to the master. The
> master collects all the data, writes it to disk, and then sends the
> slaves the next set of computations to do. The computations are nearly
> identical, so each slave finishes its task in the same amount of time. The
> problem appears to be that 2 slaves are bound to the same cpu and each of
> the remaining N - 3 slaves is bound to its own cpu. The N - 3 slaves
> finish their task, send data to the master, and then spin (chewing up
> nearly 100% cpu) waiting for the 2 ping-ponging slaves to finish.
> This causes a stall in the computation. When a complete computation
> takes days to complete, these stalls become problematic. So, yes, I
> want the processes to get more uniform access to cpus via migration
> to other cpus.
> This is what 4BSD appears to do.

Spinning threads are a PITA for any scheduler; it's just that in your case
4BSD computes quantums differently. Is there any way to make the software
sleep instead of spinning?

> > On the other hand, you report that your threads keep being shuffled between CPUs
> > (I presume for Nthr == Ncpu case, where Nthr is a count of the number-crunching
> > threads). And I guess that you want them to stay glued to particular CPUs.
> >
> > So how is this the same problem? In fact, it sounds like somewhat the opposite.
> > The only thing in common is that you both don't like how ULE works.
>
> Well, it may be similar in that N - 2 threads are bound to N - 2
> cpus, and the remaining 2 threads are ping-ponging on the last
> remaining cpu. I suspect that GotoBLAS has a large amount of
> communication between threads, and once again the computation
> stalls waiting for the 2 threads to either finish battling for the
> 1 cpu or perhaps the process uses pthread_yield() in some clever
> way to try to get load balancing.
>
> --
> Steve

--
Good, fast & cheap. Pick any two.
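
A note on the "sleep instead of spinning" question above: a minimal sketch of the idea, assuming the wait can be expressed as a non-blocking poll (with MPI, a probe such as MPI_Iprobe is the usual candidate). The function name wait_sleeping and its parameters are hypothetical, made up for illustration. This is not how Open MPI implements its blocking receive, and it has not been tried against the workload described in this thread.

```python
import time


def wait_sleeping(poll, initial_delay=0.0005, max_delay=0.05, timeout=None,
                  sleep=time.sleep, clock=time.monotonic):
    """Wait until poll() returns something other than None.

    Hypothetical helper: instead of re-checking in a tight loop (which keeps
    a CPU near 100% busy), sleep between checks and double the sleep each
    time up to max_delay, so an idle waiter gives the CPU back.
    """
    delay = initial_delay
    deadline = None if timeout is None else clock() + timeout
    while True:
        result = poll()              # non-blocking check, e.g. a probe
        if result is not None:
            return result
        if deadline is not None and clock() >= deadline:
            raise TimeoutError("nothing arrived before the timeout")
        sleep(delay)                 # yield the CPU instead of spinning
        delay = min(delay * 2, max_delay)
```

The cost is latency: a waiter may notice the next work unit up to max_delay late. With work units of 20 minutes or so, a delay of a few tens of milliseconds is negligible, while the CPU that would otherwise be spun on becomes available to the two processes sharing one.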