From: Jeffrey Racine
To: obrien@freebsd.org
Cc: freebsd-amd64@freebsd.org, freebsd-cluster@freebsd.org
Subject: Re: LAM MPI on dual processor opteron box sees only one cpu...
Date: Wed, 21 Apr 2004 07:50:18 -0400
Organization: Syracuse University
Message-Id: <1082548217.31726.1.camel@x1-6-00-b0-d0-c2-67-0e.twcny.rr.com>
In-Reply-To: <20040420033208.GB98258@dragon.nuxi.com>

Hi David.

It runs fine with the 4BSD scheduler and distributes the load evenly
across both CPUs. Here is top with 4BSD doing the scheduling:

  PID USERNAME PRI NICE  SIZE   RES STATE  C   TIME   WCPU    CPU COMMAND
 6105 jracine  107    0 5616K 1968K CPU0   0   0:06 95.70% 21.19% n_lam
 6104 jracine  107    0 5632K 2012K RUN    1   0:06 95.48% 21.14% n_lam

Thanks ever so much for your kind response.

-- Jeff

On Mon, 2004-04-19 at 23:32, David O'Brien wrote:
> On Mon, Apr 12, 2004 at 09:04:24AM -0400, Jeffrey Racine wrote:
> > Hi Roland.
> >
> > I do get CPU #1 launched. This is not the problem.
> >
> > The problem appears to be with the way that current is scheduling.
> >
> > With mpirun np 2 I get the job running on CPU 0 (two instances on one
> > proc). However, it turns out that with np 4 I get the job running on CPU
> > 0 and 1 though with 4 instances (and associated overhead). Here is top
> > for np 4... notice that in the C column it is using both procs.
> >
> >   PID USERNAME PRI NICE  SIZE   RES STATE  C   TIME   WCPU    CPU COMMAND
> > 96090 jracine  131    0 7148K 2172K CPU1   1   0:19 44.53% 44.53% n_lam
> > 96088 jracine  125    0 7148K 2172K RUN    0   0:18 43.75% 43.75% n_lam
> > 96089 jracine  136    0 7148K 2172K RUN    1   0:19 42.19% 42.19% n_lam
> > 96087 jracine  135    0 7188K 2248K RUN    0   0:19 41.41% 41.41% n_lam
> >
> > One run (once when I rebooted lam) did allocate the job correctly with
> > np 2, but this is not in general the case. On other systems I use,
> > however, they correctly farm out np 2 to CPU 0 and 1...
> >
> > Thanks, and any suggestions welcome.
>
> 1. Please don't top-post -- it loses context. This is a Unix list, not
>    a Mikeysoft one.
>
> 2. Have you tried with the 4.4BSD scheduler vs. the "ULE" scheduler?
>    To test, replace:
>        options SCHED_ULE   # ULE scheduler
>    with
>        options SCHED_4BSD  # 4BSD scheduler
>
> -- David
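
For anyone comparing runs like the ones quoted in this thread, the check
being done by eye on the C column of top can be sketched in a few lines of
Python. This is only an illustration: the function name tally_by_cpu and the
assumption that STATE is a single whitespace-free token are mine, not part of
top or LAM, and the sample text is the np 4 listing from the quoted message.

```python
from collections import Counter

def tally_by_cpu(top_output, command="n_lam"):
    """Count processes named `command` per CPU, using top's C column.

    Hypothetical helper: it locates the C column from the header line and
    assumes every field (including STATE) contains no embedded spaces.
    """
    counts = Counter()
    c_index = None
    for line in top_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "PID":
            # Header row: remember where the C (last CPU) column sits.
            c_index = fields.index("C")
            continue
        if c_index is None or len(fields) <= c_index:
            continue
        if fields[-1] == command and fields[c_index].isdigit():
            counts[int(fields[c_index])] += 1
    return dict(counts)

# The np 4 listing quoted earlier in the thread.
SAMPLE = """\
  PID USERNAME PRI NICE  SIZE   RES STATE  C   TIME   WCPU    CPU COMMAND
96090 jracine  131    0 7148K 2172K CPU1   1   0:19 44.53% 44.53% n_lam
96088 jracine  125    0 7148K 2172K RUN    0   0:18 43.75% 43.75% n_lam
96089 jracine  136    0 7148K 2172K RUN    1   0:19 42.19% 42.19% n_lam
96087 jracine  135    0 7188K 2248K RUN    0   0:19 41.41% 41.41% n_lam
"""

if __name__ == "__main__":
    for cpu, n in sorted(tally_by_cpu(SAMPLE).items()):
        print(f"CPU {cpu}: {n} n_lam process(es)")
```

A tally with only one key for an np 2 run would correspond to the symptom
described above (both instances on one processor). Since C reflects where a
process last ran at the moment of the snapshot, several samples would be
needed before drawing conclusions.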