Date: Mon, 19 Apr 2004 20:32:08 -0700
From: "David O'Brien"
To: Jeffrey Racine
cc: freebsd-amd64@freebsd.org
cc: freebsd-cluster@freebsd.org
Subject: Re: LAM MPI on dual processor opteron box sees only one cpu...
Message-ID: <20040420033208.GB98258@dragon.nuxi.com>
In-Reply-To: <1081775064.990.13.camel@x1-6-00-b0-d0-c2-67-0e.twcny.rr.com>
Reply-To: obrien@freebsd.org
List-Id: Clustering FreeBSD

On Mon, Apr 12, 2004 at 09:04:24AM -0400, Jeffrey Racine wrote:
> Hi Roland.
>
> I do get CPU #1 launched. This is not the problem.
>
> The problem appears to be with the way that current is scheduling.
>
> With mpirun np 2 I get the job running on CPU 0 (two instances on one
> proc). However, it turns out that with np 4 I get the job running on CPU
> 0 and 1 though with 4 instances (and associated overhead). Here is top
> for np 4... notice that in the C column it is using both procs.
>
>   PID USERNAME PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
> 96090 jracine  131    0  7148K  2172K CPU1   1   0:19 44.53% 44.53% n_lam
> 96088 jracine  125    0  7148K  2172K RUN    0   0:18 43.75% 43.75% n_lam
> 96089 jracine  136    0  7148K  2172K RUN    1   0:19 42.19% 42.19% n_lam
> 96087 jracine  135    0  7188K  2248K RUN    0   0:19 41.41% 41.41% n_lam
>
>
> One run (once when I rebooted lam) did allocate the job correctly with
> np 2, but this is not in general the case. On other systems I use,
> however, they correctly farm out np 2 to CPU 0 and 1...
>
> Thanks, and any suggestions welcome.

1. Please don't top-post -- it loses context.  This is a Unix list, not
   a Mikeysoft one.

2. Have you tried with the 4.4BSD scheduler vs. the "ULE" scheduler?
   To test, replace:
        options         SCHED_ULE               # ULE scheduler
   with
        options         SCHED_4BSD              # 4BSD scheduler

-- 
-- David