Date: Tue, 24 May 2005 07:20:49 -0400
From: "Peter C. Lai" <sirmoo@cowbert.2y.net>
To: freebsd-smp@freebsd.org
Subject: 5.4-R ULE performance and MPI
Message-ID: <20050524112049.GF608@cowbert.2y.net>
On 5.4-R, the 4BSD scheduler appears to be much faster than the ULE scheduler, all else being equal, when an application is parallelized with MPI.

The hardware is a dual Pentium III machine. The application we are using to benchmark this is the science/gromacs molecular dynamics simulation port, custom built to work with the single-precision floating point configuration of the math/fftw and net/mpich ports. None of these are thread-safe, so we do not link against a threading library.

In repeated runs of the same initial conditions of a particular test simulation, gromacs reports about 350 MFLOPS under ULE versus about 700 MFLOPS under 4BSD, and the total CPU time used is ~230s on ULE versus 118s on 4BSD. Using top(1), we notice that under ULE the two gromacs processes are unable to fully use the two CPUs: the IPC causes them both to request the same CPU about half the time, so each process runs at about 50% all the time. My guess is that ULE is spin locking such that one process effectively blocks the other. I don't have time(1) data to show the context switching, though (a quick way to collect it is sketched below my signature).

My cursory googling didn't turn up anything related to this, so I was wondering if you already know about this issue.

Thanks,
Peter

--
Peter C. Lai
University of Connecticut
Dept. of Molecular and Cell Biology
Yale University School of Medicine SenseLab | Research Assistant
http://cowbert.2y.net/
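P.S. In case the context-switch counts would be useful to anyone trying to reproduce this, here is a minimal sketch of a wrapper that runs a command and reads the counters out of wait4(2), i.e. the same numbers time(1) with -l reports. It is an untested sketch; the "csw" name and the mdrun invocation in the comments are only illustrative.

/*
 * csw.c -- illustrative helper (not part of gromacs or mpich): run a
 * command and report the context-switch counts accumulated by the child
 * and the descendants it waited for.
 *
 * Build: cc -o csw csw.c
 * Use:   ./csw mpirun -np 2 mdrun ...
 */
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <stdio.h>
#include <unistd.h>

int
main(int argc, char *argv[])
{
	struct rusage ru;
	pid_t pid;
	int status;

	if (argc < 2) {
		fprintf(stderr, "usage: %s command [args ...]\n", argv[0]);
		return (1);
	}

	pid = fork();
	if (pid == -1) {
		perror("fork");
		return (1);
	}
	if (pid == 0) {
		/* child: run the requested command */
		execvp(argv[1], &argv[1]);
		perror("execvp");
		_exit(127);
	}

	/* parent: wait4() fills in the resource usage of the exited child */
	if (wait4(pid, &status, 0, &ru) == -1) {
		perror("wait4");
		return (1);
	}

	printf("voluntary context switches:   %ld\n", ru.ru_nvcsw);
	printf("involuntary context switches: %ld\n", ru.ru_nivcsw);
	return (0);
}

A large involuntary count under ULE relative to 4BSD for the same run would be consistent with the two processes fighting over one CPU.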