From: "Brian Szymanski"
To: "Rob"
Cc: freebsd-stable@lists.freebsd.org
Date: Tue, 23 Nov 2004 15:31:12 -0500 (EST)
Subject: Re: make -j$n buildworld : use of -j investigated

Did you try any machines that used Hyperthreading? I'd be interested to
see how those machines fare based on the number of logical and real CPUs.

> Although people suggest "-j4" as optimal in general
> case, I have come to a very different conclusion:
>
> 1) single CPU with enough RAM (2 GHz, 512 MB)
>    there's no significant speed up in the range
>    "-j1" to "-j9".
>    So "-j1" is as good as "-j9".

If you went to all that trouble, you might as well post the numbers :-)

> 2) single CPU with little RAM (333 MHz, 64 MB)
>    speed slows down rapidly from "-j1" to "-j9",
>    because of intensive swapping.
>    So "-j1" performs best in this case.

This is expected. A note should probably be added to the handbook giving
rough approximations of how much memory per simultaneous process is
necessary for optimal performance. I'd guess 48MB * p + c, where c = the
machine's memory load while idle and p = the number of compile processes
(most don't take nearly that much memory, but C++ can gobble it).

> 3) dual CPU with enough RAM (2 x 800 MHz, 1GB)
>    speed up by almost two from "-j1" to "-j2",
>    but after that no noticeable speed up anymore.
>    So "-j2" is as good as "-j9".

Again, you went to the trouble, post the numbers?

> With these simple tests, I come to the conclusion that
> "make -j$n buildworld" is best with n = number of CPUs.
> Does that make sense?

Sort of. It depends on more than just the number of CPUs. IO speed is
also very important. If you're using NFS over non-gigabit ethernet or to
a slow NFS server, it's worth ratcheting the number of jobs up. The same
would go for old slow disks, or if you have /usr/src union-mounted from a
cdrom drive, etc. Also disk layout: having /usr/src on a different drive
from /usr/obj can speed up the IO-bound portions of the process a great
deal by eliminating contention.
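
For illustration, the memory guess from earlier in this reply (48MB * p + c)
can be turned around to ask how many compile processes fit before swapping
starts. This is only a sketch: the function name and the idle-load figures
below are assumptions made up for the example; only the 48MB guess and the
64 MB / 512 MB machine sizes come from the thread.

```python
def max_jobs_for_memory(total_mb, idle_mb, per_job_mb=48):
    """Largest p with per_job_mb * p + idle_mb <= total_mb, but never below 1.

    per_job_mb defaults to the 48MB-per-process guess from this thread.
    idle_mb is the machine's memory load while idle (the 'c' term).
    """
    available = total_mb - idle_mb
    return max(1, available // per_job_mb)


# ASSUMPTION: the idle loads (32 MB and 64 MB) are invented for the example.
print(max_jobs_for_memory(64, 32))    # 64 MB machine  -> 1
print(max_jobs_for_memory(512, 64))   # 512 MB machine -> 9
```

This gives an upper bound from memory alone; it says nothing about whether
that many jobs would actually be faster.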
If you do less waiting for IO, adding more jobs has a less pronounced or
even negative effect, due to CPU contention instead of the positive "work
while the other job waits on IO" effect. This is the basic underlying
principle, which the handbook doesn't really point out.

Seems to me the pluses and minuses of increasing n are:
+ More chances to do work when other processes are waiting on IO.
- CPU contention resulting in context switches and other wasted cycles
  due to extra scheduling overhead (probably negligible, maybe
  significant with high HZ in kernel config).
- Memory contention (aka usage).

It might be worth decreasing the number recommended somewhat, but I think
j = ncpu is too small for a general recommendation, because unless you
are memory tight there is very little harm in increasing the number. I'd
suspect j = 2 * ncpu or even j = ncpu + 1 are better rules of thumb.

A better formula would take average IO throughput and latency rates from
bonnie++, amount of available memory, and the number and speed of CPUs. A
perl script that measures these numbers and determines the optimal
setting is left as an exercise to the reader. Extra credit - code it in C
and get it integrated in -CURRENT so that "make buildworld" automagically
calls "make -j$n real_buildworld" with the optimal value of n :-)

My results, for what it's worth:

Specs: Athlon XP 2500+, 512M of 333MHz DDR RAM.
/usr/obj is a gvinum raid0 (striped) volume of two SATA disks.
/usr/src is on a gvinum raid1 (mirrored) volume of two PATA disks.
options HZ=1000 in the kernel config, pretty vanilla besides that.
In make.conf:
CFLAGS=-O2 -pipe -march=athlon-xp
CXXFLAGS empty due to a bug with memoization last time I tried a compile...

make -j1 buildworld:
real    64m54.298s
user    52m56.915s
sys     9m13.041s

make -j2 buildworld:
real    67m55.816s
user    56m20.778s
sys     10m20.247s

make -j3 buildworld:
real    70m53.936s
user    59m2.447s
sys     10m43.325s

make -j4 buildworld:
real    72m25.904s
user    60m19.098s
sys     10m59.492s

-- 
Brian Szymanski
ski@indymedia.org
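
To compare the runs above at a glance, the "real" times can be converted
to seconds and expressed relative to the -j1 run. A minimal sketch; the
helper names are made up for this example, and the only data used are the
four "real" values reported above.

```python
import re


def to_seconds(stamp):
    """Convert a time(1)-style string such as '64m54.298s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if m is None:
        raise ValueError(f"unrecognised time: {stamp!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


# 'real' (wall-clock) times from the runs above, keyed by the -j value.
REAL = {1: "64m54.298s", 2: "67m55.816s", 3: "70m53.936s", 4: "72m25.904s"}


def ratio_vs_j1(real=REAL):
    """Wall-clock time of each run divided by the -j1 wall-clock time."""
    base = to_seconds(real[1])
    return {j: to_seconds(t) / base for j, t in real.items()}


for j, ratio in ratio_vs_j1().items():
    print(f"-j{j}: {ratio:.3f}x the -j1 wall-clock time")
```

On this single-CPU machine the ratios come out at roughly 5%, 9% and 12%
slower than -j1 for -j2, -j3 and -j4 respectively.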