Date: 26 Jun 2003 02:50:29 -0000
Message-ID: <20030626025029.71392.qmail@cr.yp.to>
From: "D. J. Bernstein"
To: freebsd-performance@freebsd.org
References: <20030625060629.51087.qmail@cr.yp.to> <20030625023621.N17881-100000@mail.chesapeake.net> <20030625094301.56349.qmail@cr.yp.to>
Subject: Re: ten thousand small processes

Jon Mini writes:
> I'm sorry, but you are way off here. First of all, caches are *much
> larger* than the size of the processes you are talking about.

I'm sorry, but you are being misled by a naive model of CPU performance.

On a typical Pentium in our department, the following program becomes
three times faster when SPACING is changed from 4096 to 128:

   #define SPACING 4096
   char data[8 * SPACING];

   int main(void)
   {
     int i;
     /* Eight accesses per iteration, each SPACING bytes apart. */
     for (i = 0;i < 10000000;++i) {
       data[0] = data[SPACING];
       data[2 * SPACING] = data[3 * SPACING];
       data[4 * SPACING] = data[5 * SPACING];
       data[6 * SPACING] = data[7 * SPACING];
     }
     return 0;
   }

From an asm programmer's perspective, when FreeBSD decides to spread a
small program's variables between

   * the beginning of a data page,
   * the beginning of a bss page,
   * the beginning of a malloc mmap page,
   * the beginning of a heap page,
   * the beginning of the next heap page,
   * the beginning of yet another heap page,

et cetera, it is actively trying (with varying degrees of success) to
damage cache performance in exactly the same way that this program does.

---D. J. Bernstein, Associate Professor, Department of Mathematics,
Statistics, and Computer Science, University of Illinois at Chicago
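
One way to see why the spacing matters is to count how many distinct cache sets the eight accesses land in. The sketch below is only an illustration under an assumed geometry that the message does not state: an 8 KB, 2-way set-associative data cache with 32-byte lines. The names set_index and distinct_sets are hypothetical helpers, and offsets are taken relative to the start of the array.

```c
#include <stddef.h>

/* ASSUMED cache geometry (not given in the original message):
   8 KB, 2-way set-associative, 32-byte lines. */
enum { LINE_SIZE = 32, WAYS = 2, CACHE_SIZE = 8192 };
enum { NUM_SETS = CACHE_SIZE / (LINE_SIZE * WAYS) };  /* 128 sets */

/* Cache set that a byte offset maps to under the assumed geometry. */
static size_t set_index(size_t offset)
{
    return (offset / LINE_SIZE) % NUM_SETS;
}

/* Number of distinct sets touched by the offsets
   0, spacing, 2*spacing, ..., (count-1)*spacing. */
static size_t distinct_sets(size_t spacing, size_t count)
{
    unsigned char seen[NUM_SETS] = {0};
    size_t i, n = 0;

    for (i = 0; i < count; ++i) {
        size_t s = set_index(i * spacing);
        if (!seen[s]) {
            seen[s] = 1;
            ++n;
        }
    }
    return n;
}
```

Under these assumptions, distinct_sets(4096, 8) is 1 and distinct_sets(128, 8) is 8. With a spacing of 4096 all eight locations compete for the two lines of a single set, so they keep evicting each other; with a spacing of 128 each location gets a set of its own. A cache with a different size, line length, or associativity would give different counts.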