From owner-cvs-sys  Sun Nov 17 04:46:28 1996
Return-Path: owner-cvs-sys
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id EAA17611 for cvs-sys-outgoing; Sun, 17 Nov 1996 04:46:28 -0800 (PST)
Received: from godzilla.zeta.org.au (godzilla.zeta.org.au [203.2.228.19]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id EAA17602; Sun, 17 Nov 1996 04:46:21 -0800 (PST)
Received: (from bde@localhost) by godzilla.zeta.org.au (8.7.6/8.6.9) id XAA17549; Sun, 17 Nov 1996 23:41:10 +1100
Date: Sun, 17 Nov 1996 23:41:10 +1100
From: Bruce Evans
Message-Id: <199611171241.XAA17549@godzilla.zeta.org.au>
To: cvs-all@freefall.freebsd.org, CVS-committers@freefall.freebsd.org, cvs-sys@freefall.freebsd.org, dyson@freefall.freebsd.org
Subject: Re: cvs commit: src/sys/vm vm_kern.c vm_page.c
Sender: owner-cvs-sys@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

>  Modified:    sys/vm    vm_kern.c vm_page.c
>  Log:
>  Improve the locality of reference for variables in vm_page and
>  vm_kern by moving them from .bss to .data.  With this change,
>  there is a measurable perf improvement in fork/exec.

This can be done without cluttering the sources by compiling with
-fno-common.  This causes mixtures of global and static variables to
be laid out in the order of definition.  Also, modules begin on 16-byte
boundaries, so groups of up to 4 32-bit variables can be laid out so
that they are in the same (i486 and Pentium) cache line.

Unfortunately:
1. modules don't begin on 32-byte boundaries on Pentiums, so groups of
   8 32-bit variables can't be guaranteed to be in the same cache line.
2. gcc only aligns variables to 32-bit boundaries, so long long
   alignment is wrong about half the time except for data laid out by
   hand.
3. most sources aren't written with cache lines in mind.
4. -fno-common puts lots of little-used variables (mostly inside large
   arrays and structs?) together with often-used variables.
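The arithmetic behind points 1 and 2 can be checked with a small sketch.
The helper names (fits_in_one_line, is_aligned) and the example addresses
are made up for illustration; only the 16-byte and 32-byte line sizes come
from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cache line sizes discussed above: 16 bytes (i486), 32 bytes (Pentium). */
enum { I486_LINE = 16, PENTIUM_LINE = 32 };

/*
 * Hypothetical helper: return 1 if the `size` bytes starting at address
 * `start` all fall within a single cache line of `line` bytes, else 0.
 * `line` must be a power of two.
 */
int
fits_in_one_line(uintptr_t start, size_t size, size_t line)
{
	assert(size > 0 && line > 0 && (line & (line - 1)) == 0);
	return (start / line == (start + size - 1) / line);
}

/*
 * Hypothetical helper: return 1 if an object at `addr` is aligned to
 * `align` bytes, else 0.
 */
int
is_aligned(uintptr_t addr, size_t align)
{
	assert(align > 0);
	return (addr % align == 0);
}
```

For a module that starts at a 16-byte boundary that is not also a 32-byte
boundary (say 0x1010), fits_in_one_line(0x1010, 16, PENTIUM_LINE) is 1 but
fits_in_one_line(0x1010, 32, PENTIUM_LINE) is 0, which is the case in
point 1.  Likewise a long long placed at a 4-byte-aligned address such as
0x1004 fails is_aligned(0x1004, 8), which is the case in point 2.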

You can certainly do better by laying out individual modules by hand
and depending on little-used bloat being put in the bss.  The
optimization probably has to cross module boundaries.  E.g., `cnt' is
now defined in vm_meter.c where it is little used.  All the vm variables
(including the machine-dependent ones) should probably be allocated
contiguously.

Bruce
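
One way to lay out often-used variables by hand, sketched under
assumptions: the struct name vm_hot and all of its field names are
hypothetical and are not taken from the FreeBSD sources, the 32-byte line
size is the Pentium figure, and the alignment syntax is C11 rather than
anything the 1996 compiler offered.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 32	/* assumed Pentium cache line size, in bytes */

/*
 * Hypothetical group of often-used 32-bit counters.  The alignas on the
 * first member raises the alignment of the whole struct to CACHE_LINE,
 * so all eight members share one line.
 */
struct vm_hot {
	alignas(CACHE_LINE) uint32_t free_count;
	uint32_t active_count;
	uint32_t inactive_count;
	uint32_t cache_count;
	uint32_t wire_count;
	uint32_t free_min;
	uint32_t free_target;
	uint32_t free_reserved;
};

/* Fail the build if the group ever outgrows or misaligns the line. */
_Static_assert(sizeof(struct vm_hot) == CACHE_LINE,
    "vm_hot must be exactly one cache line");
_Static_assert(alignof(struct vm_hot) == CACHE_LINE,
    "vm_hot must be cache-line aligned");

/* One definition, shared across modules via an extern declaration. */
struct vm_hot vm_hot = { 0 };
```

Grouping the variables in one object keeps them contiguous regardless of
which module defines them, at the cost of touching every user of the old
names.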