To: current@freebsd.org
From: Poul-Henning Kamp
Date: Mon, 26 Jan 2004 17:53:21 +0100
Message-ID: <67153.1075136001@critter.freebsd.dk>
Subject: Hints for precision benchmarking...
List-Id: Discussions about the use of FreeBSD-current

A number of people have started to benchmark things seriously now,
and have run into the usual problem of noisy data preventing any
conclusions. Rather than repeat myself many times, I decided to
send this email.

I experimented with micro-benchmarking some years back; here are
some bullet points covering a lot of what I found out. You will not
be able to use them all every single time, but the more you use,
the better your ability to test small differences will be.

* Disable APM and any other kind of clock fiddling (ACPI ?)

* Run in single user mode. Cron(8) and other daemons only add
  noise.

* If syslog events are generated, run syslogd with an empty
  syslogd.conf; otherwise, do not run it.

* Minimize disk I/O; avoid it entirely if you can.

* Don't mount filesystems you do not need.

* Mount / and /usr and any other filesystem possible as read-only.
  This removes atime updates to disk (etc.) from your I/O picture.

* Newfs your R/W test filesystem and populate it from a tar or dump
  file before every run. Unmount and mount it before starting the
  test. This results in a consistent filesystem layout. For a
  worldstone test this would apply to /usr/obj (just newfs and
  mount). If you want 100% reproducibility, populate your
  filesystem from a dd(1) file
  (e.g.: dd if=myimage of=/dev/ad0s1h bs=1m)

* Use malloc backed or preloaded MD(4) partitions.

* Reboot between individual iterations of your test; this gives a
  more consistent state.

* Remove all non-essential device drivers from the kernel. For
  instance, if you don't need USB for the test, don't put USB in
  the kernel. Drivers which attach often have timeouts ticking
  away.

* Unconfigure hardware you don't use. Detach disks with atacontrol
  and camcontrol if you do not use them for the test.

* Do not configure the network unless you are testing it (or until
  after your test, to ship the results off to another computer.)

* Do not run NTPD.

* Put each filesystem on its own disk. This minimizes jitter from
  head-seek optimizations.

* Minimize output to serial or VGA consoles. Running output into
  files gives less jitter. (Serial consoles easily become a
  bottleneck). Do not touch the keyboard while the test is running;
  even that shows up in your numbers.

* Make sure your test is long enough, but not too long. If your
  test is too short, timestamping is a problem. If it is too long,
  temperature changes and drift will affect the frequency of the
  quartz crystals in your computer. Rule of thumb: more than a
  minute, less than an hour.

* Try to keep the temperature as stable as possible around the
  machine. This affects both quartz crystals and disk drive
  algorithms. If you really want to get nasty, consider stabilized
  clock injection. (Get an OCXO + PLL, inject the output into the
  clock circuits instead of the motherboard xtal. Send me an
  email).

* Run at least 3 iterations, and preferably more than 20, of both
  the "before" and the "after" code. Try to interleave if possible
  (i.e.: do not run 20 x before then 20 x after); this makes it
  possible to spot environmental effects. Do not interleave 1:1,
  but 3:3; this makes it possible to spot interaction effects.

  My preferred pattern:

      bababa{bbbaaa}*

  This gives a hint after the first 1+1 runs (so you can stop it if
  it goes entirely the wrong way), a stddev after the first 3+3
  (which gives a good indication of whether it is going to be worth
  a long run), and trending and interaction numbers later on.

* Use /usr/src/tools/tools/ministat to see if your numbers are
  significant. Consider buying "Cartoon guide to statistics",
  ISBN: 0062731025, highly recommended if you've forgotten or
  never learned about stddev and Student's T.

Enjoy, and please share any other tricks you might develop!

Poul-Henning

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
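
The interleaving pattern and the significance check described in the
last two bullet points can be sketched in a few lines of Python. This
is only an illustration under stated assumptions, not the ministat
implementation: the function names run_order and student_t are
hypothetical, 'b' and 'a' are assumed to stand for one "before" and
one "after" run, and the statistic shown is the textbook
pooled-variance Student's t, which you would still have to compare
against a t table for the returned degrees of freedom.

```python
import math
from statistics import mean, stdev


def run_order(extra_blocks):
    """Return the run order 'bababa' followed by extra_blocks x 'bbbaaa'.

    Hypothetical helper: 'b' is assumed to mean one run of the
    "before" code and 'a' one run of the "after" code.
    """
    return "bababa" + "bbbaaa" * extra_blocks


def student_t(before, after):
    """Return (t, degrees_of_freedom) for two samples of timings.

    Textbook pooled-variance Student's t statistic; each sample
    needs at least two measurements so that stdev() is defined.
    """
    nb, na = len(before), len(after)
    df = nb + na - 2
    pooled_var = ((nb - 1) * stdev(before) ** 2
                  + (na - 1) * stdev(after) ** 2) / df
    t = (mean(after) - mean(before)) / math.sqrt(pooled_var * (1 / nb + 1 / na))
    return t, df


if __name__ == "__main__":
    # 3+3 alternating runs, then two 3:3 blocks: 9 "before", 9 "after".
    print(run_order(2))
    # Made-up timings in seconds, purely to show the call shape.
    print(student_t([61.2, 60.9, 61.5], [60.1, 60.4, 59.8]))
```

With a schedule like this, the first six runs already give three
measurements per side, which is the minimum for a stddev; the later
3:3 blocks are where trends and interaction effects become visible.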