From: Peter Jeremy
To: freebsd-hackers@freebsd.org
Date: Sun, 18 Feb 2007 11:27:58 +1100
Message-ID: <20070218002758.GQ859@turion.vk2pj.dyndns.org>
Subject: Abyssmal dump cache efficiency

I've been looking into the efficiency of the caching in dump(8) and
it's abysmal: After instrumenting
dump(8) and looking at its access patterns, I found that it typically
reads roughly three times as much data into its cache as it should,
whilst using five times as much RAM as requested.

This poor efficiency is related to the way dump(8) works: A master
process scans through the inodes to dump and a number of slave
processes (three by default) actually read the disk blocks and write
the data to tape. Additional processes are spawned for each tape
written (to simplify checkpointing). If dump writes to a single tape,
it will use a total of five processes.

The current cache mechanism has a separate private cache associated
with each process. Thus you typically have five caches (each of the
requested size). The re-reading is partially caused by the
distribution of read requests across the slave processes: Read
requests for adjacent blocks of data are likely to be handled by
different slave processes and therefore by different caches.

I've checked out the behaviour in the various *BSDs with the following
results:
DragonFly   copied FreeBSD
NetBSD      uses a single shared cache
OpenBSD     doesn't support caching.

I've tried modelling a unified cache along NetBSD's lines and there
appears to be a massive improvement in cache performance. It's unclear
how much of an improvement this will give in overall performance, but
not physically reading data from disk must be faster than reading it.

I believe it would be worthwhile creating a todo item to investigate
this more thoroughly.

-- 
Peter Jeremy
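
As a rough illustration of the re-reading described above, here is a
toy model in Python. It is not dump(8) code and not the model mentioned
in the message. It assumes that each cache entry covers a fixed run of
adjacent disk blocks (a "chunk"), that read requests are handed to the
slaves in strict round-robin order, and that each cache evicts its
least recently used entry. The names LRUCache, simulate, chunk_blocks
and capacity are made up for the sketch. Only the slave caches are
modelled, so the memory figure understates the five-process case.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical chunk cache: counts how many chunks it had to read."""

    def __init__(self, capacity):
        self.capacity = capacity      # maximum number of cached chunks
        self.entries = OrderedDict()  # chunk number -> True, in LRU order
        self.misses = 0               # chunks physically read from disk

    def read(self, chunk):
        if chunk in self.entries:
            self.entries.move_to_end(chunk)   # hit: mark as recently used
            return
        self.misses += 1                      # miss: read chunk from disk
        self.entries[chunk] = True
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used


def simulate(requests, n_slaves, capacity, chunk_blocks, shared):
    """Return (chunks read from disk, total cache capacity in chunks).

    Assumption: request i is handled by slave i % n_slaves.
    """
    if shared:
        one = LRUCache(capacity)
        caches = [one] * n_slaves             # every slave uses one cache
    else:
        caches = [LRUCache(capacity) for _ in range(n_slaves)]

    for i, block in enumerate(requests):
        caches[i % n_slaves].read(block // chunk_blocks)

    distinct = {id(c): c for c in caches}.values()
    chunks_read = sum(c.misses for c in distinct)
    memory = len(distinct) * capacity
    return chunks_read, memory


if __name__ == "__main__":
    blocks = range(1024)                      # purely sequential reads
    for shared in (False, True):
        reads, mem = simulate(blocks, n_slaves=3, capacity=16,
                              chunk_blocks=8, shared=shared)
        label = "shared " if shared else "private"
        print(label, "chunks read:", reads, "cache capacity:", mem)
```

With these made-up parameters the private caches read 384 chunks where
a shared cache reads 128, because each of the three slaves pulls every
chunk into its own cache. That factor of three comes from the toy
parameters, not from any measurement of a real dump.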