From: Brooks Davis
To: Erik Cederstrand
Cc: freebsd-performance@freebsd.org, Brooks Davis, kris@freebsd.org
Date: Fri, 8 Feb 2008 09:17:56 -0600
Subject: Re: Performance Tracker project update
Message-ID: <20080208151756.GA35423@lor.one-eyed-alien.net>
In-Reply-To: <47AC15A5.5020009@cederstrand.dk>

On Fri, Feb 08, 2008 at 09:41:09AM +0100, Erik Cederstrand wrote:
> Brooks Davis wrote:
>> On Wed, Jan 30, 2008 at 07:20:23PM +0100, Erik Cederstrand wrote:
>>>
>>> I'd like a situation where I can very quickly set up a slave with a
>>> specific version of FreeBSD to run additional tests or provide shell
>>> access to a developer. This currently involves adding an entry to a
>>> queue, rebooting and waiting 2 minutes. Quick and easy, but the archiving
>>> strategy is obviously very inefficient.
>>>
>>> I'm thinking of a couple of options:
>>> 1. Having one full install per month and archiving the rest as diffs
>>>    against that by recursively bsdiff'ing every file in the tree (I
>>>    could bsdiff a whole tarball, but bsdiff is very memory-intensive).
>>>    Quick test: 25 mins.
>>> 2. Make a hash of all files and only store the binaries where the hash
>>>    is different from the monthly tarball. Faster than 1., but less
>>>    effective. Quick test: 5 mins.
>>> 3. Use some kind of VCS. My experience with Subversion and binary files
>>>    is that it's very slow.
>>> 4. Throw hardware at the problem.
>>>
>>> I'd say it should not take more than 10 mins to recreate an archived
>>> version. Any thoughts?
>> It seems like you should be able to combine 1 and 2 with checksums to
>> decide if you need to run diffs. I'd think that would be quite fast.
>
> I finally got around to testing this, and with a combination of mtree
> comparing md5 hashes, bsdiff compacting changed files and hardlinking
> unchanged files I get a reduction in size from 256MB to 10MB. Pretty good,
> and the whole operation only takes a few minutes.

Cool!

> I have one peculiarity, though.
> I install python2.5 into the directory
> containing the build, and even though the python version has not changed, I
> still get mismatching md5 sums on every .pyo and .pyc file. Any thoughts on
> this?

I'm not a python guru by any means, but I think .pyc files probably have data
about the .py they are generated from because there's some sort of
auto-generation available. It may be possible to not store them at all and
just generate them before you use them or add some magic build flags to cause
them to store some sort of cooked values. I'm not sure where the .pyo files
come from.

-- Brooks
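
The guess about .pyc files can be checked directly. The following is a minimal sketch, assuming a modern CPython (3.7 or later) rather than the python2.5 from the thread: the default .pyc header there records the modification time of the source file, so reinstalling unchanged sources with fresh timestamps is enough to change the md5 of every compiled file. The helper names (compile_with_mtime, header_mtime, demo) and the file names are made up for illustration, and the 16-byte header layout is the CPython 3.7+ one; older interpreters use a shorter header.

```python
import hashlib
import os
import py_compile
import struct
import tempfile


def compile_with_mtime(src, cfile, mtime):
    """Set the source file's mtime, byte-compile it, and return the .pyc bytes."""
    os.utime(src, (mtime, mtime))
    py_compile.compile(
        src,
        cfile=cfile,
        doraise=True,
        # Force the timestamp-based header even if SOURCE_DATE_EPOCH is set.
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )
    with open(cfile, "rb") as f:
        return f.read()


def header_mtime(pyc_bytes):
    """Read the source mtime stored in a CPython 3.7+ timestamp-style header."""
    # Assumed layout: 4-byte magic, 4-byte flags, 4-byte mtime, 4-byte size.
    return struct.unpack("<I", pyc_bytes[8:12])[0]


def demo():
    """Compile one unchanged source twice, with two different mtimes."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "mod.py")  # hypothetical module
        with open(src, "w") as f:
            f.write("x = 1\n")

        first = compile_with_mtime(src, os.path.join(tmp, "a.pyc"), 1_200_000_000)
        second = compile_with_mtime(src, os.path.join(tmp, "b.pyc"), 1_200_000_060)

    return {
        "mtimes": (header_mtime(first), header_mtime(second)),
        "md5_equal": hashlib.md5(first).digest() == hashlib.md5(second).digest(),
    }


if __name__ == "__main__":
    print(demo())
```

On these newer interpreters, py_compile.PycInvalidationMode.CHECKED_HASH or UNCHECKED_HASH stores a hash of the source instead of its mtime, which is one way to get the "cooked values" mentioned above. Whether anything comparable can be arranged for python2.5 is outside what this sketch covers.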