From owner-freebsd-stable Thu Jul 25 16:08:34 2002
Date: Thu, 25 Jul 2002 16:08:20 -0700 (PDT)
From: Matthew Dillon
Message-Id: <200207252308.g6PN8KrX035468@apollo.backplane.com>
To: Peter Jeremy
Cc: Andreas Koch, freebsd-stable@FreeBSD.ORG
Subject: Re: 4.6-RC: Glacial speed of dump backups
References: <20020606204948.GA4540@ultra4.eis.cs.tu-bs.de>
 <20020722081614.E367@gsmx07.alcatel.com.au>
 <20020722100408.GP26095@ultra4.eis.cs.tu-bs.de>
 <200207221943.g6MJhIBX054785@apollo.backplane.com>
 <20020725164416.A52778@gsmx07.alcatel.com.au>
 <200207251715.g6PHFGDD034256@apollo.backplane.com>
 <20020726073104.R38313@gsmx07.alcatel.com.au>

:> dump is not likely to re-request the block.  Changing the conditional
:> above and setting the BLKFACTOR to 1 in my code will mimic this
:> behavior.
:
:Actually, from memory of the statistics I gathered previously, apart
:from inodes, dump only ever reads a single "block" (offset/size pair)
:once.  The trick is to identify when dump will read both (offset,size1)
:and (offset+size1,size2) and merge it into read(offset,size1+size2)
:(even though the original reads occur at different times and read into
:non-adjacent buffers).  A traditional cache relies on locality of
:reference - and I'm not sure that UFS layout provides this when there
:are lots of small files.

Yes, that is correct.  A careful reading will reveal the strategy: if
the cache uses 32K blocks, for example, and dump reads a full 32K block,
there is really no point in caching it.  If dump reads a 16K block, it
makes more sense to cache the whole 32K block, because dump may come
back for the adjacent 16K; the same goes for 1K reads or any other
sub-block size (there is a rough sketch of this further down).  NetBSD
goes a step further: it keeps track of which smaller pieces dump has
read and expunges the larger block from its cache once all of them have
been read, but I think that is rather silly.

:..
:parents were just sleeping.  It looks like at least the first few
:children are active, which means the system thrashes fairly badly
:unless there's enough RAM to keep 5 or 6 copies of the cache resident.
:
:I've tried repeating the 64M cache on another Proliant with 256MB RAM
:and it ran to completion (though slowly).
:
:This suggests that unless you want to limit dump to using very small
:caches, you need to share the cache between all the children (which
:implies a lot more synchronisation code).
:
:Peter

A 4M cache should be sufficient.
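
For what it's worth, the read-side heuristic described above boils down
to something like the following.  This is a hypothetical sketch, not the
actual patch; the names are made up and a one-entry cache is used only
to keep the example short:

/*
 * Hypothetical sketch of the partial-block caching heuristic -- not the
 * actual dump patch.  Assumes a fixed cache block size, a one-entry
 * cache, and requests that do not cross a cache-block boundary.
 */
#include <sys/types.h>
#include <string.h>
#include <unistd.h>

#define CBLKSIZE	(32 * 1024)	/* cache block size, e.g. 32K */

struct cblock {
	off_t	base;			/* aligned offset, -1 if empty */
	char	data[CBLKSIZE];
};

static struct cblock Cache = { .base = -1 };

/*
 * Read 'len' bytes at 'off' from fd.  Full-block reads bypass the
 * cache; partial-block reads pull in the whole enclosing block so that
 * later reads of the adjacent pieces are satisfied from memory.
 */
ssize_t
cread(int fd, void *buf, size_t len, off_t off)
{
	off_t base = off - (off % CBLKSIZE);

	/*
	 * dump is reading an entire cache block at once: caching it buys
	 * nothing, because dump will not ask for any piece of it again.
	 */
	if (off == base && len >= CBLKSIZE)
		return (pread(fd, buf, len, off));

	/*
	 * Partial read: fill the enclosing block (if not already cached)
	 * and copy the requested piece out of it.  A short read near EOF
	 * is ignored here for brevity.
	 */
	if (Cache.base != base) {
		if (pread(fd, Cache.data, CBLKSIZE, base) < 0)
			return (-1);
		Cache.base = base;
	}
	memcpy(buf, Cache.data + (off - base), len);
	return ((ssize_t)len);
}

A real version would obviously want more than one cache block (with a
hash or LRU lookup) and handling for reads that straddle a block
boundary; the point is only the keep-it/skip-it decision at the top.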
I could also use madvise() to help 'clear' the cache when the children
fork, avoiding some of the swapping mess, but the main point is that the
cache should not need to be all that large to yield a reasonable
benefit.

						-Matt
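
A minimal sketch of that madvise() idea, under the assumption that the
cache lives in a single anonymous mmap() region (hypothetical code, not
taken from dump):

/*
 * Hypothetical sketch: let a forked child tell the VM system it does
 * not need the parent's cached blocks, so the pages can be discarded
 * rather than kept (and potentially swapped) once per child.
 */
#include <sys/types.h>
#include <sys/mman.h>
#include <err.h>
#include <unistd.h>

#define CACHESIZE	(4 * 1024 * 1024)	/* the ~4M cache */

int
main(void)
{
	char *cache;
	pid_t pid;

	cache = mmap(NULL, CACHESIZE, PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_PRIVATE, -1, 0);
	if (cache == MAP_FAILED)
		err(1, "mmap");

	/* ... parent fills and uses the cache ... */

	pid = fork();
	if (pid == -1)
		err(1, "fork");
	if (pid == 0) {
		/*
		 * Child: mark the cache pages as disposable.  MADV_FREE
		 * keeps the mapping valid but lets the kernel drop the
		 * contents; MADV_DONTNEED is an alternative.
		 */
		if (madvise(cache, CACHESIZE, MADV_FREE) == -1)
			warn("madvise");
		/* ... child goes about its work ... */
		_exit(0);
	}
	/* ... parent continues ... */
	return (0);
}

Whether MADV_FREE or MADV_DONTNEED is the better fit, and how it
interacts with pages still shared copy-on-write with the parent, would
need checking against the VM code; the point is only that the children
should not drag five or six copies of the cache through swap.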