From: Brett Glass <brett@lariat.net>
To: "Adrian Chadd"
Cc: stable@freebsd.org
Date: Wed, 26 Dec 2007 09:31:17 -0700
Subject: Re: SMP on FreeBSD 6.x and 7.0: Worth doing?
Message-Id: <200712261631.JAA28890@lariat.net>

At 08:32 AM 12/26/2007, Adrian Chadd wrote:

>The biggest bonuses to gain high throughput with web caches, at least
>with small objects, is to apply temporal locality to them and do IO in
>$LARGE chunks.

By "temporal locality" I assume you mean that you expect items that were fetched together once to be fetched together again the next time. Sort of a "working set" concept for Web pages. Correct?
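That "working set" grouping could be sketched roughly like this -- a minimal, hypothetical example (the window size and log format are assumptions, not anything Squid actually does) that clusters URLs fetched close together in time, as candidates for co-location on disk:

```python
def group_by_fetch_window(log, window=1.0):
    """Group URLs whose fetch times fall within `window` seconds of the
    first fetch in the group.  `log` is a list of (timestamp, url) tuples,
    assumed sorted by timestamp.  Objects that land in the same group are
    candidates for being stored in one on-disk cluster."""
    groups = []
    current, start = [], None
    for ts, url in log:
        if start is None:
            start = ts
        if ts - start <= window:
            current.append(url)
        else:
            groups.append(current)          # close the old group
            current, start = [url], ts     # start a new one
    if current:
        groups.append(current)             # don't lose the final group
    return groups

# A page and its embedded objects arrive together; a later page likewise.
log = [(0.0, "/a.html"), (0.1, "/a.css"), (0.2, "/logo.png"),
       (5.0, "/b.html"), (5.3, "/b.css")]
print(group_by_fetch_window(log))
# -> [['/a.html', '/a.css', '/logo.png'], ['/b.html', '/b.css']]
```

A real cache would of course group by observed co-occurrence across many requests, not a single time window, but the idea is the same.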
>You then want to pull tricks with your memory cache so you throw away
>RAM in cluster-sized chunks - the objects grouped by temporal locality
>above - because really, if you throw away a 4k page, your costs of
>performing disk IO to read that 4k versus reading say, 32k or 64k, are
>pretty freaking similar (the same if you happen to be on the same
>track, slightly more if you're straddling tracks.) So you also want to
>pull those tricks. If you have two small objects (<64k) which are 50%
>likely to be fetched together, then by grouping them into one IO
>operation you're effectively slicing the seeks needed in half with
>very little impact. Well, there's an impact - you suddenly start
>pulling lots more data off disk. And you need more buffer space.

The key, I think, is to avoid needing that buffer space on multiple levels. The file system may prefetch large chunks, and then the Web cache might do so as well, doubling the overhead.

>Could -that- be done without too much trouble? I've looked at
>madvise() to pull these tricks with mmap()'ed backing files but,
>again, I've not hit the point where I'm looking to optimise Squid's
>disk storage. There's just too much catching up to do to varnish's
>memory-only workload performance. Damn you phk. :)

I don't know much about Varnish, but I've been told that it is not a replacement for Squid. In any event, I certainly WOULD like to see a cache that had a true first-level cache in memory and a second-level cache on disk. The way Squid works now, it never keeps copies of objects in RAM once they've been evicted -- a major flaw, IMHO. This may account for the performance disadvantage relative to Varnish. After all, some objects are accessed by many people at certain times of day, and it pays to "promote" them to RAM during those periods.

--Brett
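[The madvise()-on-mmap trick Adrian mentions can be sketched as below. This is a minimal illustration, not Squid code: the spool file is a throwaway temp file, the 64 KiB cluster size is an assumption, and it requires a Unix-like OS and Python 3.8+ for mmap.madvise(). MADV_WILLNEED asks the kernel to read the whole cluster ahead instead of faulting in one 4k page at a time.]

```python
import mmap
import os
import tempfile

CLUSTER = 64 * 1024  # hypothetical cluster of temporally-grouped small objects

# Stand-in for a cache spool file: four zero-filled clusters.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"\0" * (4 * CLUSTER))
    path = f.name

fd = os.open(path, os.O_RDONLY)
mm = mmap.mmap(fd, 0, prot=mmap.PROT_READ)

# Hint the kernel to read the whole first cluster ahead of use.
# Guarded because MADV_WILLNEED isn't exposed on every platform.
if hasattr(mm, "madvise") and hasattr(mmap, "MADV_WILLNEED"):
    mm.madvise(mmap.MADV_WILLNEED, 0, CLUSTER)

# Touching any object in the cluster now likely hits pages already in core.
data = mm[0:CLUSTER]

mm.close()
os.close(fd)
os.unlink(path)
```

The point is the one Adrian makes: once the head is positioned, reading 64k costs barely more than reading 4k, so hinting the whole cluster in amortizes the seek across every object grouped into it.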
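[The two-level design described above -- a first-level cache in RAM, a second-level cache on disk, with hot objects promoted back to RAM -- might look like this. A minimal sketch: the class name is made up, a plain dict stands in for the on-disk store, and eviction is simple LRU.]

```python
from collections import OrderedDict

class TwoLevelCache:
    """First-level LRU cache in RAM; objects evicted from RAM are demoted
    to a second-level (disk) tier rather than discarded, and a hit in the
    disk tier promotes the object back into RAM."""

    def __init__(self, ram_slots):
        self.ram = OrderedDict()   # LRU order: oldest first
        self.disk = {}             # stand-in for the on-disk store
        self.ram_slots = ram_slots

    def put(self, key, obj):
        self.ram[key] = obj
        self.ram.move_to_end(key)
        while len(self.ram) > self.ram_slots:
            old_key, old_obj = self.ram.popitem(last=False)
            self.disk[old_key] = old_obj   # demote, don't drop

    def get(self, key):
        if key in self.ram:
            self.ram.move_to_end(key)      # refresh LRU position
            return self.ram[key]
        if key in self.disk:
            obj = self.disk.pop(key)
            self.put(key, obj)             # promote hot object back to RAM
            return obj
        return None                        # miss: caller fetches from origin

cache = TwoLevelCache(ram_slots=2)
cache.put("a", b"A")
cache.put("b", b"B")
cache.put("c", b"C")        # RAM full: "a" is demoted to disk, not lost
assert "a" in cache.disk
cache.get("a")              # disk hit promotes "a" back into RAM
assert "a" in cache.ram
```

This is exactly the behavior Squid (as described above) lacks: an evicted object stays reachable on disk, and renewed demand at busy times of day pulls it back into memory.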