Date:      Wed, 26 Dec 2007 09:31:17 -0700
From:      Brett Glass <brett@lariat.net>
To:        "Adrian Chadd" <adrian@freebsd.org>
Cc:        stable@freebsd.org
Subject:   Re: SMP on FreeBSD 6.x and 7.0: Worth doing?
Message-ID:  <200712261631.JAA28890@lariat.net>
In-Reply-To: <d763ac660712260732r9a9b86cud13417108e4657e7@mail.gmail.com>
References:  <200712220531.WAA09277@lariat.net> <476FBED0.2080400@samsco.org> <200712241549.IAA19650@lariat.net> <476FDA10.4060107@samsco.org> <200712241653.JAA20845@lariat.net> <476FE868.8080704@samsco.org> <200712241756.KAA21950@lariat.net> <d763ac660712241820s38237d99x1243862095780dc6@mail.gmail.com> <4772529D.9010805@samsco.org> <200712261512.IAA27697@lariat.net> <d763ac660712260732r9a9b86cud13417108e4657e7@mail.gmail.com>

At 08:32 AM 12/26/2007, Adrian Chadd wrote:
 
>The biggest bonus for gaining high throughput with web caches, at least
>with small objects, is to apply temporal locality to them and do IO in
>$LARGE chunks.

By "temporal locality" I assume you mean that items which are fetched
together now are likely to be fetched together again later -- sort of
a "working set" concept for Web pages. Correct?

>You then want to pull tricks with your memory cache so you throw away
>RAM in cluster-sized chunks - the objects grouped by temporal locality
>above - because really, if you throw away a 4k page, your costs of
>performing disk IO to read that 4k versus reading say, 32k or 64k, are
>pretty freaking similar (the same if you happen to be on the same
>track, slightly more if you're straddling tracks.) So you also want to
>pull those tricks. If you have two small objects (<64k) which are 50%
>likely to be fetched together, then by grouping them into one IO
>operation you're effectively slicing the seeks needed in half with
>very little impact. Well, there's an impact - you suddenly start
>pulling lots more data off disk.

And you need more buffer space. The key, I think, is to avoid needing
that buffer space on multiple levels. The file system may prefetch
large chunks and then the Web cache might do so also, doubling the
overhead.

>Could -that- be done without too much trouble? I've looked at
>madvise() to pull these tricks with mmap()'ed backing files but,
>again, I've not hit the point where I'm looking to optimise Squid's
>disk storage. There's just too much catching up to do to varnish's
>memory-only workload performance. Damn you phk. :)

I don't know much about Varnish, but I'd been told that it is not
a replacement for Squid.

In any event, I certainly WOULD like to see a cache that had a true
first-level cache in memory and a second-level cache on disk. The way 
Squid works now, it never keeps copies of objects in RAM once they've 
been evicted -- a major flaw, IMHO. This may account for the performance
disadvantage relative to Varnish. After all, some objects are accessed
by many people at certain times of day, and it pays to "promote" them
to RAM during those periods.

--Brett 



