From owner-freebsd-hackers Thu Jan 30 17:29:03 2003
Date: Thu, 30 Jan 2003 17:28:59 -0800 (PST)
From: Matthew Dillon
Message-Id: <200301310128.h0V1Sxo5091268@apollo.backplane.com>
To: "Brian T. Schellenberger"
Cc: kientzle@acm.org, Sean Hamilton , hackers@FreeBSD.ORG
Subject: Re: Random disk cache expiry
References: <000501c2c4dd$f43ed450$16e306cf@slugabed.org> <200301302222.h0UMMfFI090349@apollo.backplane.com> <3E39BE22.8050207@acm.org> <200301301937.12407.bschellenberger@nc.rr.com>
Sender: owner-freebsd-hackers@FreeBSD.ORG

:I think you missed Matt's point, which is well-taken:
:
:Even if everybody accesses it sequentially, if you have 100 processes
:accessing it sequentially at the *same* time, then it would be to your
:benefit to leave the "old" pages around because even though *this*
:process won't access it again, the *next* process very well might, if
:it just happens to be reading it sequentially as well but is a little
:further behind on its sequential read.
    Right, and if the same 100 processes are accessing N files
    sequentially instead of just one, you get a different effect... you
    might blow out your cache if the aggregate size of the N files is too
    large.

    But then if some of those processes are accessing the same file, and
    other processes are accessing different files, you might want to cache
    that file but possibly not the others, even though all the files are
    (for this example) the same size.

    But then what if some of the processes accessing some of those other
    files were from slow clients?  You could get away with not caching
    those files, and then you might be able to cache all the remaining
    files (being accessed by faster clients).  And so on, and so forth.

    It gets even more complicated when you throw in read versus write
    versus read-modify-write accesses, and more complicated still when you
    add other load factors (e.g. the sheer number of connections might
    reduce the memory available for caching on the fly, or you might try
    to balance executable pages against data pages to optimize paging and
    program performance).

    So there is no 'perfect' caching algorithm.  There are simply too
    many variables, even in a well-defined environment, for even the best
    system heuristics to cover optimally.

						-Matt
						Matthew Dillon

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
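[The two effects discussed above — staggered sequential readers of one file benefiting from "old" pages, versus many files whose aggregate size blows out the cache — can be illustrated with a toy simulation. This is a hypothetical sketch, not code from the thread; it models the page cache as a simple LRU and the page counts and cache size are made-up numbers.]

```python
from collections import OrderedDict

class LRUCache:
    """Minimal page cache with least-recently-used eviction."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()
        self.hits = self.misses = 0

    def access(self, key):
        if key in self.pages:
            self.pages.move_to_end(key)    # mark page most-recently-used
            self.hits += 1
        else:
            self.misses += 1
            self.pages[key] = True
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)  # evict the LRU page

def hit_rate(cache, accesses):
    for key in accesses:
        cache.access(key)
    return cache.hits / (cache.hits + cache.misses)

# Case 1: 100 readers scan the SAME 1000-page file, interleaved so each
# trails the one ahead of it.  Keeping "old" pages around pays off: only
# the lead reader ever misses, so even a tiny cache catches the reuse.
shared = [(0, p) for p in range(1000) for r in range(100)]
print(hit_rate(LRUCache(16), shared))    # → 0.99

# Case 2: the same 100 readers each scan their OWN 1000-page file, twice.
# The aggregate working set (100,000 pages) dwarfs the cache, so even the
# second pass finds nothing resident: the cache is blown out.
distinct = [(r, p) for _ in range(2) for r in range(100) for p in range(1000)]
print(hit_rate(LRUCache(16), distinct))  # → 0.0
```

The same cache policy yields a 99% hit rate in one workload and 0% in the other, which is the crux of the argument: no single heuristic can tell these apart without knowing what the processes will do next.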