From owner-freebsd-hackers Thu Jan 30 17:29:03 2003
Date: Thu, 30 Jan 2003 17:28:59 -0800 (PST)
From: Matthew Dillon
Message-Id: <200301310128.h0V1Sxo5091268@apollo.backplane.com>
To: "Brian T. Schellenberger"
Cc: kientzle@acm.org, Sean Hamilton , hackers@FreeBSD.ORG
Subject: Re: Random disk cache expiry
References: <000501c2c4dd$f43ed450$16e306cf@slugabed.org> <200301302222.h0UMMfFI090349@apollo.backplane.com> <3E39BE22.8050207@acm.org> <200301301937.12407.bschellenberger@nc.rr.com>
Sender: owner-freebsd-hackers@FreeBSD.ORG

:I think you missed Matt's point, which is well-taken:
:
:Even if everybody accesses it sequentially, if you have 100 processes
:accessing it sequentially at the *same* time, then it would be to your
:benefit to leave the "old" pages around because even though *this*
:process won't access it again, the *next* process very well might, if
:it just happens to be reading it sequentially as well but is a little
:further behind on its sequential read.
    Right, and if the same 100 processes are accessing N files
    sequentially instead of just one, you get a different effect... you
    might blow out your cache if the aggregate size of the N files is too
    large.

    But then if some of those processes are accessing the same file, and
    other processes are accessing different files, you might want to cache
    that file but possibly not the others, even though all the files are
    (for this example) the same size.

    But then what if some of the processes accessing some of those other
    files were from slow clients?  You could get away with not caching
    those files, and then you might be able to cache all the remaining
    files (being accessed by faster clients).  And so on, and so forth.

    It gets even more complicated when you throw in read versus write
    versus read-modify-write accesses, and more complicated still when you
    add other load factors (e.g. the sheer number of connections might
    reduce the memory available for caching on the fly, or you might try
    to balance executable pages against data pages to optimize paging and
    program performance).

    So there is no 'perfect' caching algorithm.  There are simply too
    many variables, even in a well-defined environment, for even the best
    system heuristics to cover optimally.

						-Matt
						Matthew Dillon

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
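[The two effects discussed above — staggered sequential readers of one file benefiting from "old" pages, versus many files whose aggregate size blows out the cache — can be illustrated with a toy simulation. This is a hypothetical sketch, not code from the thread; it models the page cache as a simple LRU and the page counts and cache size are made-up numbers.]

```python
from collections import OrderedDict

class LRUCache:
    """Minimal page cache with least-recently-used eviction."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()
        self.hits = self.misses = 0

    def access(self, key):
        if key in self.pages:
            self.pages.move_to_end(key)    # mark page most-recently-used
            self.hits += 1
        else:
            self.misses += 1
            self.pages[key] = True
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)  # evict the LRU page

def hit_rate(cache, accesses):
    for key in accesses:
        cache.access(key)
    return cache.hits / (cache.hits + cache.misses)

# Case 1: 100 readers scan the SAME 1000-page file, interleaved so each
# trails the one ahead of it.  Keeping "old" pages around pays off: only
# the lead reader ever misses, so even a tiny cache catches the reuse.
shared = [(0, p) for p in range(1000) for r in range(100)]
print(hit_rate(LRUCache(16), shared))    # → 0.99

# Case 2: the same 100 readers each scan their OWN 1000-page file, twice.
# The aggregate working set (100,000 pages) dwarfs the cache, so even the
# second pass finds nothing resident: the cache is blown out.
distinct = [(r, p) for _ in range(2) for r in range(100) for p in range(1000)]
print(hit_rate(LRUCache(16), distinct))  # → 0.0
```

The same cache policy yields a 99% hit rate in one workload and 0% in the other, which is the crux of the argument: no single heuristic can tell these apart without knowing what the processes will do next.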