From: Tim Kientzle <kientzle@acm.org>
Date: Thu, 30 Jan 2003 16:06:58 -0800
To: Matthew Dillon
Cc: "Brian T. Schellenberger", Sean Hamilton, hackers@FreeBSD.ORG
Subject: Re: Random disk cache expiry

Matthew Dillon wrote:
> Your idea of 'sequential' access cache restriction only
> works if there is just one process doing the accessing.

Not necessarily. I suspect that there is a strong tendency to access particular files in particular ways. E.g., in your example of a download server, those files are always read sequentially.
You can make similar assertions about a lot of files: manpages, gzip files, C source code files, etc., are "always" read sequentially. If a file's access history were stored as a "hint" associated with the file, then it would be possible to make better up-front decisions about how to allocate cache space.

The ideal would be to store such hints on disk (maybe as an extended attribute?), but it might also be useful to cache them in memory somewhere. That would allow the cache-management code to make much earlier decisions about how to handle a file.

For example, if a process started to read a 10GB file that has historically been accessed sequentially, you could immediately decide to enable read-ahead for performance, but also mark those pages to be released as soon as they were read by the process.

FWIW, a web search for "randomized caching" yields some interesting reading. Apparently, there are a few randomized cache-management algorithms for which the mathematics work out reasonably well, despite Terry's protestations to the contrary. ;-) I haven't yet found any papers describing experiences with real implementations, though.

If only I had the time to spend poring over FreeBSD's cache-management code to see how these ideas might actually be implemented...

Tim Kientzle
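
A minimal sketch of the per-file hint idea described above, offered only as an illustration. Every name here (access_hint, cache_policy, hint_record, hint_is_sequential, choose_policy) is hypothetical and does not correspond to anything in FreeBSD's cache-management code. The 90% threshold, the minimum of 8 observed reads, and the "release pages only when the file is larger than the cache" rule are assumptions made for the example, not part of the proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-file hint; could live in memory or in an extended attribute. */
struct access_hint {
    uint32_t seq_reads;   /* reads that started where the previous read ended */
    uint32_t total_reads; /* all reads observed so far */
    uint64_t next_offset; /* offset just past the most recent read */
};

/* Hypothetical decision handed to the cache-management code. */
struct cache_policy {
    bool read_ahead;         /* prefetch upcoming blocks */
    bool release_after_read; /* drop pages once the reader has consumed them */
};

/* Record one read of len bytes starting at offset. */
void hint_record(struct access_hint *h, uint64_t offset, uint64_t len)
{
    assert(h != NULL);
    if (offset == h->next_offset)
        h->seq_reads++;
    h->total_reads++;
    h->next_offset = offset + len;
}

/* Assumed rule: at least 8 reads seen, and at least 90% of them sequential. */
bool hint_is_sequential(const struct access_hint *h)
{
    return h->total_reads >= 8 &&
        (uint64_t)h->seq_reads * 10 >= (uint64_t)h->total_reads * 9;
}

/* Make the up-front decision when a process opens or starts reading the file. */
struct cache_policy choose_policy(const struct access_hint *h,
    uint64_t file_size, uint64_t cache_size)
{
    struct cache_policy p = { false, false };

    if (hint_is_sequential(h)) {
        p.read_ahead = true;
        /* Assumption: only release eagerly if the file cannot fit in cache. */
        p.release_after_read = file_size > cache_size;
    }
    return p;
}
```

With a history of back-to-back 4096-byte reads, choose_policy() on a 10GB file and a 1GB cache would enable both read-ahead and release-after-read; with a scattered history it would enable neither and leave the file to the default policy.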