From: "Sean Hamilton" <sh@bel.bc.ca>
Subject: Re: Random disk cache expiry
Date: Sun, 26 Jan 2003 20:55:39 -0800

----- Original Message -----
From: "Tim Kientzle"

| Cycling through large data sets is not really that uncommon.
| I do something like the following pretty regularly:
|     find /usr/src -type f | xargs grep function_name
|
| Even scanning through a large dataset once can really hurt
| competing applications on the same machine by flushing
| their data from the cache for no gain. I think this
| is where randomized expiration might really win, by reducing the
| penalty for disk-cache-friendly applications who are competing
| with disk-cache-unfriendly applications.

In my case I have a webserver serving up a few dozen files of about 10 MB
each. While it is true that I could buy more memory, or more drives and
stripe them, I am more interested in the fact that this server is constantly
grinding away because the workload has found a weakness in the caching
algorithm.

After further thought, I propose something much simpler: when the kernel is
hinted that access will be sequential, it should stop caching new blocks once
little cache space remains, rather than throwing away old blocks -- or at
least be much more hesitant to throw old blocks away. (A rough userland
sketch of the idea follows at the end of this message.)

Consider that in almost all cases where access is sequential, the chance of
the read being aborted grows as reading continues: users downloading files,
directory tree traversals, and so on. Since the likelihood of the first byte
being read is very high, the next somewhat lower, the next lower still, and
so on, it seems to make sense to tune the caching algorithm to accommodate
this.

While we are discussing disks, I have a minor complaint: at least on IDE
systems, when doing something like an untar, the entire system is painfully
unresponsive, even though CPU load is low. I presume this is because any
executable that is run has to sit and wait for the disk. Wouldn't it make
sense to give reads on behalf of executables very high disk priority? Isn't
that worth the extra seeks?
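For what it is worth, here is a rough userland approximation of the
behaviour I am asking for, using the existing mmap()/madvise() interfaces
rather than a kernel change: map the file, tell the VM system the access
will be sequential, and hint that each chunk can be reclaimed once it has
been sent, so it does not crowd everything else out of the cache. The path,
chunk size, and the send step are placeholders; this is only a sketch of
the idea, not a proper fix.

#include <sys/param.h>	/* MIN() */
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <err.h>
#include <fcntl.h>
#include <unistd.h>

#define CHUNK	(1024 * 1024)	/* release pages 1 MB at a time */

int
main(void)
{
	const char *path = "/www/files/big.bin";	/* placeholder */
	struct stat st;
	char *base;
	off_t off;
	int fd;

	if ((fd = open(path, O_RDONLY)) == -1)
		err(1, "open");
	if (fstat(fd, &st) == -1)
		err(1, "fstat");

	base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	if (base == MAP_FAILED)
		err(1, "mmap");

	/* Tell the VM system we will walk this mapping front to back. */
	if (madvise(base, (size_t)st.st_size, MADV_SEQUENTIAL) == -1)
		warn("madvise");

	for (off = 0; off < st.st_size; off += CHUNK) {
		size_t n = MIN(CHUNK, (size_t)(st.st_size - off));

		/* ... send base + off, n bytes, to the client here ... */

		/*
		 * Pages we have already sent are the least valuable thing
		 * in memory; hint that they can be reclaimed so one big
		 * sequential reader does not evict everyone else's data.
		 */
		(void)madvise(base + off, n, MADV_DONTNEED);
	}

	munmap(base, (size_t)st.st_size);
	close(fd);
	return (0);
}

MADV_DONTNEED is only a hint, and it is not exactly "stop caching when space
is tight", but it at least keeps one big sequential reader from pushing
everything else out, which is the effect I am after.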
sh