From: "John S. Dyson"
Message-Id: <199605150415.XAA19042@dyson.iquest.net>
Subject: Re: A question for the VM gurus..!
To: jgreco@brasil.moneng.mei.com (Joe Greco)
Date: Tue, 14 May 1996 23:15:29 -0500 (EST)
Cc: dyson@freebsd.org, hackers@freebsd.org
In-Reply-To: <199605150255.VAA11471@brasil.moneng.mei.com> from "Joe Greco" at May 14, 96 09:55:42 pm

>
> Hi John,
>
> > Mincore is supported in current (but only approximately correctly -- it returns
> > which pages are mapped in a process, not actual residency.)  It says in essence,
> > which pages don't have to be faulted...  I have been planning on fixing that.
>
> I think I'm going to have to reveal my stupidity here.  :-(
>
YOU ARE NOT STUPID :-).

>
> I do not understand your statement.  I would assume that "pages [that] are
> mapped in a process... [and] ...don't have to be faulted" would necessarily
> have to be resident.
>
What you say is right, but there are pages that are resident that are NOT
mapped into the process address space.  The VM system doesn't modify the
pte's until the process faults them.  (That is not strictly true, but is
true in the case that I think that you are talking about.)  So what I was
trying to say is that mincore would miss some of those pages that are
really in memory.
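
To make the distinction concrete, here is a minimal sketch of how a process might ask mincore() which pages of a mapping it considers in core. The helper name count_in_core is hypothetical, and the (void *) cast on the result vector is an assumption made for portability, since systems differ on whether that argument is char * or unsigned char *. Given the behaviour described above, a count obtained this way could understate true residency, because a page that is resident but not yet mapped into the process would be reported as absent.

```c
#define _DEFAULT_SOURCE   /* exposes mincore() on glibc; assumed harmless elsewhere */
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: count the pages in [addr, addr + len) that mincore()
 * reports as in core.  addr must be page-aligned (e.g. returned by mmap()).
 * Returns the page count, or -1 on error.
 */
static long count_in_core(void *addr, size_t len)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    assert(pagesz > 0);

    /* mincore() fills in one status byte per page of the range. */
    size_t npages = (len + (size_t)pagesz - 1) / (size_t)pagesz;
    unsigned char *vec = malloc(npages);
    if (vec == NULL)
        return -1;

    if (mincore(addr, len, (void *)vec) != 0) {
        free(vec);
        return -1;
    }

    long in_core = 0;
    for (size_t i = 0; i < npages; i++) {
        if (vec[i] & 1)   /* low bit set: page reported as in core */
            in_core++;
    }

    free(vec);
    return in_core;
}
```

A caller would map the file, then call count_in_core(base, length) before and after touching the data to see how the reported set changes.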

>
> Now given 200 readers and 2,000,000 articles, the likelihood that any
> particular reader will access an article that someone else has recently read
> is fairly low.  This data is ripe for being discarded ASAP, and it would be
> handy to tag these pages as "noncacheable" or "cacheable but VERY
> discardable".  True, currently file accesses are via the open()/close()
> interface, but it is easy to mmap() the articles instead.  Further, I can
> even mark the pages as MADV_SEQUENTIAL (only useful on large articles I
> would think), although I don't know how useful this hint would be to the VM
> system.
>
How's about an ioctl or somesuch as a hint to the filesystem, so that when
a file is closed, its pages (or object) are marked somehow for quick reuse
(freeing?)  You could then keep the read/write code, but an ioctl (or fcntl)
could be issued to change the behavior.  (Note that I still plan to do the
madvise thing though :-)).

> My other "pet project" requires a functional mincore() - the history file on
> a large news server may be 150-200MB, and I would like to create a daemon to
> handle history lookup requests.  The file can be mapped and marked with
> MADV_RANDOM, and when a request comes in requiring a particular bit of data,
> pages that are found to be !mincore() can be marked with MADV_WILLNEED to
> ask the VM system to bring them in, while the code goes on to service other
> requests.  I am trying to allow the process to spin through its connection
> list as rapidly as possible, and with 64MB or 128MB RAM, you can hopefully
> see how this would be very efficient (from an overall viewpoint).
>
MADV_RANDOM would probably be implemented by bringing in only one page at a
time, instead of a cluster.  MADV_WILLNEED is problematical (a bit more
difficult) since we don't currently have a way to asynchronously read pages
in -- but it wouldn't be very hard.  I have been looking into the
possibility of adding kernel threads -- that could help the async VM read
problem.

John
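
The history-daemon scheme discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions: it presumes a system where mincore() reports residency accurately and where madvise() honors MADV_RANDOM and MADV_WILLNEED, neither of which was a given when this was written. The names map_random and probe_or_prefetch are hypothetical, as is the convention of returning 0 to mean "prefetch requested, come back to this request later".

```c
#define _DEFAULT_SOURCE   /* exposes mincore()/madvise() on glibc */
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map a file read-only and hint that access is random.
 * Returns the mapping (length stored in *len_out), or NULL on failure.
 */
static void *map_random(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                       /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return NULL;

    /* Advisory only, so a failure here is deliberately ignored. */
    (void)madvise(p, (size_t)st.st_size, MADV_RANDOM);
    *len_out = (size_t)st.st_size;
    return p;
}

/*
 * Hypothetical helper: check the page holding byte `off` of a page-aligned
 * mapping.  Returns 1 if mincore() reports it in core, 0 if it was not and
 * a MADV_WILLNEED prefetch was requested instead, -1 on error.
 */
static int probe_or_prefetch(char *base, size_t off)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    assert(pagesz > 0);

    char *page = base + (off / (size_t)pagesz) * (size_t)pagesz;
    unsigned char status;

    if (mincore(page, (size_t)pagesz, (void *)&status) != 0)
        return -1;
    if (status & 1)
        return 1;                    /* safe to service the request now */

    if (madvise(page, (size_t)pagesz, MADV_WILLNEED) != 0)
        return -1;
    return 0;                        /* defer; serve other connections first */
}
```

In a request loop, a return of 0 would move the request to a deferred list so the daemon can keep spinning through its other connections and retry later.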