From owner-freebsd-hackers Thu Sep 23 12:37:46 1999
Delivered-To: freebsd-hackers@freebsd.org
Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2])
	by hub.freebsd.org (Postfix) with ESMTP id 8031315058
	for ; Thu, 23 Sep 1999 12:37:34 -0700 (PDT)
	(envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.9.3/8.9.1) id MAA29808;
	Thu, 23 Sep 1999 12:37:27 -0700 (PDT)
	(envelope-from dillon)
Date: Thu, 23 Sep 1999 12:37:27 -0700 (PDT)
From: Matthew Dillon
Message-Id: <199909231937.MAA29808@apollo.backplane.com>
To: "Kenneth D. Merry"
Cc: cmsedore@mailbox.syr.edu (Christopher Sedore), freebsd-hackers@FreeBSD.ORG
Subject: Re: mbufs, external storage, and MFREE
References: <199909231924.NAA06294@panzer.kdm.org>
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

:
:How about marking the page copy-on-write?  That way, if the user modifies
:the page while it is being transmitted, it'll just be copied, so the
:original data will be intact.
:
:Ken

If it were a normal page we could, but the VM system currently cannot
handle pages associated with vnodes themselves being marked
copy-on-write.  This is kinda hard to explain, but I will try.

When a process maps a file MAP_PRIVATE, the VM object held by the
process is not actually a vnode.  Instead it is holding what is called
a default object.  The default object shadows the VM object
representing the vnode.  When a fault occurs, vm_fault knows to
copy-on-write the page from the read-only backing VM object to the
front VM object, and so from the point of view of the process, the page
is copy-on-write.  From the system's point of view, a new page has been
added to the default VM object and no changes have been made to the
vnode's VM object.

When a process maps a file MAP_PRIVATE or MAP_SHARED and doesn't touch
any of the pages, and some other process goes in and write()'s to the
file via a descriptor, the process's view of the file will change
because the pages associated with the underlying vnode have changed.

The problem that occurs when we try to optimize read() by mapping a
vnode's page into a user address space is that some other process may
go and modify the underlying file, changing the data that the user
process sees *after* the read() has returned.  But the user process is
expecting that data not to change, because it thinks it has read() it
into a private buffer when, in fact, the OS optimized the read by
replacing the private memory with the file map.

I.e. our problem is not so much the user process making a change to its
buffer -- that case is handled by copy-on-write -- but that of another
process writing directly to the vnode, causing the data the first
process read() to appear to change in its buffer.

-Matt
Matthew Dillon

:--
:Kenneth Merry
:ken@kdm.org
:

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message