From owner-freebsd-current@FreeBSD.ORG Wed Apr 28 20:05:13 2004 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from green.homeunix.org (freefall.freebsd.org [216.136.204.21]) by hub.freebsd.org (Postfix) with ESMTP id 37EFA16A4CE; Wed, 28 Apr 2004 20:05:13 -0700 (PDT) Received: from localhost (green@localhost [127.0.0.1]) by green.homeunix.org (8.12.11/8.12.11) with ESMTP id i3T35CW6001661; Wed, 28 Apr 2004 23:05:12 -0400 (EDT) (envelope-from green@green.homeunix.org) Message-Id: <200404290305.i3T35CW6001661@green.homeunix.org> X-Mailer: exmh version 2.6.3 04/04/2003 with nmh-1.0.4 To: Alan Cox In-Reply-To: Message from Alan Cox of "Wed, 28 Apr 2004 21:24:01 CDT." <20040429022401.GJ5199@cs.rice.edu> From: Brian Fundakowski Feldman Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Wed, 28 Apr 2004 23:05:12 -0400 Sender: green@green.homeunix.org cc: alc@FreeBSD.org cc: bms@FreeBSD.org cc: current@FreeBSD.org Subject: Re: VM wiring fixed X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Apr 2004 03:05:13 -0000 Alan Cox wrote: > On Wed, Apr 28, 2004 at 08:08:17PM -0400, Brian Fundakowski Feldman wrote: > > There are several severe wiring bugs in -CURRENT that I believe I have fixed. > > Please test/review as appropriate if you're affected by any of them. This > > is a superset of the previous patch which just mostly-fixed mlockall(2). > > > > * MAP_FUTUREWIRE was not unset in vmspace_dofree(), causing the next process > > to use that vmspace to wire all of its memory. > > Would setting flags to 0 in vmspace_alloc() accomplish the same? Yeah, same deal, except since we're optimizing by only clearing/ reinitializing some fields for the vmspace-zone objects, it makes a little more sense IMO to have it in the vmspace_dofree(). > > * kmem_*() calls either called vm_map_wire() or set MAP_ENTRY_NOFAULT, but > > kmem_free() did not undo the vm_map_wire() calls. > > I don't understand this comment. kmem_free() calls vm_map_remove(), which > calls vm_map_delete(), which calls vm_map_entry_unwire(), etc. > > However, I don't see any change corresponding to this statement in your > patch. I changed kmem_free() to do an unwire since it is the interface for unmapping wired memory, as opposed to kmem_free_wakeup(). > > * vm_fault_unlock() could not unwire pages that were not in the pmap already, > > leaking them permanently. > > By definition a wired mapping is in the pmap. A wired page can, however, > be mapped by a non-wired mapping. Thus, a non-wired mapping of a wired > page could be absent from the pmap. > > Problems such as PR/29915 are a result of vm_fault_unwire() not understanding > fake pages created by the device pager. Pmap_extract() followed by > PHYS_TO_VM_PAGE() does not result in a pointer to the fake page that was > wired. Thus, the panic on unexpected wiring count. > > Your patch to vm_fault_unwire() does look like a correct fix to this issue. I didn't realize the problem was so old! > > * vm_map_{un,}wire() did not keep track of wirings as they should. User > > wirings are separate from system wirings, and there can be exactly one. > > There can be unlimited system wirings, but wired_count will remain at > > zero for map entries that are allocated as MAP_ENTRY_NOFAULT. > > MAP_ENTRY_NOFAULT means that the vm_map_entry has no backing vm_object and > that you should never fault on it, which would create a backing object. > vm_map_{un,}wire(), however, deal with map entries that do have backing > objects. I'm not quite sure why you chose to introduce MAP_ENTRY_NOFAULT > into the implementation of vm_map_{un,}wire(). > > I'll have to look at this more carefully. I've done it the way I explained to preserve layering of vmspace -> map -> entry -> page as best I knew how. The kmem_free() implementation doesn't know if the entries are MAP_ENTRY_NOFAULT or not, so I pushed the intelligence into the wiring routines as to what that means. > > * vslock()/vsunlock() did not both use VM_MAP_WIRE_SYSTEM as they must; > > I believe this resulted in more wiring leaks. > > * vm_map_delete() did not wait for all wirings (except one user wiring) to > > drain, so vslock() guaranteed nothing. > > It is not vm_map_delete()'s responsibility to wait for all wirings to be > drained. It is only responsible for a single wired mapping. How can it be done, then? The vslock() function is called with the expectation that the memory it references is both "wired" and "held". I can't see any other to do it -- there is no refcount on entries. Since vslock() is the only interface for system-wiring arbitrary memory, there will always be a matching call to vsunlock() which causes the system unwire to happen and the waiting vm_map_delete() to wake up. > > * The condition in vm_fault() where all pages have been exhausted is easy to > > deadlock, but was impossible to recover from. The OOM killer works only > > when all memory has been used, not all wired memory. However, now it is > > possible to kill offending processes with SIGKILL instead of the > > vm_fault() in trap_pfault() looping forever. Would you rather see this or an "OOM-wired-killer"? > > * The init(8) program should really be using mlockall(2) so that it can kill > > off processes hogging all the wired memory. However, I have not fixed this > > because in such case, e.g. while I would like for Ctrl+Alt+Del to work, > > init(8) may actually need to allocate and wire new pages itself to keep > > running. I think a way to fix this is to conditionalize the > > vm_page_count_severe() condition on p->p_pid != 1 so that just like the > > REAL system processes, it can bring the page count lower than "severe". > > > > I haven't tested it out on SMP yet, but on UP the latest changes don't seem > > to have any negative effects. All wired memory leaks appear to be gone and > > although init(8) probably can't do it, I can enter DDB and "kill 9 " > > to take the machine out of an all-wired deadlock. See patch at URL: > > > > > > In summary, consider this a positive review for the MAP_FUTUREWIRE and > the vm_fault_unwire() fixes. Thanks! -- Brian Fundakowski Feldman \'[ FreeBSD ]''''''''''\ <> green@FreeBSD.org \ The Power to Serve! \ Opinions expressed are my own. \,,,,,,,,,,,,,,,,,,,,,,\