Date: Mon, 10 Dec 2001 01:18:51 +0100
From: Uwe Doering <gemini@geminix.org>
To: FreeBSD-gnats-submit@freebsd.org
Cc: gemini@geminix.org
Subject: kern/32659: VM and VNODE leak with vm.swap_idle_enabled=1
Message-ID: <E16DE9j-000Neh-00@geminix.geminix.org>

>Number:         32659
>Category:       kern
>Synopsis:       VM and VNODE leak with vm.swap_idle_enabled=1
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Dec 09 16:20:00 PST 2001
>Closed-Date:
>Last-Modified:
>Originator:     Uwe Doering
>Release:        FreeBSD 4.4-STABLE i386
>Organization:
Private UNIX Site
>Environment:
System: FreeBSD srv00.private.geminix.org 4.4-STABLE FreeBSD 4.4-STABLE #23: Sun Dec 9 11:14:31 MET 2001 root@srv00.private.geminix.org:/usr/src/sys/compile/SERVER i386

SMP kernel on ServerWorks chipset

>Description:
In /usr/src/sys/vm/vm_glue.c (around line 482) there are two if()
clauses that deal with VM_SWAP_IDLE, the flag triggered by
"sysctl vm.swap_idle_enabled=1". For this code to actually work the
two if() clauses need to be logical mirrors. However, they aren't.
The case "p->p_slptime == swap_idle_threshold2" isn't covered at all.
Since we have a granularity of one second here it won't take a busy
system too long to pass the code with both variables equal.

What happens then is that no swap-out takes place and the for() loop
just iterates to the next process. However, since we already
incremented "vm->vm_refcnt" by one at this point and don't drop it by
calling vmspace_free() we effectively have a stuck VM object now.
When the process later exits this also prevents the VNODE of the
backing object from being released, so we have a stuck VNODE as well.
In my case it was "/bin/sh".

Originally, I started my investigation because I couldn't "umount"
the respective filesystem without "-f", for no apparent reason. I
found out that the VNODE of "/bin/sh" had a usage count of 12, without
any matching processes running. After some days of searching I
finally traced it back to the code location given above.

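To make the missed case concrete, here is a minimal user-space sketch in C of the control flow described above. It is not the kernel code: the struct layouts, the helper leaked_refs(), the "patched" switch, the threshold value of 10 and the flag values are made up for illustration, and the exact form of the first guard is an assumption reconstructed from the description (it skips a process while p_slptime is below swap_idle_threshold2). Only the second condition is taken from the code shown in the patch.

```c
#include <assert.h>

/* Flag values are placeholders; the kernel defines its own. */
#define VM_SWAP_NORMAL 1
#define VM_SWAP_IDLE   2

/* Hypothetical stand-ins holding only the fields the report mentions. */
struct vmspace { int vm_refcnt; };
struct proc { int p_slptime; struct vmspace *p_vmspace; int swapped; };

static int swap_idle_threshold2 = 10;   /* illustrative value, in seconds */

static void vmspace_free(struct vmspace *vm) { --vm->vm_refcnt; }
static void swapout(struct proc *p) { p->swapped = 1; }

/* One loop iteration for a single process. */
static void scan_one(struct proc *p, int action, int patched)
{
    struct vmspace *vm = p->p_vmspace;

    /* First guard (assumed form): skip processes that slept too briefly. */
    if ((action & VM_SWAP_NORMAL) == 0 &&
        ((action & VM_SWAP_IDLE) == 0 ||
         p->p_slptime < swap_idle_threshold2))
        return;

    ++vm->vm_refcnt;    /* reference taken before the swap-out decision */

    /* Second guard: uses '>', so it is not the mirror of '<' above. */
    if (patched ||
        (action & VM_SWAP_NORMAL) ||
        ((action & VM_SWAP_IDLE) &&
         p->p_slptime > swap_idle_threshold2)) {
        swapout(p);
        vmspace_free(vm);
    }
    /* Unpatched with p_slptime == threshold: the reference is never dropped. */
}

/* Returns how many references one scan leaves behind. */
int leaked_refs(int slptime, int action, int patched)
{
    struct vmspace vm = { 1 };
    struct proc p = { slptime, &vm, 0 };

    scan_one(&p, action, patched);
    return vm.vm_refcnt - 1;
}
```

In this model, leaked_refs(10, VM_SWAP_IDLE, 0) leaves one reference behind, while sleep times of 9 or 11 leave none. Passing patched=1 models swapping out unconditionally once the first guard has been passed, and then the equal case leaves none either.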
>How-To-Repeat:
Set "vm.swap_idle_enabled=1" and run
"make -j4 -DMAKE_KERBEROS4 -DMAKE_KERBEROS5 buildworld" in an endless
loop for some hours. Then try to "umount" the filesystem.

Normally this is a problem since I think you can't "umount" the root
filesystem where "/bin/sh" resides on an active system. So instead
you may want to try it via the VN driver. I used one of these file
based filesystems containing a system set up for jail() and actually
ran the "buildworld" loop inside a jail. Then it shouldn't be a
problem to "umount" the filesystem and see how it balks.

>Fix:
When you know where to look the reason for the problem and its fix
are obvious. I suggest fixing it in a way outlined by the patch
below. With that change the problem went away and I didn't notice
any side effects.

--- vm_glue.c.orig	Wed Nov 21 02:43:57 2001
+++ vm_glue.c	Fri Dec  7 16:42:09 2001
@@ -482,17 +482,14 @@
 			}
 			vm_map_unlock(&vm->vm_map);
 			/*
-			 * If the process has been asleep for awhile and had
-			 * most of its pages taken away already, swap it out.
+			 * All possibilities of sparing the process its
+			 * grim fate have been exhausted above, so bow
+			 * to the inevitable now and swap it out.
 			 */
-			if ((action & VM_SWAP_NORMAL) ||
-				((action & VM_SWAP_IDLE) &&
-				(p->p_slptime > swap_idle_threshold2))) {
-				swapout(p);
-				vmspace_free(vm);
-				didswap++;
-				goto retry;
-			}
+			swapout(p);
+			vmspace_free(vm);
+			didswap++;
+			goto retry;
 		}
 	}
 	/*
>Release-Note:
>Audit-Trail:
>Unformatted:

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message