Date: Mon, 10 Dec 2001 01:18:51 +0100
From: Uwe Doering <gemini@geminix.org>
To: FreeBSD-gnats-submit@freebsd.org
Cc: gemini@geminix.org
Subject: kern/32659: VM and VNODE leak with vm.swap_idle_enabled=1
Message-ID: <E16DE9j-000Neh-00@geminix.geminix.org>
>Number: 32659
>Category: kern
>Synopsis: VM and VNODE leak with vm.swap_idle_enabled=1
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sun Dec 09 16:20:00 PST 2001
>Closed-Date:
>Last-Modified:
>Originator: Uwe Doering
>Release: FreeBSD 4.4-STABLE i386
>Organization:
Private UNIX Site
>Environment:
System: FreeBSD srv00.private.geminix.org 4.4-STABLE FreeBSD 4.4-STABLE #23: Sun Dec 9 11:14:31 MET 2001 root@srv00.private.geminix.org:/usr/src/sys/compile/SERVER i386
SMP kernel on ServerWorks chipset
>Description:
In /usr/src/sys/vm/vm_glue.c (around line 482) there are two
if() clauses that deal with VM_SWAP_IDLE, the flag set by
"sysctl vm.swap_idle_enabled=1".
For this code to actually work, the two if() clauses need to be logical
mirrors of each other. However, they aren't: the case
"p->p_slptime == swap_idle_threshold2" isn't covered at all. Since the
granularity here is one second, it won't take a busy system long to
reach this code with both values equal.
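
To make the gap concrete, here is a minimal user-space sketch of the two
tests. The second test follows the unpatched code in the diff under >Fix:.
The first test is not quoted in this report, so its shape here is an
assumption (it skips the process unless it qualifies for swap-out). The
function names and the flag values are hypothetical and exist only for
illustration.

```c
#include <stdbool.h>

/* Illustrative flag values only; the real ones come from the kernel headers. */
#define VM_SWAP_NORMAL 1
#define VM_SWAP_IDLE   2

/*
 * ASSUMED shape of the first test: skip the process, before any
 * reference is taken, unless it qualifies for swap-out.
 */
static bool
skipped_by_first_test(int action, int slptime, int threshold2)
{
	return ((action & VM_SWAP_NORMAL) == 0 &&
	    ((action & VM_SWAP_IDLE) == 0 || slptime < threshold2));
}

/* Second test as in the unpatched code: strictly greater than. */
static bool
swapped_by_second_test(int action, int slptime, int threshold2)
{
	return ((action & VM_SWAP_NORMAL) != 0 ||
	    ((action & VM_SWAP_IDLE) != 0 && slptime > threshold2));
}

/* True when a process gets past the first test but is not swapped out. */
static bool
falls_through(int action, int slptime, int threshold2)
{
	return (!skipped_by_first_test(action, slptime, threshold2) &&
	    !swapped_by_second_test(action, slptime, threshold2));
}
```

Under these assumptions, falls_through(VM_SWAP_IDLE, t, t) is true for any
threshold t, while sleep times just below or just above t behave as
intended.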
What happens then is that no swap-out takes place and the for() loop
just iterates to the next process. However, "vm->vm_refcnt" has already
been incremented by one at this point, and since the reference is not
dropped again by calling vmspace_free(), we effectively have a stuck
VM object now. When the process later exits, this also prevents the
VNODE of the backing object from being released, so we have a stuck
VNODE as well. In my case it was "/bin/sh".
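
The effect on the reference count can be modelled in a few lines of C.
This is only a sketch: "struct vmspace_model" and the helper functions are
hypothetical stand-ins for the kernel structures, the first test is again
assumed to skip only processes with p_slptime below the threshold, and the
threshold value of 10 is arbitrary.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct vmspace; only the counter is modelled. */
struct vmspace_model {
	int vm_refcnt;
};

/* Drop one reference; returns true when the last one is gone. */
static bool
vmspace_free_model(struct vmspace_model *vm)
{
	return (--vm->vm_refcnt == 0);
}

/*
 * One pass of the scan over a single idle process (VM_SWAP_IDLE only).
 * With patched == false the second test uses '>', as in the old code.
 */
static void
scan_once(struct vmspace_model *vm, int slptime, int threshold2, bool patched)
{
	if (slptime < threshold2)	/* assumed first test: skip early */
		return;
	++vm->vm_refcnt;		/* reference taken before second test */
	if (patched || slptime > threshold2)
		(void)vmspace_free_model(vm);	/* swap-out path drops it */
	/* otherwise the loop moves on and the reference stays behind */
}

/* Count left after 'passes' scans that all see slptime == threshold2. */
static int
refs_after(int passes, bool patched)
{
	struct vmspace_model vm = { 1 };	/* the process's own reference */

	for (int i = 0; i < passes; i++)
		scan_once(&vm, 10, 10, patched);
	return (vm.vm_refcnt);
}
```

In this model every unlucky pass adds one reference that is never given
back, so the single release at process exit can no longer bring the count
down to zero. With the patched logic the count returns to its starting
value after each pass.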
Originally, I started my investigation because I couldn't "umount"
the respective filesystem without "-f", for no apparent reason.
I found out that the VNODE of "/bin/sh" had a usage count of 12,
without any matching processes running. After some days of searching
I finally traced it back to the code location given above.
>How-To-Repeat:
Set "vm.swap_idle_enabled=1" and run
"make -j4 -DMAKE_KERBEROS4 -DMAKE_KERBEROS5 buildworld" in
an endless loop for some hours. Then try to "umount" the filesystem.
Normally this is a problem, since I think you can't "umount" the root
filesystem, where "/bin/sh" resides, on an active system. So instead
you may want to try it via the VN driver. I used one of these
file-based filesystems, containing a system set up for jail(), and
actually ran the "buildworld" loop inside a jail. Then it shouldn't be
a problem to "umount" the filesystem and see how it balks.
>Fix:
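Once you know where to look, the reason for the problem and its fix
are obvious. I suggest fixing it in the way outlined by the patch
below: the second if() clause is dropped, so that every process that
gets past the first clause is swapped out and its reference released.
With that change the problem went away and I didn't notice any side
effects.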
--- vm_glue.c.orig Wed Nov 21 02:43:57 2001
+++ vm_glue.c Fri Dec 7 16:42:09 2001
@@ -482,17 +482,14 @@
 			}
 			vm_map_unlock(&vm->vm_map);
 			/*
-			 * If the process has been asleep for awhile and had
-			 * most of its pages taken away already, swap it out.
+			 * All possibilities of sparing the process its
+			 * grim fate have been exhausted above, so bow
+			 * to the inevitable now and swap it out.
 			 */
-			if ((action & VM_SWAP_NORMAL) ||
-			    ((action & VM_SWAP_IDLE) &&
-			    (p->p_slptime > swap_idle_threshold2))) {
-				swapout(p);
-				vmspace_free(vm);
-				didswap++;
-				goto retry;
-			}
+			swapout(p);
+			vmspace_free(vm);
+			didswap++;
+			goto retry;
 		}
 	}
 	/*
>Release-Note:
>Audit-Trail:
>Unformatted:
