Date: Sat, 26 Dec 1998 16:07:29 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: cvs-all@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG
Subject: new swap system work
Message-ID: <199812270007.QAA33903@apollo.backplane.com>
After a number of long conversations with John Dyson regarding
memory deadlock issues and related kernel hacks used to get around
them, I've decided to embark on a major project to revamp the
vm/pager_* code to allow paging to occur in zero-memory-free
situations.
This work is going to be mostly concentrated in vm/swap_pager.c but
will have a ripple effect throughout the core VM system (vm/*.c) and
the memory subsystem.
The work is going to be broken down into two parts:
* rewriting the swap_pager.
* filesystem / VOP_STRATEGY work to guarantee that all VOP_STRATEGY()
calls for all filesystems and devices will operate without a
memory deadlock occurring in zero-free-memory situations.
The first part I'm working on now and expect to commit sometime POST 3.0.1.
We'll see how long it takes me to get it solid.
Basically, fixing the swap system requires moving the allocation of the
swap metadata structures out of the pageout code. To accomplish this,
vm_page_t will get a new field, called 'swapblk'. All swap-backed
memory-resident pages will have their swap blocks stored in the vm_page_t
rather than the swap-metadata structure. Swap blocks assigned to
resident pages do not have to be moved into the object swap metadata
structures until the page is actually freed (at which point there is
free memory available to allocate the swap metadata structure, hence
the ability to operate in a zero-free-page environment).
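
To make the idea concrete, here is a minimal user-space sketch of the deferred-metadata scheme described above. It is not the planned kernel code: only the field name 'swapblk' comes from this message, while struct page, struct object, SWAPBLK_NONE, OBJ_PAGES and all three function names are hypothetical stand-ins invented for illustration. The point it shows is that the pageout path only writes a field in the page, and the allocation of per-object metadata is pushed to the free path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SWAPBLK_NONE (-1L)  /* hypothetical "no swap block assigned" marker */
#define OBJ_PAGES    8      /* hypothetical fixed object size for this toy model */

/* Toy stand-in for vm_page_t; only 'swapblk' is taken from the proposal. */
struct page {
    size_t pindex;   /* page index within its object */
    long   swapblk;  /* swap block backing this resident page */
};

/* Toy per-object swap metadata: one slot per page index, allocated lazily. */
struct object {
    long *swap_meta; /* NULL until a page is first freed into it */
};

/* Pageout path: record the swap block in the page itself. No allocation. */
void page_assign_swap(struct page *p, long blk)
{
    p->swapblk = blk;
}

/*
 * Free path: the page is being released, so memory is becoming available
 * and allocating the metadata here cannot deadlock the pageout code.
 * Returns 0 on success, -1 if the metadata could not be allocated.
 */
int page_free_to_object(struct object *obj, struct page *p)
{
    assert(p->pindex < OBJ_PAGES);
    if (p->swapblk == SWAPBLK_NONE)
        return 0;                       /* nothing to record */
    if (obj->swap_meta == NULL) {
        obj->swap_meta = malloc(OBJ_PAGES * sizeof(long));
        if (obj->swap_meta == NULL)
            return -1;
        for (size_t i = 0; i < OBJ_PAGES; i++)
            obj->swap_meta[i] = SWAPBLK_NONE;
    }
    obj->swap_meta[p->pindex] = p->swapblk;  /* move block into object metadata */
    p->swapblk = SWAPBLK_NONE;
    return 0;
}

/* Look up the swap block recorded in the object for a non-resident page. */
long object_swap_lookup(const struct object *obj, size_t pindex)
{
    if (obj->swap_meta == NULL || pindex >= OBJ_PAGES)
        return SWAPBLK_NONE;
    return obj->swap_meta[pindex];
}
```

In this model a caller would use page_assign_swap() while paging out, and only page_free_to_object() ever calls malloc(), which mirrors the claim that the metadata allocation moves out of the pageout code.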
The side effects of doing this are all beneficial. The VM system becomes
more swap-aware and doesn't have to worry about free memory as much.
A great deal of simplification can be done all over the place. These
simplifications will take longer to accomplish since my goal is to get
the thing working first, but I think the long term prospects are very
good. Eventually we should be able to page out swap metadata associated
with active processes (but that's a long ways off). The raw swap
allocation / deallocation code (the rlist stuff) will also eventually be
rewritten to remove the memory blocking constraints that rlist_free
currently has and to make it possible to remove swap.
-
I'll start work on the second part after I finish the first part. Fixing
VOP_STRATEGY basically involves giving each device or filesystem its own
guaranteed pool of N private pages (e.g. 5 or so per active device
or mount). The device drivers will then be modified such that they
are able to guarantee operation without memory deadlock when operating
solely out of their private pool (i.e. when no system global free pages
are available). So, for example, a VOP_*() call could still block on
memory, but the use of the private pool means it would be guaranteed
to unblock sometime later as other I/O's in progress on that device
complete and free private pool memory. The system's global free pool
could then be reduced accordingly, an overall wash. Still, I think the
use of private pools will actually make low-memory FreeBSD configurations
more efficient.
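
As a rough illustration of why a private pool bounds the wait, here is a small single-threaded model in C. Everything in it is hypothetical (struct dev_pool, POOL_PAGES, and the function names are not from any existing driver interface), and a real kernel would sleep where this sketch simply returns 0. It shows the invariant that matters: the pool can only be empty while I/Os on the same device are in flight, so a waiter is always woken by one of them completing.

```c
#include <assert.h>

#define POOL_PAGES 5  /* "5 or so" per the text; the exact value is an assumption */

/* Hypothetical per-device (or per-mount) private page pool. */
struct dev_pool {
    int free_pages;  /* private pages currently available */
    int in_flight;   /* I/Os on this device holding a private page */
};

void pool_init(struct dev_pool *dp)
{
    dp->free_pages = POOL_PAGES;
    dp->in_flight = 0;
}

/*
 * Try to start an I/O using one private page.
 * Returns 1 if a page was reserved, 0 if the caller would have to wait.
 */
int pool_io_start(struct dev_pool *dp)
{
    if (dp->free_pages == 0) {
        /* Empty pool implies outstanding I/O, so the wait is bounded. */
        assert(dp->in_flight > 0);
        return 0;
    }
    dp->free_pages--;
    dp->in_flight++;
    return 1;
}

/* I/O completion: return the private page, which would wake a waiter. */
void pool_io_done(struct dev_pool *dp)
{
    assert(dp->in_flight > 0);
    dp->in_flight--;
    dp->free_pages++;
}
```

After POOL_PAGES successful pool_io_start() calls the next one reports that it would block, and a single pool_io_done() is enough to let it proceed, without touching the global free page pool at all.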
Fixing VOP_STRATEGY() and the swapper will together allow reliable
paging to files and remove memory deadlock issues related to VFS
layering (e.g. mounting a vn partition on top of NFS and then
mounting a filesystem through that), though even so a number of
deadlock issues will still remain in the VFS layering department.
-Matt
Matthew Dillon Engineering, HiWay Technologies, Inc. & BEST Internet
Communications & God knows what else.
<dillon@backplane.com> (Please include original email in any response)
