Date:      Sat, 26 Dec 1998 16:07:29 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        cvs-all@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG
Subject:   new swap system work
Message-ID:  <199812270007.QAA33903@apollo.backplane.com>

    After a number of long conversations with John Dyson regarding
    memory deadlock issues and related kernel hacks used to get around
    them, I've decided to embark on a major project to revamp the 
    vm/pager_* code to allow paging to occur in zero-memory-free
    situations.  

    This work is going to be mostly concentrated in vm/swap_pager.c but
    will have a ripple effect throughout the core VM system (vm/*.c) and
    the memory subsystem.

    The work is going to be broken down into two parts:

	* rewriting the swap_pager.

	* filesystem / VOP_STRATEGY work to guarantee that all VOP_STRATEGY()
	  calls for all filesystems and devices will operate without a
	  memory deadlock occurring in zero-free-memory situations.

    The first part I'm working on now and expect to commit sometime POST 3.0.1.
    We'll see how long it takes me to get it solid.

    Basically, fixing the swap system requires moving the allocation of the
    swap metadata structures out of the pageout code.  To accomplish this,
    vm_page_t will get a new field, called 'swapblk'.  All swap-backed 
    memory-resident pages will have their swap blocks stored in the vm_page_t
    rather than in the swap-metadata structure.  Swap blocks assigned to
    resident pages do not have to be moved into the object swap metadata
    structures until the page is actually freed (at which point there is
    free memory available to allocate the swap metadata structure, hence
    the ability to operate in a zero-free-page environment).
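
    A minimal user-space sketch of the idea above, not the actual kernel
    code.  Only the field name 'swapblk' comes from the text; the toy_*
    types and functions, TOY_SWAPBLK_NONE, and the flat per-object array
    are hypothetical stand-ins chosen for illustration.  The pageout path
    only writes a field in the page, and the metadata allocation is
    deferred to the page-free path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_SWAPBLK_NONE (-1L)   /* hypothetical "no swap block" marker */

/* Toy stand-in for an object's swap metadata: one slot per page index. */
struct toy_swap_meta {
    long *blk;                   /* blk[pindex] is a swap block or NONE */
};

struct toy_object {
    struct toy_swap_meta *meta;  /* NULL until some page is freed */
    size_t npages;
};

struct toy_page {
    struct toy_object *object;
    size_t pindex;               /* index of this page within object */
    long swapblk;                /* the proposed per-page field */
};

/* Pageout path: remember the swap block in the page itself.
 * Nothing is allocated here, so no free memory is required. */
void toy_pageout_assign(struct toy_page *pg, long blk)
{
    pg->swapblk = blk;
}

/* Page-free path: move the block into the object's metadata,
 * allocating the metadata on first use.
 * Returns 0 on success, -1 if the allocation fails. */
int toy_page_free(struct toy_page *pg)
{
    struct toy_object *obj = pg->object;
    size_t i;

    if (pg->swapblk == TOY_SWAPBLK_NONE)
        return 0;                /* page has no swap block to record */
    assert(pg->pindex < obj->npages);
    if (obj->meta == NULL) {
        struct toy_swap_meta *m = malloc(sizeof(*m));
        if (m == NULL)
            return -1;
        m->blk = malloc(obj->npages * sizeof(*m->blk));
        if (m->blk == NULL) {
            free(m);
            return -1;
        }
        for (i = 0; i < obj->npages; i++)
            m->blk[i] = TOY_SWAPBLK_NONE;
        obj->meta = m;
    }
    obj->meta->blk[pg->pindex] = pg->swapblk;
    pg->swapblk = TOY_SWAPBLK_NONE;
    return 0;
}
```

    In this toy, assigning a block to a resident page leaves the object's
    metadata pointer NULL; it is only populated once toy_page_free() runs.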

    The side effects of doing this are all beneficial.  The VM system becomes
    more swap-aware and doesn't have to worry about free memory as much.
    A great deal of simplification can be done all over the place.  These
    simplifications will take longer to accomplish since my goal is to get
    the thing working first, but I think the long term prospects are very 
    good.  Eventually we should be able to page out swap metadata associated
    with active processes (but that's a long ways off).  The raw swap
    allocation / deallocation code (the rlist stuff) will also eventually be 
    rewritten to remove the memory blocking constraints that rlist_free 
    currently has and to make it possible to remove swap.

    -

    I'll start work on the second part after I finish the first part.  Fixing
    VOP_STRATEGY basically involves giving each device or filesystem its own
    guaranteed pool of N private pages (e.g. 5 or so per active device
    or mount).  The device drivers will then be modified such that they
    are able to guarantee operation without memory deadlock when operating
    solely out of their private pool (i.e. when no system global free pages
    are available).  So, for example, a VOP_*() call could still block on
    memory, but the use of the private pool means it would be guaranteed
    to unblock sometime later as other I/O's in progress on that device 
    complete and free private pool memory.  The system's global free pool
    could then be reduced accordingly, an overall wash.  Still, I think the
    use of private pools will actually make low-memory FreeBSD configurations
    more efficient.
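
    A minimal single-threaded sketch of such a private pool, under stated
    assumptions: the toy_* names, the page size, and the free-index stack
    are hypothetical, and a NULL return stands in for "the caller would
    block here".  In a real driver the caller would sleep until another
    I/O on the same device completes and returns a page to the pool.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_POOL_PAGES 5         /* "5 or so" per active device or mount */
#define TOY_PAGE_SIZE  4096      /* assumed page size for the sketch */

struct toy_io_pool {
    unsigned char pages[TOY_POOL_PAGES][TOY_PAGE_SIZE];
    int free_idx[TOY_POOL_PAGES];   /* stack of free page indices */
    int nfree;
};

/* Mark every page in the pool as free. */
void toy_pool_init(struct toy_io_pool *pool)
{
    int i;

    for (i = 0; i < TOY_POOL_PAGES; i++)
        pool->free_idx[i] = i;
    pool->nfree = TOY_POOL_PAGES;
}

/* Take a page from the private pool.  Returns NULL when the pool is
 * empty, which is where a real caller would block until an in-flight
 * I/O on this device completes. */
void *toy_pool_get(struct toy_io_pool *pool)
{
    if (pool->nfree == 0)
        return NULL;
    return pool->pages[pool->free_idx[--pool->nfree]];
}

/* I/O completion: hand the page back so a blocked caller can proceed. */
void toy_pool_put(struct toy_io_pool *pool, void *page)
{
    ptrdiff_t off = (unsigned char *)page - &pool->pages[0][0];
    int idx = (int)(off / TOY_PAGE_SIZE);

    assert(off >= 0 && off % TOY_PAGE_SIZE == 0);
    assert(idx < TOY_POOL_PAGES && pool->nfree < TOY_POOL_PAGES);
    pool->free_idx[pool->nfree++] = idx;
}
```

    Because the pool never draws on the global free list, progress depends
    only on that device's own outstanding I/O completing.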

    Fixing VOP_STRATEGY() and the swapper will together allow reliable
    paging to files and remove memory deadlock issues related to VFS
    layering (e.g. mounting a vn partition on top of NFS and then
    mounting a filesystem through that) - though even so, a number of
    deadlock issues remain in the VFS layering department.

							-Matt

    Matthew Dillon  Engineering, HiWay Technologies, Inc. & BEST Internet 
                    Communications & God knows what else.
    <dillon@backplane.com> (Please include original email in any response)    
