Date: Fri, 14 Apr 1995 17:15:53 GMT
From: "John S. Dyson" <toor@jsdinc.root.com>
To: current@FreeBSD.org
Subject: Info on VM/VFS changes since 4.4Lite
Message-ID: <199504141715.RAA05096@jsdinc.root.com>
Hey gang --

I thought that it would be nice to talk about the improvements to the
FreeBSD VFS/VM since the 4.4-Lite code was given to us. I have been
asked questions about the differences and improvements by people who
are trying this stuff out. Even though the VM/VFS system is not really
a user "visible" feature of the OS, notes about the differences can be
useful to people choosing a platform. I really believe that low level
kernel things such as this are only enabling technology. Most of these
things were changed to allow people to use the system for more and
bigger applications.

Hopefully, some day some of us will get together and write a FreeBSD
kernel manual. :-)

Things fixed in the VM/VFS system since 4.4-Lite by various FreeBSD
contributors

1) Collapse problem fully eliminated
    Fairly complex code has been added to eliminate the growing swap
    space problem intrinsic in the Mach VM system used in 4.4-Lite.
    You will notice that the system uses much less swap space than it
    used to. (Earlier versions of FreeBSD had mods to help the
    situation, but the code in 2.0.5 contains a complete fix.)

2) The pageout daemon is now very efficient
    The original pageout daemon was woken up gratuitously. When
    physical memory started being overcommitted, the system would
    thrash. Also, the new FreeBSD pageout daemon keeps significant
    statistics on page usage, so that it doesn't free pages that are
    likely to be re-used. (The old one was too simple.)

3) Pages are not freed as often
    A new page queue was added, holding pages that can easily be
    re-used by user processes. The identities of the pages on the
    queue are not lost until they are reused. We still keep a free
    queue for interrupt code use and for pages that have lost their
    identity.

4) The VM system no longer gratuitously wipes the page tables.
    When COW pages are created, previous usage is tracked at the VM
    level, so that gratuitous page protection is not done. This fix
    really helps large systems.

5) The VM system and buffer cache have been merged.
    Now mmap is fully coherent with the read/write system calls. For
    example, a write to a file immediately causes the data to change
    in the address space of a process that might have the file mapped.
    This is an initial implementation, and VOP_GETPAGE and VOP_PUTPAGE
    will be compatibly added soon (probably V2.2).

6) Dynamic sized buffer cache
    Along with the merged VM/buffer cache, the buffer cache now uses
    otherwise unused memory. It does not compete with memory that is
    likely to be needed in the near future. Additionally, the new code
    does not create dirty pages not associated with buffers, thereby
    limiting the number of dirty VFS-created pages to the size of the
    buffer cache.

7) The system now swaps.
    Swapping has historically been an unpleasant thing in UNIX-like
    OSes. Not only has FreeBSD implemented swapping, but it also has
    an intelligent policy as to the swappability of processes.

8) The VM code does many fewer copies.
    Unfortunately, the standard 4.4Lite VM code copies all data paged
    in from files. FreeBSD copies very little of the RO data paged in
    from files; the only time that the system copies paged-in pages is
    for COW.

9) Soft RSS limiting has been added.
    The system allows the system administrator to limit the RSS of
    processes.

10) The FreeBSD VM intelligently clusters pageins.
    Pageins are clustered using VM-level knowledge, and are not
    limited to the VFS (I/O optimized) clustering methods.

11) Vastly improved flushing of dirty vnode-backed pages
    Since mmap is more likely to be used now, it was necessary to
    create a more efficient pageout of dirty pages.

12) VFS_BIO bounce buffering has been added.
    A fairly architecture-neutral, non-invasive bounce-buffer scheme
    has been added to vfs_bio (actually vm_machdep for now.) Note that
    in general 1-3 lines of code need to be added to each block device
    driver that needs bouncing.

13) More efficient ordering of buffers in the vnode dirty list
    Makes sync work better if there are lots of delayed write buffers.
    This is mostly helpful if one modifies ufs_readwrite to retain
    delayed write buffers as opposed to immediately queueing async
    writes.

14) Much better VFS name caching.

15) New VFS cluster code.
    The original cluster code, although working, appeared to violate
    some layering and depended on a large kva space for the clustered
    I/O buffers, so a large number of buffers required too much kva.
    Special buffers are now used to support clustering, thereby
    minimizing kva space requirements. This helps both CISC and some
    RISC architectures (such as R3000/R4000), where each 2MB or 4MB
    costs something significant (like page table pages or TLB
    entries.)

John
dyson@root.com
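
The coherence described in item 5 (mmap versus read/write) can be
checked from user space. What follows is only a minimal Python sketch
and not code from the FreeBSD tree: it maps a scratch file shared and
read-only, changes the file with an ordinary pwrite(), and then looks
at the mapping again. The function name and the temporary file are
made up for the illustration, and it assumes a Unix-like system where
Python's mmap module and os.pwrite are available.

```python
import mmap
import os
import tempfile


def write_is_visible_through_mapping() -> bool:
    """Return True if a write() to a file shows up in an existing shared mapping.

    Hypothetical helper for illustration only; not part of any FreeBSD tool.
    """
    fd, path = tempfile.mkstemp()
    try:
        # Give the scratch file exactly one page of known data.
        os.write(fd, b"A" * mmap.PAGESIZE)

        # Map the file shared and read-only, as a second process might.
        with mmap.mmap(fd, mmap.PAGESIZE,
                       flags=mmap.MAP_SHARED, prot=mmap.PROT_READ) as view:
            before = bytes(view[:5])

            # Change the file through the normal write path, not the mapping.
            os.pwrite(fd, b"hello", 0)

            # With a merged VM/buffer cache the mapping should already
            # show the new bytes, with no msync or remap in between.
            after = bytes(view[:5])

        return before == b"AAAAA" and after == b"hello"
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(write_is_visible_through_mapping())
```

On a system where the two caches are merged the function should return
True. On a kernel with a separate buffer cache, the mapping could keep
showing the old bytes for some time, which is the incoherence item 5
says was removed.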