Date:      Fri, 22 Jan 1999 12:36:57 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Ville-Pertti Keinonen <will@iki.fi>
Cc:        hackers@FreeBSD.ORG
Subject:   Re: Review and report of linux kernel VM
Message-ID:  <199901222036.MAA56617@apollo.backplane.com>
References:  <199901140720.XAA22609@apollo.backplane.com> <8690evpkc4.fsf@not.oeno.com>

:Whaat?
:
:You appear to be confusing cleanliness (as I understand it, and I'm
:afraid that many other readers of your review might understand it)
:with simplicity.
:
:I would claim the exact opposite.  The Linux VM system is simpler, but
:far *less* clean because of the very inflexible (almost non-existent)
:"layers".  Not to mention the code, which shares the (IMHO) poor
:source organization and apparently arbitrary dependencies of Linux as
:a whole.

    The Linux VM system implements all the core features that the 
    FreeBSD VM system implements, just not as efficiently.  Its 
    use of a page table paradigm to do VM-specific object layering
    is really not that bad of an idea.  It *does* lock them into a more
    rigid scheme (for example, the Linux scheme starts to break down
    when you share huge objects between processes), but so far they've been
    able to implement the same core feature set that we have in our VM system.
    Thus, it is not possible to argue that their system is inferior from an 
    algorithmic standpoint, only from an implementation standpoint and a
    flexibility standpoint.

    We can hardly be proud of our VFS/BIO layering which has been so buggy
    these last few years.  The types of bugs I'm finding in FreeBSD have
    nothing to do with the algorithms and everything to do with the code
    being uncommented and virtually unreadable due to the hundreds of 
    badly thought out optimizations and other hacks that have obscured the
    core implementation.

    When I say clean, I mean 'readable, obvious, and functionally layered'.
    I had no trouble following the linux code even going deep into the paging
    and VFS subsystems.  Following FreeBSD code has been like pulling nails.
    It's why we are *still* finding bugs in our VM system, after years of work.

    FreeBSD's VM system is definitely more flexible and efficient.  Given the
    choice, I would much rather keep FreeBSD's VM system.  That flexibility
    has come at the cost of dirtying up the code considerably, though.   What
    use is flexibility if every new feature brings half a dozen bugs to light
    and creates half a dozen more of its own? 

    My current work is to keep the flexibility while cleaning up the code.  
    If we can clean up the code, we will have a clean, flexible, AND kickass
    VM system rather than simply a kickass VM system.

:
:What's my definition of clean then?  For example, common operations
:shouldn't need to resort to brute-forceish approaches (not in many
:cases, anyhow).  Which reminds me, your swp_pager_meta_free_all looks
:a bit frightening...do you intend to keep it like it is?

    That is 'efficiency'.  As I stated, Linux's VM code is not terribly efficient.
    I would disagree with the 'brute force' line, though.  They've stuck
    to their guns pretty well and the core concepts are sound.  Linux simply 
    has not had the long operational history that BSD has and they are having
    to relearn many of the same lessons.  

    It should be noted that linux can still implement inode-based object
    layering underneath their existing VM system.  Their direct use of
    pagetables for bookkeeping does not prevent that.

:In addition to the problems you stated, as far as I can tell, swap
:backing is not shared for copy-on-write associations (copy-on-write
:pages get swapped out multiple times, all but the last don't free any
:memory) unless the page was swapped out when the maps were copied, in
:which case it ends up copy-on-access...maybe, I'm not sure whether the
:swap cache eliminates this.
:
:This (and many of the things you pointed out) is due to the simplistic
:approach where pages don't really have an identity (only mappings)
:unless they are backed by an inode.  Which is perhaps at the core of
:most of the algorithmic differences between Mach/4.4BSD and Linux VM
:systems.
:
:IMHO pages need to have an identity even when they are not associated
:with files (based on a quick glance, NetBSD's UVM seems to retain
:this property while optimizing the management of anonymous pages.  I'm
:not convinced in terms of the choice of data structures for the anon
:maps in UVM, though).

    Pages under linux *DO* have an identity, but you have to look it up
    in the meta objects backing the page tables based on the position of the
    page in the page table.  They do not implement swap as a paging layer as
    we do, but then again our implementation of swap as a paging layer is
    a mostly degenerate case in our vm_object layering system so it amounts
    to pretty much the same thing.
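
    As a rough illustration of deriving a page's identity from its
    position in a mapping, here is a toy sketch.  The Region record, its
    field names, and page_identity() are all hypothetical and are not
    taken from either kernel; the point is only that (backing object,
    offset) can be computed from where the address falls in a region.

```python
from typing import NamedTuple, Optional, Tuple

class Region(NamedTuple):
    """Hypothetical mapping record: one contiguous range of an address space."""
    start: int      # first virtual address covered
    end: int        # one past the last virtual address covered
    backing: str    # label for the backing object (illustrative only)
    offset: int     # offset into the backing object at 'start'

def page_identity(regions, va, page_size=4096) -> Optional[Tuple[str, int]]:
    """Return (backing object, page-aligned offset) for va, or None if unmapped."""
    for r in regions:
        if r.start <= va < r.end:
            # Round the distance into the region down to a page boundary.
            page_off = (va - r.start) // page_size * page_size
            return (r.backing, r.offset + page_off)
    return None
```

    With a single region covering 0x1000-0x4000 at object offset 0x8000,
    the address 0x2abc resolves to offset 0x9000 in that object.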

    I don't think COW pages get swapped multiple times, but I could be wrong.
    My read is that when a Linux process forks, the swap block associations are
    shared even for COW pages.  The COWed pages are marked read-only and 
    split if a write fault occurs.  Unless it's writing the same shared
    page from different processes to the same swap block over and over again,
    that is.  It shouldn't have to - I was under the impression that the 
    swap had a bunch of per-swap-block flags to keep track of the clean/dirty
    state, so once one process swaps out a page, the others may scan it but
    will not redundantly swap it out.
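
    The behaviour I am describing can be modelled with a toy sketch.
    This is my reading expressed as a simulation, not Linux code: the
    SwapBlock and Pte records, the on_disk flag, and every function name
    here are assumptions made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class SwapBlock:
    """Hypothetical per-swap-block record with a clean/dirty style flag."""
    refs: int = 1          # number of address spaces sharing this block
    on_disk: bool = False  # True once the contents have been written out
    writes: int = 0        # how many times the block was actually written

@dataclass
class Pte:
    """Hypothetical page table entry pointing at a swap block."""
    block: SwapBlock
    writable: bool = True

def fork(parent):
    """Share every swap block with the child and mark both sides read-only."""
    child = {}
    for va, pte in parent.items():
        pte.writable = False
        pte.block.refs += 1
        child[va] = Pte(pte.block, writable=False)
    return child

def write_fault(space, va):
    """Split a shared page on write: the faulting side gets a private block."""
    pte = space[va]
    if pte.block.refs > 1:
        pte.block.refs -= 1
        pte.block = SwapBlock()  # private copy, not yet on disk
    pte.writable = True

def swap_out(space, va):
    """Write the block only if no sharer has already done so."""
    blk = space[va].block
    if blk.on_disk:
        return False  # another process already swapped it out
    blk.on_disk = True
    blk.writes += 1
    return True
```

    In this model a second process that scans the shared page finds
    on_disk already set and skips the write; only after a write fault
    splits the page does its private copy need its own swap-out.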

:>     reference information.  Instead, it uses the vm_object and pmap modules.
:>     I actually like this feature of FreeBSD.  A lot. 
:
:Additionally, the way FreeBSD does things has better potential for
:concurrency (even though the locks have been ripped out) compared to
:Linux.

    I disagree.  FreeBSD still must hold locks through pmap changes and those
    scan all related processes, just as Linux does.  The difference is that
    since FreeBSD can delete page tables, it generally winds up scanning many
    FEWER processes to change the pmap state for a page than Linux.  Linux
    must scan/adjust the pmap state for every process mmap()ing the page
    whether or not it is using the page.
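
    The difference in scan cost can be shown with a toy accounting
    sketch.  The Page record and both scan functions are hypothetical;
    they only count which processes would be visited under each policy
    and do not reflect the real data structures of either kernel.

```python
class Page:
    """Toy page record; both sets are illustrative, not real kernel fields."""
    def __init__(self, mappers, live):
        self.mappers = set(mappers)  # processes with an mmap() covering the page
        self.live = set(live)        # those whose page tables hold an entry now

def scan_all_mappers(page):
    """Visit every process that maps the page, whether or not it uses it."""
    return sorted(page.mappers)

def scan_live_entries(page):
    """Visit only processes that still have a page table entry for the page."""
    return sorted(page.mappers & page.live)
```

    For a page mapped by five processes of which two currently have it
    in their page tables, the first policy visits five processes and the
    second visits two.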

:>     Linux demarks interrupts from supervisor code much better then we do.
:
:You seem to consider simpler to mean cleaner/better.  Although in this
:case, I'd agree that much of the complexity of FreeBSD is unnecessary.
:

    My philosophy is, in general, that (1) one must separate the algorithm
    from the implementation and that (2) any algorithm can be cleanly 
    implemented.  If it isn't, it should be rewritten.  If the programmer
    can't reimplement it, either the programmer is unworthy of the algorithm
    or the programmer isn't experienced enough to do it right, or the
    algorithm is bad.  There is no middle ground in my world view.

    					-Matt
					Matthew Dillon 
					<dillon@backplane.com>




