Date:      24 Jan 1999 11:26:14 -0000
From:      Ville-Pertti Keinonen <will@iki.fi>
To:        dillon@apollo.backplane.com
Cc:        hackers@FreeBSD.ORG
Subject:   Re: Review and report of linux kernel VM
Message-ID:  <19990124112614.2132.qmail@ns.oeno.com>
In-Reply-To: <199901222036.MAA56617@apollo.backplane.com> (message from Matthew Dillon on Fri, 22 Jan 1999 12:36:57 -0800 (PST))


>     The Linux VM system implements all the core features that the 
>     FreeBSD VM system implements, just not as efficiently.  Its 
>     use of a page table paradigm to do VM-specific object layering
>     is really not that bad of an idea.  It *does* lock them into a more
>     rigid scheme ( for example, the linux scheme starts to break down
>     when you share huge objects between processes ), but so far they've been
>     able to implement the same core feature set that we have in our VM system.
>     Thus, it is not possible to argue that their system is inferior from an 
>     algorithmic standpoint, only from an implementation standpoint and a
>     flexibility standpoint.

I don't think that any working implementation is inherently superior
or inferior; it all depends on what you consider important.
Apparently the Linux folks consider it important to avoid adding
extra data structures that appear expensive and/or redundant, at the
cost of requiring the code to do more work or having less scalable
algorithms.  Given that you have page tables anyhow, the idea of
further objects to hold pages probably provokes a gut reaction of
"that's wasteful!" in most programmers, unless further thought is
given to what advantages there could be.

>     We can hardly be proud of our VFS/BIO layering which has been so buggy
>     these last few years.  The types of bugs I'm finding in FreeBSD have

VFS/BIO are different from VM, although I'd agree with the recent
suggestions that perhaps they shouldn't be.

>     When I say clean, I mean 'readable, obvious, and functionally layered'.
>     I had no trouble following the linux code even going deep into the paging
>     and VFS subsystems.  Following FreeBSD code has been like pulling nails.

Sure, it's easy to understand; it's simple.  But I tend to think that
a lot of things are done in the wrong place (in terms of layering),
data is accessed in inconsistent ways, etc., which is why I wouldn't
call it "clean".  Finding the places where things are actually done
is also often difficult.

>     It's why we are *still* finding bugs in our VM system, after years of work.
>     FreeBSD's VM system is definitely more flexible and efficient.  Given the
>     choice, I would much rather keep FreeBSD's VM system.  That flexibility
>     has come at the cost of dirtying up the code considerably, though.   What
>     use is flexibility if every new feature brings half a dozen bugs to light
>     and creates half a dozen more of its own? 

I find the FreeBSD VM (and other) code readable and moderately well
organized (easier to find specific things than in Linux).

It isn't hard to understand the code, but it can be hard to
understand the system as a whole until you understand the
relationships between vm_objects (or even what a vm_object
represents), which are not exactly obvious.

In terms of bugs, Linux is probably better off because many of the
core subsystems are maintained by their original author.  And for
simpler algorithms, it's easier for someone to take over a subsystem
quickly.

Even if you do understand how something works, if you didn't write it
(or wrote it a sufficiently long time ago) or haven't studied it
thoroughly, it's difficult to keep all of the possible implications of
a modification in mind.

>     Pages under linux *DO* have an identity, but you have to look it up
>     in the meta objects backing the page tables based on the position of the
>     page in the page table.  They do not implement swap as a paging layer as

They do have an identity, but not a unique one (until a swap
allocation becomes the identity), and as far as I can tell, given a
page, you can't find *any* mappings without a brute-force search.
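
To make the difference concrete, here is a minimal sketch in Python
(a toy model, not code from either kernel; Process, ReverseMap and
the function names are all hypothetical).  Without any per-page
record of who maps a page, finding its mappings means walking every
page table; a per-page list kept up to date at map time answers the
same question directly.

```python
class Process:
    """Toy process: just a name and a flat page table (hypothetical)."""
    def __init__(self, name):
        self.name = name
        self.page_table = {}  # virtual page number -> physical page number


def find_mappings_brute_force(processes, phys_page):
    """No reverse map: scan every entry of every page table."""
    found = []
    for proc in processes:
        for vpn, ppn in proc.page_table.items():
            if ppn == phys_page:
                found.append((proc.name, vpn))
    return found


class ReverseMap:
    """Per-physical-page list of (process, vpn), maintained on each map."""
    def __init__(self):
        self.mappings = {}  # physical page number -> list of (name, vpn)

    def map_page(self, proc, vpn, ppn):
        proc.page_table[vpn] = ppn
        self.mappings.setdefault(ppn, []).append((proc.name, vpn))

    def find_mappings(self, phys_page):
        # Cost depends on the number of mappings, not on total memory.
        return list(self.mappings.get(phys_page, []))
```

The second approach pays for its speed with exactly the kind of
extra data structure that looks redundant at first glance.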

>     I don't think COW pages get swapped multiple times, but I could be wrong.
>     My read is that when a linux process forks, the swap block associations are
>     shared even for COW pages.  The COWed pages are marked read-only and 

Yes, if it was swapped at fork-time.  If not, it takes several scans
of different processes to get it out of memory.  It doesn't take
several actual writes because of the swap cache, so it's not quite as
bad as it could be.

>     split if a write fault occurs.  Unless it's writing the same shared
>     page from different processes to the same swap block over and over again,
>     that is.  It shouldn't have to - I was under the impression that the 
>     swap had a bunch of per-swap-block flags to keep track of the clean/dirty
>     state, so once one process swaps out a page, the others may scan it but
>     will not redundantly swap it out.

I believe that it has only a reference count to allow the shared state
to be paged in with the page.  Contrary to what I thought at first,
the swap cache permanently maps relationships between physical pages
and swap blocks, so no extra copying is done.
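
A rough sketch of that behaviour as I understand it, in Python (a toy
model under my own assumptions, not the actual Linux code; SwapCache
and its fields are hypothetical names).  The first process whose scan
reaches a shared page allocates the swap block and does the write;
later scans from other processes find the page in the cache and only
bump the reference count.

```python
class SwapCache:
    """Toy swap cache: remembers which swap block backs a physical page."""
    def __init__(self):
        self.block_of_page = {}  # physical page number -> swap block
        self.refcount = {}       # swap block -> page table entries using it
        self.writes = 0          # number of simulated writes to swap
        self._next_block = 0

    def swap_out(self, page_table, vpn):
        """Unmap vpn from one page table; write only on the first unmap."""
        ppn = page_table[vpn]
        block = self.block_of_page.get(ppn)
        if block is None:
            # First process to reach this page: allocate a block and write.
            block = self._next_block
            self._next_block += 1
            self.block_of_page[ppn] = block
            self.refcount[block] = 0
            self.writes += 1
        # Every process mapping the page takes its own reference.
        self.refcount[block] += 1
        page_table[vpn] = ("swap", block)
        return block
```

Two processes sharing one page still need two scans to get it out of
memory, but only one write happens.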

> :Additionally, the way FreeBSD does things has better potential for
> :concurrency (even though the locks have been ripped out) compared to
> :Linux.

>     I disagree.  FreeBSD still must hold locks through pmap changes and those
>     scan all related processes, just as linux does.  The difference is that

FreeBSD could, for example, potentially service multiple page faults
on a vm_map simultaneously (exclusive locks are required for some of
the lower-level layers such as vm_objects and the pmap, but those only
require short-term locks).  In Mach, the VM system actually did do
this.
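
As an illustration only, here is a sketch in Python of the locking
pattern I mean (hypothetical names throughout; this is not FreeBSD or
Mach code, and the lock is a toy that ignores writer starvation).
Faults take the map lock shared, so several can be in progress at
once, and take the object lock exclusively only for the short update.

```python
import threading


class SharedExclusiveLock:
    """Toy lock: many shared holders or a single exclusive holder."""
    def __init__(self):
        self._cond = threading.Condition()
        self._shared = 0
        self._exclusive = False

    def acquire_shared(self):
        with self._cond:
            while self._exclusive:
                self._cond.wait()
            self._shared += 1

    def release_shared(self):
        with self._cond:
            self._shared -= 1
            if self._shared == 0:
                self._cond.notify_all()

    def acquire_exclusive(self):
        with self._cond:
            while self._exclusive or self._shared:
                self._cond.wait()
            self._exclusive = True

    def release_exclusive(self):
        with self._cond:
            self._exclusive = False
            self._cond.notify_all()


class VmObject:
    def __init__(self):
        self.lock = threading.Lock()  # exclusive, held only briefly
        self.resident = set()         # indices of resident pages


class VmMap:
    def __init__(self, obj):
        self.lock = SharedExclusiveLock()
        self.obj = obj


def page_fault(vm_map, page_index, rendezvous=None):
    """Service one fault; rendezvous is an optional threading.Barrier."""
    vm_map.lock.acquire_shared()  # the map only has to stay stable
    try:
        if rendezvous is not None:
            # All faulting threads meet here while holding the map lock;
            # this would time out if faults were serialized on the map.
            rendezvous.wait(timeout=5)
        with vm_map.obj.lock:     # short-term exclusive section
            vm_map.obj.resident.add(page_index)
    finally:
        vm_map.lock.release_shared()
```

Operations that change the map itself would use acquire_exclusive()
and wait for the faults in progress to drain.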

>     since FreeBSD can delete page tables, it generally winds up scanning many
>     FEWER processes to change the pmap state for a page than linux.  Linux
>     must scan/adjust the pmap state for every process mmap()ing the page
>     whether or not it is using the page.

In FreeBSD, in order to scan the pmaps of different processes mapping
a page, you should only need to lock pmaps, which are a low-level
layer and are (or should be) only locked for short periods of time.

Of course my concerns seem weird in the context of FreeBSD/Linux
because they include multithreading, fine-grained locking, kernel-mode
pre-emption and real-time properties, much of which neither system is
likely to implement properly in the near future.

>     My philosophy is, in general, that (1) one must separate the algorithm
>     from the implementation and that (2) any algorithm can be cleanly 
>     implemented.  If it isn't, it should be rewritten.  If the programmer

I agree; I just don't agree with your interpretation of what's
"clean".  And seeing linux called "clean" just seemed so completely
opposite to how I see it that I couldn't not comment...  The FreeBSD
VM code isn't totally clean, either, but I would certainly not say
that it is less clean than linux.



