Date: Fri, 22 Jan 1999 12:36:57 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Ville-Pertti Keinonen <will@iki.fi>
Cc: hackers@FreeBSD.ORG
Subject: Re: Review and report of linux kernel VM
Message-ID: <199901222036.MAA56617@apollo.backplane.com>
References: <199901140720.XAA22609@apollo.backplane.com> <8690evpkc4.fsf@not.oeno.com>

:Whaat?
:
:You appear to be confusing cleanliness (as I understand it, and I'm
:afraid that many other readers of your review might understand it)
:with simplicity.
:
:I would claim the exact opposite. The Linux VM system is simpler, but
:far *less* clean because of the very inflexible (almost non-existent)
:"layers". Not to mention the code, which shares the (IMHO) poor
:source organization and apparently arbitrary dependencies of Linux as
:a whole.
The Linux VM system implements all the core features that the
FreeBSD VM system implements, just not as efficiently. Its
use of a page table paradigm to do VM-specific object layering
is really not a bad idea. It *does* lock them into a more
rigid scheme (for example, the Linux scheme starts to break down
when you share huge objects between processes), but so far they've been
able to implement the same core feature set that we have in our VM system.
Thus, it is not possible to argue that their system is inferior from an
algorithmic standpoint, only from an implementation standpoint and a
flexibility standpoint.

We can hardly be proud of our VFS/BIO layering which has been so buggy
these last few years. The types of bugs I'm finding in FreeBSD have
nothing to do with the algorithms and everything to do with the code
being uncommented and virtually unreadable due to the hundreds of
badly thought-out optimizations and other hacks that have obscured the
core implementation.

When I say clean, I mean 'readable, obvious, and functionally layered'.
I had no trouble following the Linux code even going deep into the paging
and VFS subsystems. Following FreeBSD code has been like pulling nails.
It's why we are *still* finding bugs in our VM system, after years of work.

FreeBSD's VM system is definitely more flexible and efficient. Given the
choice, I would much rather keep FreeBSD's VM system. That flexibility
has come at the cost of dirtying up the code considerably, though. What
use is flexibility if every new feature brings half a dozen bugs to light
and creates half a dozen more of its own?

My current work is to keep the flexibility while cleaning up the code.
If we can clean up the code, we will have a clean, flexible, AND kickass
VM system rather than simply a kickass VM system.
:
:What's my definition of clean then? For example, common operations
:shouldn't need to resort to brute-forceish approaches (not in many
:cases, anyhow). Which reminds me, your swp_pager_meta_free_all looks
:a bit frightening...do you intend to keep it like it is?
'efficiency'. As I stated, Linux's VM code is not terribly efficient.
I would disagree with the 'brute force' line, though. They've stuck
to their guns pretty well and the core concepts are sound. Linux simply
has not had the long operational history that BSD has and they are having
to relearn many of the same lessons.

It should be noted that Linux can still implement inode-based object
layering underneath their existing VM system. Their direct use of
page tables for bookkeeping does not prevent that.
:In addition to the problems you stated, as far as I can tell, swap
:backing is not shared for copy-on-write associations (copy-on-write
:pages get swapped out multiple times, all but the last don't free any
:memory) unless the page was swapped out when the maps were copied, in
:which case it ends up copy-on-access...maybe, I'm not sure whether the
:swap cache eliminates this.
:
:This (and many of the things you pointed out) is due to the simplistic
:approach where pages don't really have an identity (only mappings)
:unless they are backed by an inode. Which is perhaps at the core of
:most of the algorithmic differences between Mach/4.4BSD and Linux VM
:systems.
:
:IMHO pages need to have an identity even when they are not associated
:with files (based on a quick glance, NetBSD's UVM seems to retain
:this property while optimizing the management of anonymous pages. I'm
:not convinced in terms of the choice of data structures for the anon
:maps in UVM, though).
Pages under Linux *DO* have an identity, but you have to look it up
in the meta objects backing the page tables, based on the position of the
page in the page table. They do not implement swap as a paging layer as
we do, but then again our implementation of swap as a paging layer is
a mostly degenerate case in our vm_object layering system, so it amounts
to pretty much the same thing.
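
As a rough illustration of deriving a page's identity from its position
rather than from a per-page record, here is a toy sketch in Python. It is
not Linux code: the Region class, its fields, the page_identity() helper
and the 4096-byte page size are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

PAGE_SIZE = 4096  # assumed page size for this example


@dataclass(frozen=True)
class Region:
    """Hypothetical meta object describing one mapping in an address space."""
    start: int    # first virtual address covered by the mapping
    length: int   # size of the mapping in bytes
    backing: str  # name of the backing object (a file name, or "anon")
    offset: int   # byte offset of the mapping within the backing object


def page_identity(regions: List[Region], vaddr: int) -> Optional[Tuple[str, int]]:
    """Return (backing object, page index) for vaddr, or None if unmapped.

    The identity is computed from where the address falls inside its
    mapping; nothing is stored per page.
    """
    for r in regions:
        if r.start <= vaddr < r.start + r.length:
            return (r.backing, (r.offset + vaddr - r.start) // PAGE_SIZE)
    return None


# One four-page mapping that starts eight pages into a hypothetical file.
regions = [Region(0x10000, 4 * PAGE_SIZE, "libc.so", 8 * PAGE_SIZE)]
print(page_identity(regions, 0x10000 + 2 * PAGE_SIZE + 12))
```

The point of the sketch is only that a lookup of this shape yields a stable
name for a page even though no per-page structure carries it.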

I don't think COW pages get swapped multiple times, but I could be wrong.
My read is that when a Linux process forks, the swap block associations are
shared even for COW pages. The COWed pages are marked read-only and
split if a write fault occurs. Unless it's writing the same shared
page from different processes to the same swap block over and over again,
that is. It shouldn't have to - I was under the impression that the
swap had a bunch of per-swap-block flags to keep track of the clean/dirty
state, so once one process swaps out a page, the others may scan it but
will not redundantly swap it out.
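
The behaviour described above (shared swap blocks after fork, a split on
write fault, and a per-block flag that suppresses redundant writes) can be
modelled with a small Python sketch. This is a toy model of the impression
stated in the text, not a description of the actual Linux swap code; the
SwapBlock and Proc classes and all of their methods are hypothetical.

```python
class SwapBlock:
    """Hypothetical swap slot shared by every process mapping the page."""

    def __init__(self):
        self.on_disk = False  # per-block flag: a clean copy is already in swap
        self.writes = 0       # counts real disk writes, for demonstration only


class Proc:
    """Hypothetical process holding a page-index -> SwapBlock table."""

    def __init__(self, name, pages):
        self.name = name
        self.pages = pages

    def fork(self, name):
        # The child gets its own table, but the entries point at the same
        # SwapBlock objects: the swap association is shared for COW pages.
        return Proc(name, dict(self.pages))

    def swap_out(self, idx):
        """Return True if this call performed I/O, False if it only scanned."""
        blk = self.pages[idx]
        if blk.on_disk:
            return False      # another process already wrote this block
        blk.writes += 1
        blk.on_disk = True
        return True

    def write_fault(self, idx):
        # Split on write: the writer gets a private block that is not yet
        # in swap, and the other sharers keep the original.
        self.pages[idx] = SwapBlock()


parent = Proc("parent", {0: SwapBlock()})
child = parent.fork("child")
print(parent.swap_out(0), child.swap_out(0), parent.pages[0].writes)
```

In this model the second swap_out() on the shared page is only a scan; a
write happens again only after a write fault has given one process its own
private copy.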
:> reference information. Instead, it uses the vm_object and pmap modules.
:> I actually like this feature of FreeBSD. A lot.
:
:Additionally, the way FreeBSD does things has better potential for
:concurrency (even though the locks have been ripped out) compared to
:Linux.
I disagree. FreeBSD still must hold locks through pmap changes and those
scan all related processes, just as Linux does. The difference is that
since FreeBSD can delete page tables, it generally winds up scanning many
FEWER processes to change the pmap state for a page than Linux. Linux
must scan/adjust the pmap state for every process mmap()ing the page
whether or not it is using the page.
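
To put numbers on the difference in scan cost, the following Python sketch
counts how many processes each strategy has to visit. It is a toy
accounting model under stated assumptions, not FreeBSD or Linux pmap code:
the two function names are hypothetical, "mappers" stands for every process
that mmap()s the page, and "entered" stands for the processes that currently
have the page entered in their page tables.

```python
def scan_all_mappers(mappers, entered):
    """Visit every process that mmap()s the page, whether it uses it or not.

    Returns (processes visited, page-table entries actually changed).
    """
    visited = 0
    changed = 0
    for proc in mappers:
        visited += 1
        if proc in entered:
            changed += 1
    return visited, changed


def scan_entered_only(entered):
    """Visit only the processes that currently have the page entered.

    Returns (processes visited, page-table entries actually changed).
    """
    visited = 0
    changed = 0
    for _proc in entered:
        visited += 1
        changed += 1
    return visited, changed


# 100 processes map the page, but only two currently have it entered.
mappers = ["p%d" % i for i in range(100)]
entered = {"p3", "p42"}
print(scan_all_mappers(mappers, entered))
print(scan_entered_only(entered))
```

Both strategies change the same entries; the model only shows that the
amount of scanning depends on which set of processes has to be walked.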
:> Linux demarks interrupts from supervisor code much better then we do.
:
:You seem to consider simpler to mean cleaner/better. Although in this
:case, I'd agree that much of the complexity of FreeBSD is unnecessary.
:
My philosophy is, in general, that (1) one must separate the algorithm
from the implementation and that (2) any algorithm can be cleanly
implemented. If it isn't, it should be rewritten. If the programmer
can't reimplement it, either the programmer is unworthy of the algorithm
or the programmer isn't experienced enough to do it right, or the
algorithm is bad. There is no middle ground in my world view.
-Matt
Matthew Dillon
<dillon@backplane.com>
