Date: Fri, 22 Jan 1999 12:36:57 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Ville-Pertti Keinonen <will@iki.fi>
Cc: hackers@FreeBSD.ORG
Subject: Re: Review and report of linux kernel VM
Message-ID: <199901222036.MAA56617@apollo.backplane.com>
References: <199901140720.XAA22609@apollo.backplane.com> <8690evpkc4.fsf@not.oeno.com>

:Whaat?
:
:You appear to be confusing cleanliness (as I understand it, and I'm
:afraid that many other readers of your review might understand it)
:with simplicity.
:
:I would claim the exact opposite.  The Linux VM system is simpler, but
:far *less* clean because of the very inflexible (almost non-existent)
:"layers".  Not to mention the code, which shares the (IMHO) poor
:source organization and apparently arbitrary dependencies of Linux as
:a whole.

The Linux VM system implements all the core features that the FreeBSD
VM system implements, just not as efficiently.  Its use of a page table
paradigm to do VM-specific object layering is really not that bad of an
idea.  It *does* lock them into a more rigid scheme (for example, the
Linux scheme starts to break down when you share huge objects between
processes), but so far they've been able to implement the same core
feature set that we have in our VM system.  Thus, it is not possible to
argue that their system is inferior from an algorithmic standpoint,
only from an implementation standpoint and a flexibility standpoint.

We can hardly be proud of our VFS/BIO layering, which has been so buggy
these last few years.  The types of bugs I'm finding in FreeBSD have
nothing to do with the algorithms and everything to do with the code
being uncommented and virtually unreadable due to the hundreds of badly
thought out optimizations and other hacks that have obscured the core
implementation.

When I say clean, I mean 'readable, obvious, and functionally layered'.
I had no trouble following the Linux code even going deep into the
paging and VFS subsystems.  Following FreeBSD code has been like
pulling nails.  It's why we are *still* finding bugs in our VM system,
after years of work.

FreeBSD's VM system is definitely more flexible and efficient.  Given
the choice, I would much rather keep FreeBSD's VM system.  That
flexibility has come at the cost of dirtying up the code considerably,
though.
What use is flexibility if every new feature brings half a dozen bugs
to light and creates half a dozen more of its own?  My current work is
to keep the flexibility while cleaning up the code.  If we can clean up
the code, we will have a clean, flexible, AND kickass VM system rather
than simply a kickass VM system.

:
:What's my definition of clean then?  For example, common operations
:shouldn't need to resort to brute-forceish approaches (not in many
:cases, anyhow).  Which reminds me, your swp_pager_meta_free_all looks
:a bit frightening...do you intend to keep it like it is?

'efficiency'.  As I stated, Linux's VM code is not terribly efficient.
I would disagree with the 'brute force' line, though.  They've stuck to
their guns pretty well and the core concepts are sound.  Linux simply
has not had the long operational history that BSD has and they are
having to relearn many of the same lessons.

It should be noted that Linux can still implement inode-based object
layering underneath their existing VM system.  Their direct use of
pagetables for bookkeeping does not prevent that.

:In addition to the problems you stated, as far as I can tell, swap
:backing is not shared for copy-on-write associations (copy-on-write
:pages get swapped out multiple times, all but the last don't free any
:memory) unless the page was swapped out when the maps were copied, in
:which case it ends up copy-on-access...maybe, I'm not sure whether the
:swap cache eliminates this.
:
:This (and many of the things you pointed out) is due to the simplistic
:approach where pages don't really have an identity (only mappings)
:unless they are backed by an inode.  Which is perhaps at the core of
:most of the algorithmic differences between Mach/4.4BSD and Linux VM
:systems.
:
:IMHO pages need to have an identity even when they are not associated
:with files (based on a quick glance, NetBSD's UVM seems to retain
:this property while optimizing the management of anonymous pages.
:I'm not convinced in terms of the choice of data structures for the
:anon maps in UVM, though).

Pages under Linux *DO* have an identity, but you have to look it up in
the meta objects backing the page tables based on the position of the
page in the page table.  They do not implement swap as a paging layer
as we do, but then again our implementation of swap as a paging layer
is a mostly degenerate case in our vm_object layering system, so it
amounts to pretty much the same thing.

I don't think COW pages get swapped multiple times, but I could be
wrong.  My read is that when a Linux process forks, the swap block
associations are shared even for COW pages.  The COWed pages are marked
read-only and split if a write fault occurs.  Unless it's writing the
same shared page from different processes to the same swap block over
and over again, that is.  It shouldn't have to - I was under the
impression that the swap had a bunch of per-swap-block flags to keep
track of the clean/dirty state, so once one process swaps out a page,
the others may scan it but will not redundantly swap it out.

:> reference information.  Instead, it uses the vm_object and pmap modules.
:> I actually like this feature of FreeBSD.  A lot.
:
:Additionally, the way FreeBSD does things has better potential for
:concurrency (even though the locks have been ripped out) compared to
:Linux.

I disagree.  FreeBSD still must hold locks through pmap changes and
those scan all related processes, just as Linux does.  The difference
is that since FreeBSD can delete page tables, it generally winds up
scanning many FEWER processes to change the pmap state for a page than
Linux.  Linux must scan/adjust the pmap state for every process
mmap()ing the page whether or not it is using the page.

:> Linux demarks interrupts from supervisor code much better then we do.
:
:You seem to consider simpler to mean cleaner/better.  Although in this
:case, I'd agree that much of the complexity of FreeBSD is unnecessary.
:

My philosophy is, in general, that (1) one must separate the algorithm
from the implementation and that (2) any algorithm can be cleanly
implemented.  If it isn't, it should be rewritten.  If the programmer
can't reimplement it, either the programmer is unworthy of the
algorithm or the programmer isn't experienced enough to do it right, or
the algorithm is bad.  There is no middle ground in my world view.

					-Matt
					Matthew Dillon
					<dillon@backplane.com>

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message