Date: 24 Jan 1999 11:26:14 -0000
From: Ville-Pertti Keinonen <will@iki.fi>
To: dillon@apollo.backplane.com
Cc: hackers@FreeBSD.ORG
Subject: Re: Review and report of linux kernel VM
Message-ID: <19990124112614.2132.qmail@ns.oeno.com>
In-Reply-To: <199901222036.MAA56617@apollo.backplane.com> (message from Matthew Dillon on Fri, 22 Jan 1999 12:36:57 -0800 (PST))

> The Linux VM system implements all the core features that the
> FreeBSD VM system implements, just not as efficiently. Its
> use of a page table paradigm to do VM-specific object layering
> is really not that bad of an idea. It *does* lock them into a more
> rigid scheme ( for example, the linux scheme starts to break down
> when you share huge objects between processes ), but so far they've been
> able to implement the same core feature set that we have in our VM system.
> Thus, it is not possible to argue that their system is inferior from an
> algorithmic standpoint, only from an implementation standpoint and a
> flexibility standpoint.

I don't think that any working implementation is inherently superior
or inferior; it all depends on what you consider important. Apparently
the Linux folks consider it important to avoid adding extra data
structures that appear expensive and/or redundant, at the cost of
requiring the code to do more work or having less scalable algorithms.
(Given that you have page tables anyhow, the idea of further objects to
hold pages probably provokes a gut reaction of "that's wasteful!" in
most programmers, if no further thought is given to what advantages
there could be.)

> We can hardly be proud of our VFS/BIO layering which has been so buggy
> these last few years. The types of bugs I'm finding in FreeBSD have

VFS/BIO are different from VM, although I'd agree with the recent
suggestions that perhaps they shouldn't be.

> When I say clean, I mean 'readable, obvious, and functionally layered'.
> I had no trouble following the linux code even going deep into the paging
> and VFS subsystems. Following FreeBSD code has been like pulling nails.

Sure, it's easy to understand; it's simple. But I tend to think that a
lot of things are done in the wrong place (in terms of layering), data
is accessed in inconsistent ways, etc., which is why I wouldn't call it
"clean". Finding the places where things are actually done is also
often difficult.

> It's why we are *still* finding bugs in our VM system, after years of work.
> FreeBSD's VM system is definitely more flexible and efficient. Given the
> choice, I would much rather keep FreeBSD's VM system. That flexibility
> has come at the cost of dirtying up the code considerably, though. What
> use is flexibility if every new feature brings half a dozen bugs to light
> and creates half a dozen more of its own?

I find the FreeBSD VM (and other) code readable and moderately well
organized (easier to find specific things than in Linux). It isn't
hard to understand the code, but it can be hard to understand the system
as a whole as long as you don't understand the relationships between
vm_objects (or even what a vm_object represents), which are not exactly
obvious.

In terms of bugs, Linux is probably better off because many of the core
subsystems are maintained by their original authors. And for simpler
algorithms, it's easier for someone to take over a subsystem quickly.

Even if you do understand how something works, if you didn't write it
(or wrote it a sufficiently long time ago) or haven't studied it
thoroughly, it's difficult to keep all of the possible implications of
a modification in mind.

> Pages under linux *DO* have an identity, but you have to look it up
> in the meta objects backing the page tables based on the position of the
> page in the page table. They do not implement swap as a paging layer as

They do have an identity, but not a unique one (until a swap allocation
becomes the identity), and as far as I can tell, given a page, you
can't find *any* mappings without a brute-force search.

> I don't think COW pages get swapped multiple times, but I could be wrong.
> My read is that when a linux process forks, the swap block associates are
> shared even for COW pages. The COWed pages are marked read-only and

Yes, if it was swapped at fork-time. If not, it takes several scans of
different processes to get it out of memory.
It doesn't take several actual writes because of the swap cache, so
it's not quite as bad as it could be.

> split if a write fault occurs. Unless it's writing the same shared
> page from different processes to the same swap block over and over again,
> that is. It shouldn't have to - I was under the impression that the
> swap had a bunch of per-swap-block flags to keep track of the clean/dirty
> state, so once one process swaps out a page, the others may scan it but
> will not redundantly swap it out.

I believe that it has only a reference count to allow the shared state
to be paged in with the page. Contrary to what I thought at first, the
swap cache permanently maps relationships between physical pages and
swap blocks, so no extra copying is done.

> :Additionally, the way FreeBSD does things has better potential for
> :concurrency (even though the locks have been ripped out) compared to
> :Linux.
> I disagree. FreeBSD still must hold locks through pmap changes and those
> scan all related processes, just as linux does.

The difference is that FreeBSD could, for example, potentially service
multiple page faults on a vm_map simultaneously (exclusive locks are
required for some of the lower-level layers such as vm_objects and the
pmap, but those only require short-term locks). In Mach, the VM system
actually did do this.

> since FreeBSD can delete page tables, it generally winds up scanning many
> FEWER processes to change the pmap state for a page than linux. Linux
> must scan/adjust the pmap state for every process mmap()ing the page
> whether or not it is using the page.

In FreeBSD, in order to scan the pmaps of different processes mapping a
page, you should only need to lock pmaps, which are a low-level layer
and are (or should be) only locked for short periods of time.
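
To make the point about per-page scans and short-term pmap locks
concrete, here is a toy user-space model in C. It is only a sketch
under stated assumptions: every name in it (toy_pmap, toy_rmap,
toy_page, toy_enter, toy_remove_all) is made up for illustration and
is not taken from the FreeBSD or Linux sources. Each page keeps a list
of the places where it is mapped, so removing all of its mappings
touches only the pmaps that actually map it, and each pmap lock is
held for a single table update.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_SLOTS 8

/* Hypothetical per-process mapping table (a stand-in for a pmap). */
struct toy_pmap {
    pthread_mutex_t lock;         /* held only around one slot update */
    const void *slot[MAX_SLOTS];  /* slot[i] = page mapped there, or NULL */
};

/* One reverse-mapping record: "this page is mapped in pm at idx". */
struct toy_rmap {
    struct toy_pmap *pm;
    size_t idx;
    struct toy_rmap *next;
};

/* A physical page that knows which pmaps map it. */
struct toy_page {
    struct toy_rmap *mappings;
};

void toy_pmap_init(struct toy_pmap *pm)
{
    pthread_mutex_init(&pm->lock, NULL);
    for (size_t i = 0; i < MAX_SLOTS; i++)
        pm->slot[i] = NULL;
}

/* Map pg at idx in pm and record it; the caller supplies rec's storage. */
void toy_enter(struct toy_pmap *pm, size_t idx,
               struct toy_page *pg, struct toy_rmap *rec)
{
    assert(idx < MAX_SLOTS);
    pthread_mutex_lock(&pm->lock);
    pm->slot[idx] = pg;
    pthread_mutex_unlock(&pm->lock);

    /* Assumption: the page's own list would need a lock too; omitted. */
    rec->pm = pm;
    rec->idx = idx;
    rec->next = pg->mappings;
    pg->mappings = rec;
}

/* Remove every mapping of pg; returns the number of pmaps touched. */
size_t toy_remove_all(struct toy_page *pg)
{
    size_t touched = 0;

    for (struct toy_rmap *r = pg->mappings; r != NULL; r = r->next) {
        pthread_mutex_lock(&r->pm->lock);    /* short-term lock */
        r->pm->slot[r->idx] = NULL;
        pthread_mutex_unlock(&r->pm->lock);
        touched++;
    }
    pg->mappings = NULL;
    return touched;
}
```

Without the per-page list, the only way to do the same job in this
model would be to walk every slot of every toy_pmap, which is the
brute-force search mentioned earlier.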

Of course my concerns seem weird in the context of FreeBSD/Linux
because they include multithreading, fine-grained locking, kernel-mode
pre-emption and real-time properties, much of which neither system is
likely to implement properly in the near future.

> My philosophy is, in general, that (1) one must separate the algorithm
> from the implementation and that (2) any algorithm can be cleanly
> implemented. If it isn't, it should be rewritten. If the programmer

I agree; I just don't agree with your interpretation of what's
"clean". And seeing linux called "clean" just seemed so completely
opposite to how I see it that I couldn't not comment...

The FreeBSD VM code isn't totally clean, either, but I would certainly
not say that it is less clean than linux.