Date: Sat, 23 Jan 1999 00:15:13 +0000 (GMT)
From: Terry Lambert <tlambert@primenet.com>
To: dyson@iquest.net
Cc: tlambert@primenet.com, dillon@apollo.backplane.com, hackers@FreeBSD.ORG
Subject: Re: Error in vm_fault change
Message-ID: <199901230015.RAA13495@usr09.primenet.com>
In-Reply-To: <199901222353.SAA36870@y.dyson.net> from "John S. Dyson" at Jan 22, 99 06:53:18 pm
> Actually, the RSS code has been in the kernel for about 3yrs now, and
> is well understood. If it was being written from scratch, I would be
> more likely to agree with you. The kernel RSS limiting code works mostly
> for private data in the process.

Nevertheless, an RSS based implementation is still going to fail to address the major issue for a public server platform.

Matt's objections are grounded in the idea that one or two users can stage a DoS attack against other users of the system, and, in general, if you allow semi-literate people to run code from the net, this *will* happen, even without evil intent.

> > What I suggest is that vnodes with more than a certain number of
> > pages associated with them be forced to steal pages from their
> > own usage, instead of obtaining them from the system page pool.
>
> Vnodes aren't the only structure that contains data -- maybe you
> mean vm objects also. In fact, vnode or "shared" data isn't usually
> the problem with memory usage. However a vnode quota is probably
> a good idea also.

I think that the general case of the quota is on the object_t, but the enforcement of the quota is in the page case.

One of the obvious reasons for wanting a VM object alias is to allow direct read-only or COW mapping of pages already in core as part of an MFS object backed by core pages and/or swap.

I don't argue the (de)merits (IMO) of trying to solve what I think is the wrong problem. But I will say that the enforcement of the quota on all objects is highly problematic, and that there is already a very nice chokepoint presenting itself for our (ab)use, and that is the vnode pager.

I think it would be very hard to implement general VM object quota enforcement, and I think that it's the wrong thing to do in any case (except maybe as a means of global policy enforcement).
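
To make the "steal pages from their own usage" idea concrete, here is a minimal user-space sketch in C. It is not kernel code and not what the vnode pager does today. Every name in it (struct object, struct page, object_page_alloc(), OBJ_PAGE_QUOTA, the static pool) is hypothetical and only stands in for the real vm_object/vm_page machinery. It assumes a fixed per-object quota and a per-object LRU: below the quota an object takes pages from the shared pool, and at the quota it recycles its own oldest page, writing it back first if it is dirty.

```c
#include <assert.h>
#include <stddef.h>

#define OBJ_PAGE_QUOTA 4    /* hypothetical per-object resident page limit */
#define POOL_PAGES     16   /* hypothetical size of the shared page pool */

/* Simplified stand-in for a physical page; NOT the kernel's vm_page. */
struct page {
    struct page  *lru_next;  /* next newer page on the owner's LRU */
    int           in_use;    /* taken from the shared pool */
    int           dirty;     /* would need write-back before reuse */
    unsigned long offset;    /* offset within the owning object */
};

/* Simplified stand-in for a vnode-backed object; NOT the kernel's vm_object. */
struct object {
    struct page *lru_head;    /* oldest resident page */
    struct page *lru_tail;    /* newest resident page */
    int          resident;    /* pages currently held */
    int          self_steals; /* times the object recycled its own page */
    int          writebacks;  /* dirty pages flushed by self-stealing */
};

static struct page pool[POOL_PAGES];

/* Take a free page from the shared pool, or NULL if the pool is empty. */
static struct page *pool_alloc(void)
{
    for (size_t i = 0; i < POOL_PAGES; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            return &pool[i];
        }
    }
    return NULL;
}

/* Append a page at the "newest" end of the object's LRU list. */
static void lru_append(struct object *obj, struct page *pg)
{
    pg->lru_next = NULL;
    if (obj->lru_tail != NULL)
        obj->lru_tail->lru_next = pg;
    else
        obj->lru_head = pg;
    obj->lru_tail = pg;
}

/* Detach and return the object's oldest page, or NULL if it has none. */
static struct page *lru_pop_oldest(struct object *obj)
{
    struct page *pg = obj->lru_head;

    if (pg == NULL)
        return NULL;
    obj->lru_head = pg->lru_next;
    if (obj->lru_head == NULL)
        obj->lru_tail = NULL;
    pg->lru_next = NULL;
    return pg;
}

/*
 * Get a page for `offset` in `obj`.  Under quota, the page comes from the
 * shared pool.  At quota, the object gives up its own oldest page instead,
 * so a thrashing object only ever evicts itself.
 */
struct page *object_page_alloc(struct object *obj, unsigned long offset)
{
    struct page *pg;

    if (obj->resident >= OBJ_PAGE_QUOTA) {
        pg = lru_pop_oldest(obj);
        assert(pg != NULL);       /* quota > 0, so a page must exist */
        if (pg->dirty) {
            obj->writebacks++;    /* stand-in for the real write-back */
            pg->dirty = 0;
        }
        obj->self_steals++;
    } else {
        pg = pool_alloc();
        if (pg == NULL)
            return NULL;          /* pool exhausted; caller must cope */
        obj->resident++;
    }
    pg->offset = offset;
    pg->dirty = 0;
    lru_append(obj, pg);
    return pg;
}
```

With these assumed values, an object that touches ten distinct offsets ends up holding four pages and has recycled its own pages six times, while the other twelve pool pages stay available to everyone else. A real implementation would also need locking, and a way to exempt the well behaved heavy users discussed in this thread.
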
I also think that general VM object quota enforcement moves away from the idea of implementing some form of per process working set quota, and it disallows special cases for "well behaved code that nevertheless needs a lot of pages, just as if it were really badly behaved code".

> > In general, when we talk about badly behaved processes, we are
> > talking about processes with large working sets that are directly
> > mapped to vnode backing objects.
>
> Not necessarily, think the new versions of GNU C++ :-).

I'd argue that creating a lot of dirty data pages that are being used is not a bad behaviour; maybe it's bad compiler architecture, but that's another issue.

The specific bad behaviour that I'd like to enforce against is file I/O based, and has to do with intentional thrashing by a process, either because it's out of control (maybe the idiot OS doesn't propagate SIGHUP to groups, like all other OS's or something), or because it's badly designed in such a way that it thrashes a file.

Basically, this would mean that if the page at the end of the LRU that's going to be forced out is dirty, so be it. It will increase swapping for that process, but reduce swapping overall by twice as much, in that it won't force pages that another process is about to use out of core.

Think of paging out as positive caching and avoiding it as negative caching. Negative caching is 2 times better than positive caching, for most applications where the hash space doesn't bloat out of control. In the vm_object_t case, the hash space is (relatively) fixed, so it's a major win.

> > This solution was tried, and worked very well, in a UnixWare 2.0
> > kernel
>
> No UnixWare kernel VM ever worked very well, did it?

Actually, Steve Baumel, the architect of the SVR4.2 (UnixWare 2.x) SMP aware VM system did one hell of a job. He addressed most of the real issues, including the ability to autogrow thread stacks, very early on in the game. It's very much a shame that other groups within USL failed to utilize his code.
It's also a shame that their participation on standards committees was such that we ended up with standards where the stack is passed to the thread at creation time, instead of the creation interface being responsible for the stack creation.

Basically, don't blame Steve's design for the bad implementation of various parts of SVR4.2.

It's a matter of source tree organization that prevented the working set quota going into the SVR4.2 source tree. Just as the FreeBSD source tree is rather badly organized for separating architectural pieces, the USL source tree was nearly impossible to get cross-subsystem changes integrated into. Each compilable development source tree was actually a combination of three separate source repositories (one for the kernel pieces from third parties, one for the system independent code, and one for the architecture specific code), and control of the repositories was decentralized, as was control of the interfaces.

Organizationally, I'd say SVR4.2 was damn lucky to even have a mechanism like user defined scheduling classes that could be abused to address the problem at all.

Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message