From owner-freebsd-hackers Tue Jul 13 16: 2:11 1999
Delivered-To: freebsd-hackers@freebsd.org
Received: from mail.netbsd.org (redmail.netbsd.org [155.53.200.193]) by hub.freebsd.org (Postfix) with SMTP id 45356152CC for ; Tue, 13 Jul 1999 16:02:00 -0700 (PDT) (envelope-from cgd@netbsd.org)
Received: (qmail 17606 invoked by uid 1000); 13 Jul 1999 23:01:48 -0000
To: Matthew Dillon
Cc: Jason Thorpe, "Brian F. Feldman", Noriyuki Soda, bright@rush.net, dcs@newsguy.com, freebsd-hackers@FreeBSD.ORG, jon@oaktree.co.uk, tech-userlevel@netbsd.org
Subject: Re: Replacement for grep(1) (part 2)
References: <199907132110.OAA23817@lestat.nas.nasa.gov> <199907132114.OAA80781@apollo.backplane.com> <877lo4z0pe.fsf@redmail.redback.com> <199907132212.PAA81234@apollo.backplane.com>
From: cgd@netbsd.org (Chris G. Demetriou)
Date: 13 Jul 1999 16:01:47 -0700
In-Reply-To: Matthew Dillon's message of Tue, 13 Jul 1999 15:12:14 -0700 (PDT)
Message-ID: <871zecyx0k.fsf@redmail.redback.com>
Lines: 211
X-Mailer: Gnus v5.5/Emacs 20.2
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Matthew Dillon writes:
> The text size of a program is irrelevant, because swap is never
> allocated for it. The data and BSS are only relevant when they
> are modified.
>
> The only thing swap is ever used for is the dynamic allocation of memory.
> There are three ways to do it: sbrk(), mmap(... MAP_ANON), or
> mmap(... MAP_PRIVATE).

yup, almost: not all MAP_PRIVATE mappings need backing store, only
mappings that are both MAP_PRIVATE and writable. (MAP_PRIVATE does
_not_ guarantee that you won't see modifications made via other
MAP_SHARED mappings.)

> Dynamic allocation of memory can occur under a huge number of
> conditions. The actual physical allocation of dynamic memory - what is
> known as a copy-on-write - cannot be predicted from the potential
> allocation of memory. The most obvious example of this is a fork().

yup.

> There is a lot of hidden 'potential' VM that you haven't considered.
> For example, if the resource limit for a process's stack is 8MB, then
> the process can potentially allocate 8MB of stack even though it may
> actually only allocate 32K of stack.

Yes, this is a good example. In general, however, i believe it's safe
to say that it's easier to constrain stack usage than heap usage.
Many large consumers of stack are bad style to begin with ('real'
embedded environments typically have very limited stacks), and
mechanisms such as alloca() can be frobbed to use heap allocations
(with some run-time cost).

> When a process forks, the child
> process can potentially modify just about every single non-text page that
> was owned by the parent process, causing a copy-on-write to occur.
> The dynamic potential can run into the megabytes but most child processes
> only modify a small percentage of those pages.

... and, well-written applications which are just going to execve()
_should_ use vfork() for this case. If they use fork(), they want the
ability to COW the entire address space, and should be charged for
that data.

> :Nowhere did I see what amounts to anything other than hand-waving
> :claims that you'll have to allocate much, much more backing store than
> :you currently need to, and claims that that's unacceptable for general
> :purpose computing environments. If you have a more specific analysis
> :that you'd like me/us to read, please point us at it more specifically.
>
> You are welcome to bring up real-life situations as examples. That
> is what I do. But look who is doing the hand-waving now?

Huh? You've made the claim that non-overcommit is useless and should
not be implemented because, for normal workloads, it requires 8 or
more times the actual backing store usage. You have yet to justify
it. What workload are you talking about? What systems? You start to
allude to this later, but you simply say "SGIs."
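To put a number on the stack example above: under a no-overcommit policy, the most stack a kernel would have to reserve for a process is bounded by its current (soft) stack limit. The sketch below reads that limit with Python's standard `resource` module on a Unix-like system. The helper name `stack_reservation_bytes` is hypothetical, and this is only an illustration of the quantity being argued about, not anything either kernel does internally.

```python
import resource


def stack_reservation_bytes():
    """Return the soft stack limit in bytes, or None if it is unlimited.

    Hypothetical helper: the soft limit is the upper bound on what a
    no-overcommit system would need to reserve for this process's stack.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    if soft == resource.RLIM_INFINITY:
        return None
    return soft


if __name__ == "__main__":
    limit = stack_reservation_bytes()
    if limit is None:
        print("stack limit: unlimited")
    else:
        print(f"stack limit: {limit // 1024} KB")
```

Lowering the soft limit (e.g. with `ulimit -s` in the shell before starting a program) shrinks this bound, which is one way stack reservations stay manageable.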
> :* while you certainly need to allocate more backing store than you
> :would with overcommit, it's _not_ ridiculously more for most
> :applications(+), and, finally,
>
> Based on what? I am basing my information on the VM reservation made
> by programs, and I am demonstrating specific points showing how those
> reservations occur. For example, the default 8MB resource limit for
> the stack segment.

Actually, only now have you brought that up. And, that's very system
dependent. On NetBSD/i386 the default is 2MB, and, it's worth noting
that you only need to reserve as much as the current stack limit
allows (after that, you're going to get a signal anyway; if more
reservations need to be done, they can be done on a page-by-page
basis, and if one fails you deliver a signal, and if it still fails
you deliver a nastier signal).

Stack limits are pretty much the one odd case (and they're already
handled oddly by the VM code.)

> :* even if you are not willing to pay that price, there _are_ people
> :who are quite willing to pay that price to get the benefits that they
> :see (whether it's a matter of perception or not, from their
> :perspective they may as well be real) of such a scheme.
>
> Quite true. In the embedded world we preallocate memory and shape
> the programs to what is available in the system. But if we run out
> of memory we usually panic and reboot - because the code is designed
> to NOT run out of memory and thus running out of memory is a catastrophic
> situation.

There's a whole spectrum of embedded devices, and applications that
run on them. That definition works for some of them, but definitely
not all.

A controller in my car's engine behaves as you describe. That's one
type of embedded system, and can have very well defined memory
requirements. There, if you're out of memory, you have a Problem.

A web server appliance is another type of embedded system. Its memory
requirements are quantifiable, but there's much more parameterization
necessary.
(# of clients being served vs. # of management sessions open vs.
background tasks that may need to be run periodically, for instance.)
Basically, for this type of thing you need various types of memory
reservation and accounting for the various functions, and indeed, it's
not best done (entirely) with a no-overcommit-based or
resource-limit-based strategy. However, for 'background tasks' that
might involve user-supplied code or that might have highly variable
memory requirements, devoting some set of the memory to be managed by
a no-overcommit or resource-limit strategy may be the right thing.
(and i'd probably prefer the latter.)

However, a web browser on the front of your microwave or on a handheld
tablet is also a type of embedded system; it's an 'appliance.' The
user should never have to worry about it rebooting, or hanging, or
killing the program that they're looking at. (It may be reasonable to
tell them that they're doing too much, and that they therefore need to
shut something down, or prevent them from starting up new tasks when
they're too close to a system limit.)

In this type of world, memory needs are just too varied to control via
blanket resource limits. Further, an application's needs may be
sufficiently variable that they can't even be reasonably precomputed.
You're faced with two choices: punt, and don't let the user exploit
their system to its potential, or un-"embed" parts of it and expose
memory allocation limits to the user (and do admission control based
on them), as, e.g., Macintoshes used to do.

In the latter two cases, no-overcommit and the proper "committed
memory" accounting that comes with it is a very useful, perhaps
critical, tool.

(Note that, in a proper design, the difference between a no-overcommit
policy and merely tracking committed memory correctly is a single
boolean switch: you should always be tracking committed memory, and
flipping the switch enables or disables overcommit.)
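The "boolean switch" point can be sketched as a toy model. Everything below is illustrative: `CommitLedger`, its fields, and the page counts are hypothetical names and numbers, not any kernel's actual interface. The ledger always tracks committed pages; the switch only decides whether a reservation that exceeds the backing store is refused up front.

```python
class CommitLedger:
    """Toy model of committed-memory accounting (not a real kernel API)."""

    def __init__(self, backing_store_pages, allow_overcommit):
        self.backing_store_pages = backing_store_pages  # RAM + swap, in pages
        self.allow_overcommit = allow_overcommit        # the boolean switch
        self.committed_pages = 0                        # tracked in both modes

    def reserve(self, pages):
        """Charge `pages` of writable private memory; return True on success."""
        would_exceed = self.committed_pages + pages > self.backing_store_pages
        if would_exceed and not self.allow_overcommit:
            return False  # refuse at reservation time, before any page is touched
        self.committed_pages += pages
        return True

    def release(self, pages):
        """Return previously reserved pages to the pool."""
        self.committed_pages -= min(pages, self.committed_pages)


if __name__ == "__main__":
    for switch in (False, True):
        ledger = CommitLedger(backing_store_pages=1000, allow_overcommit=switch)
        first = ledger.reserve(800)
        second = ledger.reserve(800)
        print(switch, first, second, ledger.committed_pages)
```

With the switch off, the second 800-page reservation is refused and the caller can do admission control; with it on, the same accounting still runs, but the system is now committed to more pages than it can back.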
(From personal experience, I've built 'embedded' systems in the latter
two categories, none in the former. Ones like the latter have
_seriously_ suffered from an inability to disallow overcommit.)

> It is not appropriate to test every single allocation for a failure...
> that results in bulky code and provides no real benefit since the
> code is already designed to not overcommit the embedded system's memory.
> If it does, you reboot. The most critical embedded systems, such as
> those used in satellites, avoid dynamic allocation as much as possible
> in order to guarantee that no memory failures occur. This works under
> UNIX as well. If you run your programs and touch all your potentially
> modifiable pages you have effectively reserved the memory.

Yup. however, even in this case, having to manually touch the pages
is a wasteful detail that the programmer should not have to think
about. resource reservation should take care of it for them. 8-)

(the "wastefulness" there: (1) the programmer has to think about
"effectively reserving" the memory, (2) the actual code to
"effectively reserve" the memory in the application, and (3) any side
effects the system may suffer because of this, e.g. immediate and
unnecessary paging of a whole bunch of data, if it's a
workstation/server system.)

Certainly, it's better for many applications to allocate all the
memory that they'll use in advance. But that's orthogonal to the
issue of resource reservation, except inasmuch as kludges are
necessary to get proper resource reservations. 8-)

> :I would honestly love to know: what do you see huge numbers of
> :reserved pages being reserved for, if they're not actually being
> :committed, by 'average' UNIX applications (for any definition of
> :average that excludes applications which do memory based computation
> :on sparse data).
>
> Stack, hidden VM objects after a multi-way fork, MAP_PRIVATE memory maps,
> private memory pools (e.g. for web servers, java VM's, and so forth).
And of those, for properly written applications (i.e. those that use
vfork() correctly), the _only_ actual differences between the pages
that are considered committed and those which are actually committed
are:

* stack. Yes, this can be a significant, but manageable, source of
  difference.

* writable private memory which hasn't yet been written (e.g. data,
  bss), which, compared to many/most applications' dynamic
  allocations, is insignificant.

* intentional sparse data allocation, which is the reason that any
  general-purpose OS which supports a no-overcommit policy _must_ be
  capable of supporting an overcommit-allowed policy for some
  people's usage.

Note that I've never said that there aren't situations in which
allowing overcommit is correct. In some situations, it's necessary.

However, you've claimed that nobody should _ever_ have the ability to
prevent overcommit, and that's simply unreasonable in certain
situations.

cgd
--
Chris Demetriou - cgd@netbsd.org - http://www.netbsd.org/People/Pages/cgd.html
Disclaimer: Not speaking for NetBSD, just expressing my own opinion.