From owner-freebsd-hackers Wed Jul 14 13:45:52 1999
Delivered-To: freebsd-hackers@freebsd.org
Received: from aussie.cs.mu.OZ.AU (dhcp2966.ietf.uninett.no [128.39.29.66])
    by hub.freebsd.org (Postfix) with ESMTP id ECFF41541C
    for ; Wed, 14 Jul 1999 13:45:47 -0700 (PDT)
    (envelope-from kre@cs.mu.OZ.AU)
Received: from cs.mu.OZ.AU (localhost [127.0.0.1])
    by aussie.cs.mu.OZ.AU (8.8.8/8.8.8) with ESMTP id WAA20353;
    Wed, 14 Jul 1999 22:35:52 +0200 (CEST)
From: Robert Elz
To: "Daniel C. Sobral"
Cc: freebsd-hackers@FreeBSD.ORG, tech-userlevel@netbsd.org
Subject: Re: Replacement for grep(1) (part 2)
In-reply-to: Your message of "Thu, 15 Jul 1999 00:53:17 +0900." <378CB26D.C0BC0DBE@newsguy.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Wed, 14 Jul 1999 22:35:51 +0200
Message-ID: <20351.931984551@cs.mu.OZ.AU>
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

    Date:        Thu, 15 Jul 1999 00:53:17 +0900
    From:        "Daniel C. Sobral"
    Message-ID:  <378CB26D.C0BC0DBE@newsguy.com>

  | Would you care to name such systems?

munnari was one (the system of the From: header, even though this mail
isn't actually going anywhere near it). I will describe it a bit lower down.

  | And, btw, a system consuming
  | all memory is *not* necessarily approaching paging death.

No, of course not, though I didn't say all memory, I said all VM. And
while it is possible to have all VM consumed, and no paging activity at all,
that would tend to indicate insufficient VM allocated (reaching an artificial
barrier).

  | More
  | likely, it is just storing a lot of data in the swap which will
  | never be used (which is the whole point of overcommit in first
  | place), and, thus, never paged in.

The systems I describe were not using overcommit. Further, I wouldn't
imagine that a system storing anything to swap would be overcommitting: as I
understand the term, overcommit only relates to allocating VM resources
which aren't backed by anything physical at all ("here's all this address
space you can play in if you like, but you had better not actually do that,
because if you do it won't work"), either applied to one process, as that
wording suggests, or aggregated over the whole system. If a process was (for
some stupid reason) loading a whole bunch of data into the swap space, that
would be committed VM, and you have to have the resources to cope with it.

Now to munnari. It no longer runs quite like this, but munnari is an Alpha
with 128MB, running Digital UNIX (not in overcommit mode; either is possible
there). At the time of which I speak it ran two principal applications of
note: innd, with a VM footprint of about 100MB, and named, with a memory
footprint (at the time) of about 90MB. (As it is now, it no longer runs
innd, but its named has grown to > 120MB.) It also ran a bunch of small
stuff: sendmail (typically 1 or 2 instances, around 3MB each), ftpd
(smaller; most often 0 or 1 instances, sometimes 3 or 4) and the occasional
shell (a few hundreds of MB), plus init, getty, cron, syslog and all that
associated noise with memory requirements approaching 0.

That's fine. Well, not really fine: innd and named would fight each other
all day over who had how much of the real memory and who was relegated to
swap, of which there was enough for all this to fit, but not a lot more than
that (enough for one of them to fork when it needed to, that's all - not
both at once, and yes, overcommit would have allowed both at once, but that
was not an aim).

Then, because it was running innd, it was also running the perl script that
summarises the log file, and that could grow to 30MB, maybe more.
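
As a rough check on the figures given so far, the committed footprints can be totalled against the 128MB of real memory. This is only an illustrative sketch in Python: the dictionary and function names are made up for the example, the sendmail entry assumes the upper end of "typically 1 or 2" instances, and ftpd, the shells and the small daemons are left out because no exact sizes are given for them.

```python
# Footprints in MB, taken from the description above.
# Assumption: two sendmail instances at about 3MB each.
RAM_MB = 128

committed_mb = {
    "innd": 100,
    "named": 90,
    "sendmail": 2 * 3,
    "perl log summary": 30,
}


def total_committed(footprints):
    """Sum of all committed VM, in MB."""
    return sum(footprints.values())


def shortfall(footprints, ram_mb):
    """MB that cannot fit in real memory and so must be backed by swap."""
    return max(0, total_committed(footprints) - ram_mb)


print(total_committed(committed_mb))    # 226
print(shortfall(committed_mb, RAM_MB))  # 98
```

On these numbers about 98MB has to live in swap even before innd or named forks, which is consistent with the two of them competing for real memory all day.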
And because it was running sendmail, every now and then you got the typical
sendmail huge queue syndrome (at least for old sendmails, which this was),
where you get a dead site, a large queue of processes, and a bunch of
sendmails running the queue, spending most of their time hung on connection
attempts that aren't working, and gradually growing bigger (maybe 8 or 10
processes at 15MB each).

Somewhere amongst all of this swap would run out, and a good thing too, as
by this time the system really would be paging itself to oblivion.

Note that all this (large) VM I have described was filled with real data
(except for the odd times when innd or named had just forked); none of it
could be overcommitted and just ignored. Whatever policy was in place, the
physical VM resources would have run out.

Now let's look at what happens with the two methods.

With all VM backed by real mem or swap space, processes go about allocating
memory - when there is no more left, the allocations start failing. If the
process is perl, it just collapses in a heap, and the log file summary
doesn't get made that day. So sad... If it's sendmail, it issues "OS error,
temporary failure" type responses, saves its queue files, and exits. A later
sendmail will deliver those messages, no harm. If it's a shell, who knows (I
forget what the shells do, I think most just keep trying, at least if
interactive), but they consume mem at such a slow rate it doesn't matter -
fork() would typically fail though, so no new processes could get started.
innd would just pause, and wait till a bit later when mem might be available
again (those perls and sendmails all gone away). named just the same (at
least the named munnari ran). They're the two processes munnari was supposed
to be running - those two don't just die.

Now, with overcommit mode, we get an extra 30 seconds of life, because no
doubt there are a few pages floating around that have been allocated to some
process, but that nothing has bothered to write into yet.
An extra 30 seconds if we're lucky (except if we followed the advice given
here earlier which would indicate that only 1/8 the amount of swap space
would be needed, in which case these processes would never have gotten
started in the first place).

After that short grace period, during which the kernel has been happily
answering requests for more VM with "sure, have as much as you like",
something needs an extra page of real storage, there is none available, and
we either deadlock, or die. The approach suggested here seems to be to find
the biggest process (which here would be innd or named) and kill -9 it.

No thanks. Not an acceptable answer. Sure it would get lots of VM back
again, but the system would no longer be doing what it was supposed to be
doing.

Adding more swap space would be easy, but the wrong thing to do: that would
just have allowed the system to page itself to death, thrashing into
eternity - having processes go away is the only solution to this kind of
problem. Except it needs to be the right processes, and "right" does not
equal "big", nor any other criterion the kernel could possibly figure out
for itself.

kre
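
The difference between the "biggest" process and the "right" process can be shown with a toy model of the workload described above. This is a hypothetical sketch, not any kernel's actual selection code: the `essential` flag and both function names are invented for illustration, and the sizes are the approximate figures quoted in the message.

```python
# Snapshot of the workload from the message, sizes in MB.
# "essential" records what the machine was meant to be running; it is an
# assumption of this example, and a kernel has no such field to consult.
processes = [
    {"name": "innd", "mb": 100, "essential": True},
    {"name": "named", "mb": 90, "essential": True},
    {"name": "perl", "mb": 30, "essential": False},
    {"name": "sendmail", "mb": 15, "essential": False},
]


def biggest_first_victim(procs):
    """The policy criticised above: kill whichever process is largest."""
    return max(procs, key=lambda p: p["mb"])["name"]


def expendable_victims(procs):
    """The processes whose loss the message treats as harmless."""
    return [p["name"] for p in procs if not p["essential"]]


print(biggest_first_victim(processes))  # innd
print(expendable_victims(processes))    # ['perl', 'sendmail']
```

Size alone picks innd, one of the two processes the machine existed to run, while the processes that can safely go away are the smaller ones.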