To: Kris Kennaway
cc: current@FreeBSD.ORG
Subject: Re: VM out-of-swap problems
In-reply-to: Your message of "Fri, 25 Sep 1998 11:26:34 +0930."
Date: Thu, 24 Sep 1998 19:48:41 -0700
Message-ID: <22915.906691721@time.cdrom.com>
From: "Jordan K. Hubbard"

> Is anyone actively working on the problems related to daemons dying after the
> system has used a large amount of swap (the old "inetd() in realloc: junk
> pointer" thing)? It seems to me that with the release just around the corner
> this is something which should be killed now, as it seems to be an easily

It seems that way to more folks than just you, but rest assured that
if it were that simple to get bugs of long-standing evilness
eradicated just before a major release, we'd make it a policy to do so
without a second thought. Unfortunately, this one is just a bit
harder to debug and lots of folks have tried. You're MORE than
welcome to try your hand at it, in fact, since we're definitely well
into the stage where general debugging assistance on this specific
problem is being eagerly solicited. It certainly wouldn't hurt.

> If no-one has the time to look at this in their spare time, may I suggest to
> Jordan that he look at paying one of the VM experts (John Dyson?) to fix this
> problem before 3.0 rolls out the door?

I've also learned the hard way (and I have the bruises to show for it)
that throwing money at problems only occasionally gets them fixed, my
former belief being that money (in sufficient quantities) would always
provide a solution somehow. Silly me. :-(

I honestly don't currently know of any VM experts for hire (JD
included) who would have any reasonable assurance of fixing this
problem, whether they were paid to do so or not, in the time-frame we
have for the release. That's a bummer, obviously, and I surely wish
it were otherwise, but I have to live within the constraints we have.
Multiple folks have hunted for this one, believe me, and if I gave
them $$$ they'd probably just say "thanks for the $$$, but I still
can't find it."

What it really needs (I feel) is a fresh set of eyes on it by someone
who's also observing the phenomenon repeatedly enough to be able to
collect some truly decent stats (malloc and otherwise) on it. AFAIK,
no one has even done that much yet - they just report the oft-quoted
message from malloc but don't have any information about the general
program control flow which led up to this happening, and the "VM
experts" tend to be busy enough that they're not going to be able to
make much progress with nothing more than a "junk pointer" printf().
This problem needs to be instrumented like a lab experiment and far
more data gathered first.

> I am quite willing to run debugging patches if someone provides them to track

I don't see anyone raising their hands to do the instrumentation
("debugging patches"), myself, but given that the very process of
adding instrumentation is a good part of the debugging exercise in any
case, I'd tend to suggest that whoever collects the traces should
also be the one to decide where they go and what they should print.

- Jordan
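
For illustration only, here is a minimal sketch of the kind of
instrumentation the message above asks for: recording the allocation
calls that lead up to a bad pointer, instead of reporting only the
final "junk pointer" line. It is not FreeBSD's malloc and not a patch
anyone on the list wrote. The names traced_malloc(), traced_realloc(),
traced_free(), the table sizes, and the caller-supplied "where" tag
are all hypothetical, and the code assumes a single-threaded program.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define TRACE_LEN 64   /* recent events kept; arbitrary choice */
#define LIVE_MAX 1024  /* live pointers tracked; arbitrary choice */

struct trace_event {
    unsigned long seq;   /* position in the overall call sequence */
    const char *op;      /* "malloc", "realloc" or "free" */
    uintptr_t old_addr;  /* address passed in, 0 if none */
    uintptr_t new_addr;  /* address returned, 0 if none */
    size_t size;
    const char *where;   /* hypothetical caller tag, e.g. __func__ */
};

static struct trace_event trace_ring[TRACE_LEN];
static unsigned long trace_seq;   /* total events recorded so far */
static void *live[LIVE_MAX];      /* pointers handed out, not yet freed */
static unsigned long junk_count;  /* unknown pointers seen so far */

static void record(const char *op, uintptr_t old_addr, uintptr_t new_addr,
                   size_t size, const char *where)
{
    assert(op != NULL && where != NULL);
    struct trace_event *e = &trace_ring[trace_seq % TRACE_LEN];
    e->seq = trace_seq++;
    e->op = op;
    e->old_addr = old_addr;
    e->new_addr = new_addr;
    e->size = size;
    e->where = where;
}

/* Print the retained events, oldest first. */
static void trace_dump(FILE *out)
{
    unsigned long n = trace_seq < TRACE_LEN ? trace_seq : TRACE_LEN;
    for (unsigned long i = trace_seq - n; i < trace_seq; i++) {
        const struct trace_event *e = &trace_ring[i % TRACE_LEN];
        fprintf(out, "#%lu %s old=0x%" PRIxPTR " new=0x%" PRIxPTR
                " size=%zu at %s\n", e->seq, e->op, e->old_addr,
                e->new_addr, e->size, e->where);
    }
}

/* Return the slot holding p, or -1. live_find(NULL) finds a free slot. */
static int live_find(const void *p)
{
    for (int i = 0; i < LIVE_MAX; i++)
        if (live[i] == p)
            return i;
    return -1;
}

static void report_junk(const char *op, void *p, const char *where)
{
    junk_count++;
    fprintf(stderr, "trace: %s of unknown pointer %p at %s; history:\n",
            op, p, where);
    trace_dump(stderr);
}

void *traced_malloc(size_t size, const char *where)
{
    void *p = malloc(size);
    record("malloc", 0, (uintptr_t)p, size, where);
    int slot = (p != NULL) ? live_find(NULL) : -1;
    if (slot >= 0)
        live[slot] = p;  /* if the table is full, p goes untracked */
    return p;
}

/* Returns 0 on success, -1 if p was never handed out (or already freed). */
int traced_free(void *p, const char *where)
{
    if (p == NULL)
        return 0;  /* free(NULL) is a no-op */
    int slot = live_find(p);
    record("free", (uintptr_t)p, 0, 0, where);
    if (slot < 0) {
        report_junk("free", p, where);
        return -1;
    }
    live[slot] = NULL;
    free(p);
    return 0;
}

/* Returns NULL, after dumping the history, when old is not a live pointer. */
void *traced_realloc(void *old, size_t size, const char *where)
{
    if (old == NULL)
        return traced_malloc(size, where);
    int slot = live_find(old);
    uintptr_t old_addr = (uintptr_t)old;  /* saved before realloc() runs */
    if (slot < 0) {
        record("realloc", old_addr, 0, size, where);
        report_junk("realloc", old, where);
        return NULL;
    }
    void *p = realloc(old, size);
    record("realloc", old_addr, (uintptr_t)p, size, where);
    if (p != NULL)
        live[slot] = p;
    return p;
}
```

A daemon built against wrappers like these would call, say,
traced_realloc(buf, n, __func__) in place of realloc(buf, n). When an
unknown pointer shows up, the last TRACE_LEN calls are written to
stderr, which gives some of the control-flow context that a bare
"junk pointer" message lacks. A pointer that was allocated while the
live table was full would be reported falsely, so LIVE_MAX would need
sizing for the program under test.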