Date:      Thu, 24 Sep 1998 19:48:41 -0700
From:      "Jordan K. Hubbard" <jkh@time.cdrom.com>
To:        Kris Kennaway <kkennawa@physics.adelaide.edu.au>
Cc:        current@FreeBSD.ORG
Subject:   Re: VM out-of-swap problems 
Message-ID:  <22915.906691721@time.cdrom.com>
In-Reply-To: Your message of "Fri, 25 Sep 1998 11:26:34 +0930." <Pine.OSF.4.03.9809251119030.290-100000@mercury.physics.adelaide.edu.au>

> Is anyone actively working on the problems related to daemons dying after the
> system has used a large amount of swap (the old "inetd() in realloc: junk
> pointer" thing)? It seems to me that with the release just around the corner
> this is something which should be killed now, as it seems to be an easily

It seems that way to more folks than just you, but rest assured that
if it were that simple to get bugs of long-standing evilness
eradicated just before a major release, we'd make it a policy to do so
without a second thought.  Unfortunately, this one is just a bit
harder to debug and lots of folks have tried.  You're MORE than
welcome to try your hand at it, in fact, since we're definitely well
into the stage where general debugging assistance on this specific
problem is being eagerly solicited.  It certainly wouldn't hurt.

> If no-one has the time to look at this in their spare time, may I suggest to
> Jordan that he look at paying one of the VM experts (John Dyson?) to fix this
> problem before 3.0 rolls out the door?

I've also learned the hard way (and I have the bruises to show for it)
that throwing money at problems only occasionally gets them fixed, my
former belief being that money (in sufficient quantities) would always
provide a solution somehow.  Silly me. :-(

I honestly don't currently know of any VM experts for hire (JD
included) who would have any reasonable assurance of fixing this
problem, whether they were paid to do so or not, in the time-frame we
have for the release.  That's a bummer, obviously, and I surely wish
it were otherwise, but I have to live within the constraints we have.

Multiple folks have hunted for this one, believe me, and if I gave
them $$$ they'd probably just say "thanks for the $$$, but I still
can't find it."  What it really needs (I feel) is a fresh set of eyes
on it by someone who's also observing the phenomenon repeatedly enough
to be able to collect some truly decent stats (malloc and otherwise)
on it.  AFAIK, no one has even done that much yet - they just report
the oft-quoted message from malloc but don't have any information
about the general program control flow which led up to this happening,
and the "VM experts" tend to be busy enough that they're not going to
be able to make much progress with nothing more than a "junk pointer"
printf().  This problem needs to be instrumented like a lab experiment
and far more data gathered first.
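
As a rough illustration of the kind of instrumentation meant here, the
following is a minimal sketch in C, not a patch for the actual bug.  It
assumes a hypothetical wrapper (traced_realloc, with the TRACED_REALLOC
macro, trace_buf, and TRACE_SLOTS all being made-up names) that a daemon
would call in place of realloc(), keeping the most recent calls and their
call sites in a small ring buffer that can be dumped when the "junk
pointer" warning shows up.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* All names below are hypothetical; nothing here comes from libc itself. */
#define TRACE_SLOTS 64

struct trace_rec {
    const char *file;     /* source file of the call site */
    int         line;     /* source line of the call site */
    uintptr_t   old_ptr;  /* pointer value passed to realloc() */
    uintptr_t   new_ptr;  /* pointer value realloc() returned */
    size_t      size;     /* requested size in bytes */
};

static struct trace_rec trace_buf[TRACE_SLOTS];
static unsigned long trace_count;   /* total number of calls recorded */

/* Record the call in the ring buffer, then forward to the real realloc(). */
static void *
traced_realloc(void *ptr, size_t size, const char *file, int line)
{
    struct trace_rec *r = &trace_buf[trace_count % TRACE_SLOTS];
    void *res;

    r->file = file;
    r->line = line;
    r->old_ptr = (uintptr_t)ptr;   /* saved before the pointer may be freed */
    r->size = size;
    res = realloc(ptr, size);
    r->new_ptr = (uintptr_t)res;
    trace_count++;
    return res;
}

/* Print the recorded calls, oldest first (at most TRACE_SLOTS of them). */
static void
trace_dump(FILE *fp)
{
    unsigned long n, i;

    assert(fp != NULL);
    n = trace_count < TRACE_SLOTS ? trace_count : TRACE_SLOTS;
    for (i = trace_count - n; i < trace_count; i++) {
        const struct trace_rec *r = &trace_buf[i % TRACE_SLOTS];

        fprintf(fp, "%s:%d realloc(0x%" PRIxPTR ", %zu) = 0x%" PRIxPTR "\n",
            r->file, r->line, r->old_ptr, r->size, r->new_ptr);
    }
}

/* Call-site macro so each record carries the caller's file and line. */
#define TRACED_REALLOC(p, n) traced_realloc((p), (n), __FILE__, __LINE__)
```

Something like trace_dump(stderr) would then be called from wherever the
program first notices trouble.  A fixed-size ring buffer is used so the
tracing itself never allocates memory, which matters when the allocator
is the thing under suspicion.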

> I am quite willing to run debugging patches if someone provides them to track

I don't see anyone raising their hands to do the instrumentation
("debugging patches"), myself, but given that the very process of
adding instrumentation is a good part of the debugging exercise in any
case, I'd tend to suggest that whoever collects the traces should
also be the one to decide where they go and what they should print.

- Jordan



