Date:      Tue, 4 Dec 2001 18:52:34 -0600
From:      Alfred Perlstein <bright@mu.org>
To:        Bosko Milekic <bmilekic@technokratis.com>
Cc:        Luigi Rizzo <luigi@FreeBSD.org>, cvs-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject:   Re: cvs commit: src/sys/kern uipc_mbuf.c
Message-ID:  <20011204185234.C92148@elvis.mu.org>
In-Reply-To: <20011204193149.A12474@technokratis.com>; from bmilekic@technokratis.com on Tue, Dec 04, 2001 at 07:31:49PM -0500
References:  <20011203184737.D48755@iguana.aciri.org> <20011203222303.A2690@technokratis.com> <20011203205623.B49974@iguana.aciri.org> <20011204090509.A8591@technokratis.com> <20011204083922.B54383@iguana.aciri.org> <3C0D14AC.412D674C@dsuper.net> <20011204123627.K92148@elvis.mu.org> <20011204180633.A11305@technokratis.com> <20011204175837.X92148@elvis.mu.org> <20011204193149.A12474@technokratis.com>

* Bosko Milekic <bmilekic@technokratis.com> [011204 18:31] wrote:
> 
>   That's ridiculous.
> 
>   My post had nothing to do with that, or the
> usage of __FILE__ or whatever other GCCism.

I never said it did. The only thing I noticed was that you had
brought up people blaming the mbuf subsystem for what may have
been a driver failure/error/leak.

> What I *WAS* suggesting was that if we ARE going to rate-limit
> "warning" printf()s that we should do it at the RIGHT places.
> subr_mbuf.c is not the right place because the only CAUSE of failure
> that can be FIXED with regards to the mbuf subsystem is the increase of
> NMBCLUSTERS and, as I've explained so many times before, this case is
> already accounted for in vm/vm_kern.c - so that leaves us with several
> other possibilities which I suggested covering in appropriate places, if
> we were really bent on that, such as the malloc() code, or whatever.

Ah, I see. That is pretty late in the game, though; perhaps we
could print a single message when utilization hits something like 80%.

> Again, I never suggested or implied that we should be inventing ways to
> determine the cause of the failure from within the mbuf code.

I assumed you were concerned, because to quote:
    In the case of (3), it's as you previously mentionned: you're screwed
    anyway. You're out of physical memory. But, even more importantly, the
    reason the message doesn't belong in subr_mbuf.c is similar to (2),
    it's not an mb_alloc() problem and it's not fixable with an increase
    in NMBCLUSTERS. So, you see, I'm going to eventually wind up with
    people having all sorts of wierd problems due to lack of physical
    memory or lack of address space for malloc(9) to allocate from (the
    latter is, admittedly less likely), and they're going to see these
    messages about increasing NMBCLUSTERS or mbuf failures, and run off
    screaming "it's the mbuf allocator, it's the mbuf allocator!" like
    headless chicken, and wondering why them steadily increasing
    NMBCLUSTERS is having no effect.

> > I think it would make sense to change the printf to output once or
> > maybe every so often "mbuf utilization at 80% suggest increasing
> > NMBCLUSTERS" rather than waiting for the inevitable explosion
> > when we hit 100%.
> 
>   The best way to do this, by far, is as Luigi suggested, from userland.
> A daemon can periodically calculate mbuf map usage and gather data. I
> had something like this setup a while back and even had graphs generated
> realtime with mrtg. This way even if you do run out of mbufs and/or
> clusters, you'll always have the graph to judge whether or not you need
> to increase NMBCLUSTERS.

Consider these points:

.) Such a userland daemon would rely on many kernel subsystems
   working properly in order to work itself. An mbuf shortage would
   definitely make the system behave strangely, possibly causing
   missed log messages.
.) It (the daemon) would have to poll the mbuf stats, which means
     o It may poll too infrequently to be of any use for early
       detection of a problem.
     o It may poll too often, thereby increasing the load on the
       system for no reason.
.) A simple printf is on the order of one thousandth the
   complexity of such a daemon, no matter how trivial it winds up
   being.
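
To make the printf option concrete, here is a rough sketch of the kind
of check I mean. It is only an illustration written as plain userland C:
every name in it (mb_check_watermark, struct mb_warn_state, the two
MB_WARN_* constants) is made up for this mail and is not anything in the
tree, and the 60 second interval is an arbitrary assumption. The caller
is assumed to supply the current in-use count, the configured limit, and
the current time in seconds.

```c
#include <stdio.h>

/* Hypothetical names and values; nothing here exists in the tree. */
#define MB_WARN_PCT      80   /* warn at or above this utilization (%) */
#define MB_WARN_INTERVAL 60   /* minimum seconds between warnings */

struct mb_warn_state {
	long last_warn;       /* time of last warning in seconds, -1 = never */
};

/*
 * Print a warning and return 1 if utilization is at or above
 * MB_WARN_PCT and no warning was printed within the last
 * MB_WARN_INTERVAL seconds.  Otherwise return 0 and stay silent.
 */
static int
mb_check_watermark(struct mb_warn_state *st, unsigned long inuse,
    unsigned long limit, long now)
{
	/* Below the watermark (or no limit configured): nothing to say. */
	if (limit == 0 || inuse * 100 < limit * MB_WARN_PCT)
		return (0);

	/* Above the watermark, but we warned recently: rate-limit. */
	if (st->last_warn >= 0 && now - st->last_warn < MB_WARN_INTERVAL)
		return (0);

	st->last_warn = now;
	printf("mbuf utilization at %lu%%, suggest increasing NMBCLUSTERS\n",
	    inuse * 100 / limit);
	return (1);
}
```

With a state initialized to { -1 }, a call with 80 in use out of 100
prints once, and further calls stay quiet until the interval has passed.
The whole thing is one comparison on the common path, which is the point
of the complexity argument above.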

-- 
-Alfred Perlstein [alfred@freebsd.org]
'Instead of asking why a piece of software is using "1970s technology,"
 start asking why software is ignoring 30 years of accumulated wisdom.'
                           http://www.morons.org/rants/gpl-harmful.php3
