Date:      Mon, 29 Mar 2010 13:30:48 -0700
From:      Jeremy Chadwick <freebsd@jdc.parodius.com>
To:        John Baldwin <jhb@freebsd.org>
Cc:        freebsd-questions@freebsd.org, Masoom Shaikh <masoom.shaikh@gmail.com>, freebsd-stable@freebsd.org, Ivan Voras <ivoras@freebsd.org>, freebsd-hackers@freebsd.org
Subject:   Re: random FreeBSD panics
Message-ID:  <20100329203048.GA8010@icarus.home.lan>
In-Reply-To: <201003291427.34641.jhb@freebsd.org>
References:  <b10011eb1003280128k4034e667v1377205888e7a2d@mail.gmail.com> <b10011eb1003291001u767b860aybfc95286d6b04ea6@mail.gmail.com> <20100329173038.GA4969@icarus.home.lan> <201003291427.34641.jhb@freebsd.org>

On Mon, Mar 29, 2010 at 02:27:34PM -0400, John Baldwin wrote:
> On Monday 29 March 2010 1:30:38 pm Jeremy Chadwick wrote:
> > On Mon, Mar 29, 2010 at 05:01:02PM +0000, Masoom Shaikh wrote:
> > > On Sun, Mar 28, 2010 at 5:38 PM, Ivan Voras <ivoras@freebsd.org> wrote:
> > > > On 28 March 2010 16:42, Masoom Shaikh <masoom.shaikh@gmail.com> wrote:
> > > >
> > > >> Let's assume this is a h/w problem; then how can other OSes overcome
> > > >> it? Is there a way to make FreeBSD ignore it as well, even if that
> > > >> results in a reasonable performance penalty?
> > > >
> > > > Very probably, if only we could detect where the problem is.
> > > > Try adding "options     PRINTF_BUFR_SIZE=128" to the kernel
> > > 
> > > this option is already there
> > 
> > The key word in Ivan's phrase is "less mangled".  Neither using nor
> > increasing PRINTF_BUFR_SIZE solves the problem of interspersed console
> > output.  I've been ranting/raving about this problem for years now; it
> > truly looks like a mutex lock issue (or lack of such lock), but I've
> > been told numerous times that isn't the case.
> > 
> > To developers: what incentives would help get this issue much-needed
> > attention?  This problem makes kernel debugging, panic analysis, and
> > other console-oriented viewing basically impossible.
> 
> I was recently going to look at it.  The somewhat drastic approach I was going 
> to take was to add a simple serializing lock around trap_fatal() and a few 
> other places that do similar block prints (e.g. mca_log()).  One of the issues 
> with fixing this in printf itself is that you'd probably want to
> serialize complete lines of text on a per-thread basis.  You would want to be 
> able to accumulate this line of text across multiple calls to printf (think of 
> it as line-buffering ala stdio).  However, some folks may be nervous about 
> printf not printing things immediately.
> 
> The other issue is that lots of code assumes it can call printf from anywhere 
> and everywhere.  Mostly this just means that if you add locking and line-
> buffering to printf(9) you have to be very careful to make sure it works in 
> odd places.  Probably a lot of this could be solved by deferring things like 
> trap_fatal() until panic() has already been called (which is bde's preferred
> solution I think).

John,

Thanks for the insights; they're greatly appreciated.

I went looking this morning to see how Linux addressed this issue (if at
all), and it's been discussed a few times in the past.  The longest LKML
thread I could find that mentioned the problem was circa 2002.  It's
probably not worth reading, since work was done in 2009 to solve the issue.

http://lkml.indiana.edu/hypermail/linux/kernel/0204.1/index.html#161

Work done by Red Hat in 2009 details how they implemented a lockless
version of their kernel ring buffer (similar to our system message
buffer, but probably a lot more complex):

http://lwn.net/Articles/340400/
http://lwn.net/Articles/340443/

Supposedly having multiple writers to the ring is 100% safe; no
interspersed output.  Same goes for interrupt-generated output.  There
are some comments in the technical document (second link) that imply
there's an individual ring buffer for each CPU; possibly per-CPU kernel
message buffers would solve our issue?
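
To make the idea concrete, here is a rough userland sketch (not kernel
code, and not how Linux or FreeBSD actually implement anything).  It
combines the line-buffering John described with a buffer per CPU: each
CPU accumulates text across several printf-style calls and only commits
to the shared log once it has a complete line.  Every name here
(cpu_printf, cpu_lines, msg_log, NCPU, and so on) is made up for
illustration, the "CPU" is just an index passed in by the caller, and
the locking a real kernel would need around the commit step is only
marked with a comment.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define NCPU          4     /* hypothetical CPU count */
#define LINEBUF_SIZE  128   /* same figure as PRINTF_BUFR_SIZE=128 above */
#define LOG_SIZE      4096  /* stand-in for the system message buffer */

/* One pending line per CPU; only that CPU ever writes to it. */
struct cpu_linebuf {
	char   buf[LINEBUF_SIZE];
	size_t len;
};

static struct cpu_linebuf cpu_lines[NCPU];
static char   msg_log[LOG_SIZE];
static size_t msg_log_len;

/*
 * Append one complete line to the shared log.  In a kernel this is the
 * only step that would need serializing (assumption: a single lock held
 * just for the copy).  Output that does not fit is silently dropped.
 */
static void
log_commit(const char *s, size_t n)
{
	size_t room = LOG_SIZE - 1 - msg_log_len;

	if (n > room)
		n = room;
	memcpy(msg_log + msg_log_len, s, n);
	msg_log_len += n;
	msg_log[msg_log_len] = '\0';
}

/* Push whatever this CPU has accumulated into the shared log. */
static void
cpu_flush(int cpu)
{
	struct cpu_linebuf *lb = &cpu_lines[cpu];

	if (lb->len > 0) {
		log_commit(lb->buf, lb->len);
		lb->len = 0;
	}
}

/*
 * printf-style entry point.  Text is held in the per-CPU buffer until a
 * newline arrives or the buffer fills, so fragments from different CPUs
 * cannot interleave within a line.
 */
void
cpu_printf(int cpu, const char *fmt, ...)
{
	char tmp[LINEBUF_SIZE];
	struct cpu_linebuf *lb;
	va_list ap;
	int i, n;

	assert(cpu >= 0 && cpu < NCPU);
	va_start(ap, fmt);
	n = vsnprintf(tmp, sizeof(tmp), fmt, ap);
	va_end(ap);
	if (n < 0)
		return;
	if ((size_t)n >= sizeof(tmp))
		n = (int)sizeof(tmp) - 1;	/* output was truncated */

	lb = &cpu_lines[cpu];
	for (i = 0; i < n; i++) {
		lb->buf[lb->len++] = tmp[i];
		if (tmp[i] == '\n' || lb->len == sizeof(lb->buf))
			cpu_flush(cpu);
	}
}
```

With this, a partial line written on CPU 0, followed by a full line on
CPU 1, followed by the rest of the line on CPU 0, lands in msg_log as
two intact lines.  The obvious cost is the one John raised: text without
a trailing newline sits in the per-CPU buffer and is not visible until
something flushes it, which matters a great deal in the panic path.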

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



