Date: Wed, 21 Jan 2004 15:46:27 +0600 From: Alexey Dokuchaev <danfe@nsu.ru> To: Bill Paul <wpaul@freebsd.org> Cc: Bruce Evans <bde@zeta.org.au> Subject: Re: cvs commit: src/sys/alpha/alpha support.s src/sys/i386/i386 swtch.s src/sys/kern kern_shutdown.c src/sys/sys systm.h Message-ID: <20040121094627.GA44278@regency.nsu.ru> In-Reply-To: <20040120182929.4573C16A4CF@hub.freebsd.org> References: <20040120175334.W3279@gamplex.bde.org> <20040120182929.4573C16A4CF@hub.freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, Jan 20, 2004 at 10:29:29AM -0800, Bill Paul wrote: > > [abuse of __FILE__ and __LINE__ in panic()] > > > This was rejected in all reviews. It gives less information than > > grepping the sources, at some cost (grep at least gives correct line > > numbers when the sources don't quite match the binary). > > > > Please back this out. > > I have to second this. > > > > Ideally a traceback should be printed too, any takers ? > > > > This can be obtained by running a debugger on the panic dump. Line > > numbers and source file names can also be printed by the debugger > > if the executable has at least line numbers in its debugging info. > > That's fine for developers who always run their systems with crash > dumps enabled and posess sufficient debugger fu to analyze them. What > about ordinary users who encounter a kernel panic due to a trap > and need to report it? It would save wear and tear on everyone's > nerves (_ESPECIALLY_ mine) if they could just send in a traceback > rather than developers being forced to do the usual "go back and > do nm ${KERNEL} and tell us what you see" exchange. It seems that instead of _TRYING_ to simplify everyone's life and being smarter than our users, we need to improve on our users' general bug/panic reporting "culture". Probably some in-depth section in the Handbook would be useful. > > [...] > > > Verbose panic messages, and lots of code to print out values of variables > > just before panicing, are another mistake. Short panic messages were > > good enough when debuggers were primitive or nonexistent. Verbose panic > > messages are even less needed now. > > Ok, hang on just a minute. > > Let's be clear here. Adding linenum/sourcefile info to all panic() > calls is kind of pointless, I agree. (If you're too lazy to grep > the source for the panic message, you're just no damn good.) However, Agreed. > there are times when printing the contents of some variables in > the panic string is _INCREDIBLY_ _USEFUL_. Terse panic messages > are annoying -- for that matter so are terse error messages in general. > Error messages are meant to a) inform the user that they need to > perform some action to correct a problem and b) alert a software > developer to a potential bug and, hopefully, how to fix it. It is > not enough to note that an error occured: you have to explain > _WHY_ it occured. > > Really bad panic message: > > panic("mbuf is corrupt (but I'm not telling you exactly " > "what's corrupt about it because god forbid I should " > "make life easier for you)"); > > Good panic message: > > panic("mbuf is corrupt: mbuf %p has m_len of %d and m_flags of 0x%x " > "but m_data is NULL", m, m->m_len, m->m_flags); > > By showing the contents of the suspicious variable/structure, you give > a developer some clue as to what was going on in the system when the > variable/structure was trashed. Maybe m_len is the size of a protocol > header that points to a particular protocol module being suspect. > Maybe m_flags has M_DONGS set which points to code that Alfred wrote. > It may not make sense to dump out every single piece of available state, > but not providing any state at all is just damn rude. > > Could you determine this from the crash dump? Sure -- but do you have > any idea how much of a pain in the ass that can be, especially when > the crash occurs on a system that isn't yours? By default, crash > dumps are as large as physical memory, and most systems now have at > least 128MB of RAM. The average user does not know how to analyze > a crash dump, which means even if they get one, they won't be able to > do anything with it, except send it to me. That means I'll have people > throwing 128MB files at me, which I can't even analyze unless I happen > to have a system handy running _exactly_ the same kernel as they are. > This is an amazing amount of trouble to go through to determine a > couple of pieces of info that could just as easily have been printed > on the console, where the user could copy them down and include them > in an e-mail. > > Note that I am not saying we should immediately go back throug the > kernel source and verbosify every single panic() call we come across. > I _am_ saying that developers should not use existing panic() messages > as models for their own. I am also not going to let you lead them > down that path, for that way lies Linux^Wmadness. > > Anyway, to reiterate: I don't think the lineno/sourcefile additions > to panic() are worth the bloat. Tracebacks on the other hand, would > be worth their weight in gold bikesheds. Exactly right. Methinks that lineno/sourcefile is rather easily obtainable, but the state of a disaster is a lot harder nut to crack, unaided. ./danfe
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20040121094627.GA44278>