From owner-cvs-src@FreeBSD.ORG Wed Jan 21 01:45:39 2004 Return-Path: Delivered-To: cvs-src@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2F89716A4CE; Wed, 21 Jan 2004 01:45:39 -0800 (PST) Received: from mx.nsu.ru (mx.nsu.ru [212.192.164.5]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0AE6443D2D; Wed, 21 Jan 2004 01:45:36 -0800 (PST) (envelope-from danfe@regency.nsu.ru) Received: from mail by mx.nsu.ru with drweb-scanned (Exim 3.35 #1 (Debian)) id 1AjF5n-0007hm-00; Wed, 21 Jan 2004 15:56:11 +0600 Received: from regency.nsu.ru ([193.124.210.26]) by mx.nsu.ru with esmtp (Exim 3.35 #1 (Debian)) id 1AjF5k-0007gS-00; Wed, 21 Jan 2004 15:56:08 +0600 Received: from regency.nsu.ru (localhost [127.0.0.1]) by regency.nsu.ru (8.12.9p2/8.12.9) with ESMTP id i0L9kVDO080313; Wed, 21 Jan 2004 15:46:31 +0600 (NOVT) (envelope-from danfe@regency.nsu.ru) Received: (from danfe@localhost) by regency.nsu.ru (8.12.9p2/8.12.9/Submit) id i0L9kRk1080135; Wed, 21 Jan 2004 15:46:27 +0600 (NOVT) (envelope-from danfe) Date: Wed, 21 Jan 2004 15:46:27 +0600 From: Alexey Dokuchaev To: Bill Paul Message-ID: <20040121094627.GA44278@regency.nsu.ru> References: <20040120175334.W3279@gamplex.bde.org> <20040120182929.4573C16A4CF@hub.freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20040120182929.4573C16A4CF@hub.freebsd.org> User-Agent: Mutt/1.4.1i X-Envelope-To: wpaul@freebsd.org, bde@zeta.org.au, cvs-src@freebsd.org, src-committers@freebsd.org, phk@freebsd.org, cvs-all@freebsd.org cc: cvs-src@freebsd.org cc: cvs-all@freebsd.org cc: src-committers@freebsd.org cc: phk@freebsd.org cc: Bruce Evans Subject: Re: cvs commit: src/sys/alpha/alpha support.s src/sys/i386/i386 swtch.s src/sys/kern kern_shutdown.c src/sys/sys systm.h X-BeenThere: cvs-src@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: CVS commit messages for the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Jan 2004 09:45:39 -0000 On Tue, Jan 20, 2004 at 10:29:29AM -0800, Bill Paul wrote: > > [abuse of __FILE__ and __LINE__ in panic()] > > > This was rejected in all reviews. It gives less information than > > grepping the sources, at some cost (grep at least gives correct line > > numbers when the sources don't quite match the binary). > > > > Please back this out. > > I have to second this. > > > > Ideally a traceback should be printed too, any takers ? > > > > This can be obtained by running a debugger on the panic dump. Line > > numbers and source file names can also be printed by the debugger > > if the executable has at least line numbers in its debugging info. > > That's fine for developers who always run their systems with crash > dumps enabled and posess sufficient debugger fu to analyze them. What > about ordinary users who encounter a kernel panic due to a trap > and need to report it? It would save wear and tear on everyone's > nerves (_ESPECIALLY_ mine) if they could just send in a traceback > rather than developers being forced to do the usual "go back and > do nm ${KERNEL} and tell us what you see" exchange. It seems that instead of _TRYING_ to simplify everyone's life and being smarter than our users, we need to improve on our users' general bug/panic reporting "culture". Probably some in-depth section in the Handbook would be useful. > > [...] > > > Verbose panic messages, and lots of code to print out values of variables > > just before panicing, are another mistake. Short panic messages were > > good enough when debuggers were primitive or nonexistent. Verbose panic > > messages are even less needed now. > > Ok, hang on just a minute. > > Let's be clear here. Adding linenum/sourcefile info to all panic() > calls is kind of pointless, I agree. (If you're too lazy to grep > the source for the panic message, you're just no damn good.) However, Agreed. > there are times when printing the contents of some variables in > the panic string is _INCREDIBLY_ _USEFUL_. Terse panic messages > are annoying -- for that matter so are terse error messages in general. > Error messages are meant to a) inform the user that they need to > perform some action to correct a problem and b) alert a software > developer to a potential bug and, hopefully, how to fix it. It is > not enough to note that an error occured: you have to explain > _WHY_ it occured. > > Really bad panic message: > > panic("mbuf is corrupt (but I'm not telling you exactly " > "what's corrupt about it because god forbid I should " > "make life easier for you)"); > > Good panic message: > > panic("mbuf is corrupt: mbuf %p has m_len of %d and m_flags of 0x%x " > "but m_data is NULL", m, m->m_len, m->m_flags); > > By showing the contents of the suspicious variable/structure, you give > a developer some clue as to what was going on in the system when the > variable/structure was trashed. Maybe m_len is the size of a protocol > header that points to a particular protocol module being suspect. > Maybe m_flags has M_DONGS set which points to code that Alfred wrote. > It may not make sense to dump out every single piece of available state, > but not providing any state at all is just damn rude. > > Could you determine this from the crash dump? Sure -- but do you have > any idea how much of a pain in the ass that can be, especially when > the crash occurs on a system that isn't yours? By default, crash > dumps are as large as physical memory, and most systems now have at > least 128MB of RAM. The average user does not know how to analyze > a crash dump, which means even if they get one, they won't be able to > do anything with it, except send it to me. That means I'll have people > throwing 128MB files at me, which I can't even analyze unless I happen > to have a system handy running _exactly_ the same kernel as they are. > This is an amazing amount of trouble to go through to determine a > couple of pieces of info that could just as easily have been printed > on the console, where the user could copy them down and include them > in an e-mail. > > Note that I am not saying we should immediately go back throug the > kernel source and verbosify every single panic() call we come across. > I _am_ saying that developers should not use existing panic() messages > as models for their own. I am also not going to let you lead them > down that path, for that way lies Linux^Wmadness. > > Anyway, to reiterate: I don't think the lineno/sourcefile additions > to panic() are worth the bloat. Tracebacks on the other hand, would > be worth their weight in gold bikesheds. Exactly right. Methinks that lineno/sourcefile is rather easily obtainable, but the state of a disaster is a lot harder nut to crack, unaided. ./danfe