Date: Thu, 15 Oct 1998 23:02:23 -0700
From: Mike Smith <mike@smith.net.au>
To: cgd@netbsd.org (Chris G. Demetriou)
Cc: dg@root.com, Jason Thorpe <thorpej@nas.nasa.gov>, Andrew Gallatin <gallatin@cs.duke.edu>, Chris Csanady <ccsanady@friley-185-114.res.iastate.edu>, freebsd-alpha@FreeBSD.ORG
Subject: Re: kernel traps on boot..
Message-ID: <199810160602.XAA00878@dingo.cdrom.com>
In-Reply-To: Your message of "14 Oct 1998 18:59:18 PDT." <8767dmoaa1.fsf@netbsd1.cygnus.com>

It would probably be fair to say that this neatly encapsulates the
philosophical differences between FreeBSD-current and NetBSD-current,
and it's not surprising that there's some confusion between the two
groups.

> David Greenman <dg@root.com> writes:
> > >Just doing printfs for broken kernel code only encourages laziness.
> >
> > Well, that might be fine for a developer, but it sure doesn't help end
> > users. We *are* trying to provide a production system after all. :-)
>
> If code is sufficiently untested that it randomly runs into unaligned
> accesses, then by definition, it isn't a production-quality system and
> you don't need to worry about panic()ing.
>
> However, if it _is_ well tested, "production quality," and still runs
> into that unaligned access, then that unaligned access is probably
> indicative of a somewhat-serious bug. It means either that code is
> getting a bogus value because of specification/implementation "issue,"
> or that something, somewhere got corrupted, and therefore the system
> lost.
>
> To have such bugs fixed properly, in many cases, a developer will need
> to know more about the context in which it occurred than just the fact
> that it occurred, the PC, and a few registers. That means panic,
> followed by kernel core dump (or invocation of kernel debugger, or
> whatever), which then gets handed by the user of the production system
> to a developer, who debugs it.

FreeBSD policy for new code is to commit early and fix fast. Because
most developers track -current very aggressively, committing code which
causes "diagnostic panics" is not a popular option. If the code was on
a reasonably common path, it would prevent developers working on
unrelated issues from doing anything useful until the problem was
resolved (and possibly slow the adoption of the resolution). This
places the development cycle somewhat in lockstep, where only one
misfeature can be resolved at a time.

Instead, FreeBSD developers tend to be a talkative bunch, and the
existence of a "diagnostic printf" will cause those seeing it to pipe
up and identify themselves to the owner of the code in question,
allowing said developer to immediately interact with users having
suitable test environments for reproducing the problem without locking
everyone else out.

> In my opinion, it's not only bad, but _irresponsible_ to let the
> system bumble on in the face of such a bug. High uptime is nice, but
> if it comes at the cost of ignoring serious system errors or
> corrupting data, it's worthless.

I don't think anyone would disagree with you here. However an
unaligned access doesn't fit into this case, as you can handle it
cleanly (while tagging the problem as an error) without crying wolf.

-- 
\\  Sometimes you're ahead,       \\  Mike Smith
\\  sometimes you're behind.      \\  mike@smith.net.au
\\  The race is long, and in the  \\  msmith@freebsd.org
\\  end it's only with yourself.  \\  msmith@cdrom.com

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-alpha" in the body of the message