Date: Wed, 14 Jan 2009 13:42:36 +0000 (GMT)
From: Robert Watson
To: Pete French
Cc: freebsd-stable@freebsd.org, drosih@rpi.edu, rblayzor.bulk@inoc.net
Subject: Re: Big problems with 7.1 locking up :-(

On Wed, 14 Jan 2009, Pete French wrote:

>> If you have BREAK_TO_DEBUGGER compiled into the kernel, then try pressing
>> ctrl-alt-break on the console to see if you can drop into the debugger, or
>> issue a serial break on a serial console.
>
> Well, I added BREAK_TO_DEBUGGER to the kernel config I had which contained
> all the other stuff (WITNESS etc...). The end result...
>
> ...it no longer crashes :-(
>
> I am not sure what to make of that! What could adding this to the kernel
> possibly do which would make my problems go away ? Should I try just adding
> this option to my GENERIC kernel and seeing if that also gives me something
> stable ?

Yeah, that is unexpected -- the BREAK_TO_DEBUGGER path should have almost no effect on control flow, unlike, say, WITNESS, which significantly distorts timing. Is there any chance you picked up any of the recent fixes that went into RELENG_7 without noticing, and that perhaps one of those did it?

With regard to what to do: if you didn't pick up a fix without noticing, yeah, I think it's worth testing the hypothesis that BREAK_TO_DEBUGGER fixed (or at least, masked) the problem. Generally with this sort of testing one has to be pretty rigorous in testing assumptions, because it's easy for changes to sneak in. Particularly annoying are seemingly innocuous code changes that do things like slightly rearrange kernel memory.

FWIW, I suspect the various reports we are seeing reflect more than one problem, and that each must be a relatively rare edge case individually, but reports of a few problems have led to more "coming out of the woodwork". Obviously, the problems are not edge-case to the people experiencing them...

Robert N M Watson
Computer Laboratory
University of Cambridge