From: Bryan Liesner
To: Robert Watson
cc: Jeff Roberson
cc: current@freebsd.org
Date: Wed, 4 Jun 2003 02:39:30 -0400 (EDT)
Subject: Re: umtx/libthr SMP fixes.
Message-ID: <20030604022003.M481@gravy.homeunix.net>

On Tue, 3 Jun 2003, Robert Watson wrote:

>
> On Tue, 3 Jun 2003, Bryan Liesner wrote:
>
> > Actually, no it doesn't.  I was able to use kern_umtx v 1.3 only if I
> > removed atapicam from my kernel config.  These patches (now committed?)
> > panic the system whether I use atapicam or not.  With kern_umtx v1.2
> > there is no panic at all, with or without atapicam.
> >
> > Actually, I think it's cam in general that's causing the panic with
> > these changes.
>
> Bizarre.  Sounds like an errant pointer in some other code, and it's just
> a matter of the memory layout as to what gets stepped on.  Alternatively,
> it might be affected by the insertion of the MTX sysinit event.  Perhaps
> that revision rearranges memory a bit.

Even more bizarre.  I have cvsupped to the latest source and built a
kernel with DDB, and it won't panic.  Without DDB, it panics.  But the
behavior has changed a bit.  It now panics _without_ atapicam in the
build, at boot time.  With atapicam, it panics and dumps core if I do an
init 6.  Savecore refuses to grab the dump:

gravy savecore: first and last dump headers disagree on /dev/ad0s1b
gravy savecore: unsaved dumps found but not saved

I cleared the dump and tried again with the same results.  If I reboot
with the USB drive mounted, it panics on the init 6; unmounted, it
reboots without trouble.  If you have any hints on grabbing a dump
without savecore complaining, please let me know.

I don't have anything specific to report yet.  When I have time tomorrow
I'll try to get more information out.

>
> Anyhow, here are some things you might consider, since this whole thing is
> so odd.  Try merging the addition of the struct mtx declaration from 1.3
> into 1.2 and see if you get the same panic.  If you don't, try merging the
> MTX_SYSINIT line and see if that triggers the panic.  The other changes
> probably wouldn't cause disruptive memory rearrangement, so see what
> happens.  If the panics appear with the addition of the variable, it
> probably is a memory stepping thing and a bug in some other piece of code
> (unfortunately, probably hard to track down).  If it's the addition of the
> initializer, it's a different class of problem.

Right now I'm at rev 1.4 of kern_umtx...  I'll try reverting back and
trying this, time permitting...

> I have to admit that I'm also fairly baffled: my current reading of the
> change suggests there won't be a specific bug in umtx, rather, the
> triggering of symptoms from another bug, but I guess we can only find out
> with a bit of experimentation.  You might also find the problem
> "disappears" if you remove INVARIANTS, although given that you can
> reproduce this nicely, I'm reluctant to have you do that for fear the bug
> will get away and not get fixed.

INVARIANTS wasn't in the picture to begin with.  If I put it in, the
panic will probably disappear, as it does with DDB.  The code has changed
sufficiently now that I can't reproduce the original panic that's in the
PR, but it's still panicking...

-- 
=============================================================
= Bryan D. Liesner            LeezSoft Communications, Inc. =
=                             A subsidiary of LeezSoft Inc. =
= bleez@verizon.net           Home of the Gipper            =
=============================================================