Date: Sun, 16 Nov 2003 14:22:00 -0800
From: Marcel Moolenaar <marcel@xcllnt.net>
To: Daniel Eischen <eischen@vigrid.com>
Cc: davidxu@freebsd.org
Subject: Re: KSE/ia64 broken
Message-ID: <20031116222200.GA61279@dhcp01.pn.xcllnt.net>
In-Reply-To: <Pine.GSO.4.10.10311161642520.1807-100000@pcnet5.pcnet.com>
References: <20031116205616.GB60888@dhcp01.pn.xcllnt.net> <Pine.GSO.4.10.10311161642520.1807-100000@pcnet5.pcnet.com>

On Sun, Nov 16, 2003 at 04:55:44PM -0500, Daniel Eischen wrote:
> On Sun, 16 Nov 2003, Marcel Moolenaar wrote:
> >
> > > The same thread (main thread) is being resumed over and over again
> > > which shouldn't happen for this simple program.
> >
> > Can it be that the thread is deadlocked? There's no forward progress.
> > There's only context switching...
>
> I don't think so. I think the thread stack/frame is corrupted, either
> because it is copied out or resumed incorrectly. I'll do some more
> digging.

I loaded it up in the simulator. The thread is continuously being
resumed because of a page fault that results in an upcall, which
ends up in the UTS, which selects the same thread, which causes
the page fault again. The page fault is the result of a bogus
address that in the debugger results in a SIGILL. However, when
we don't run in a debugger, the SIGILL doesn't get handled. Hence
the non-forward progress.

The extensive debug information I posted earlier is therefore still
relevant. Now that I have things running in the simulator I'll see
if I can figure out where things go wrong. Chances are that we now
have an upcall where we didn't have one before and that it exposes
incomplete state (such as a thread pointer that hasn't been set).
The incomplete state causes the corruption we're seeing.

Anyway: I'll be digging too...

-- 
Marcel Moolenaar    USPA: A-39004    marcel@xcllnt.net