From: Peter Jeremy
Date: Fri, 16 May 2003 05:56:16 +1000
To: Jonas Bülow
Cc: freebsd-stable
Subject: Re: Kernel panic on FreeBSD 4.8-STABLE
Message-ID: <20030515195616.GE4366@cirb503493.alcatel.com.au>

Hi Jonas,

Debugging this is going to take some time - especially given the
different timezones we are in.  The trap is occurring in the 'ltr'
that actually does the task switch.

On Thu, May 15, 2003 at 01:19:24PM +0200, Jonas Bülow wrote:
>Hi,
>
>Peter Jeremy wrote:
>>On Wed, May 14, 2003 at 01:51:16PM +0200, Jonas Bulow wrote:
>>
>>>Peter Jeremy wrote:
>>>
>>>>On Tue, May 13, 2003 at 04:56:16PM +0200, Jonas Bulow wrote:
>>>>
>>>>
>>>>>I need some help to understand a backtrace.
>>>>
>>>>
>>>>>Fatal trap 9: general protection fault while in kernel mode
>>>>>instruction pointer = 0x8:0xc023ceeb
>>>>>stack pointer = 0x10:0xcf7d9ea4
>>>>>frame pointer = 0x10:0xcf7d9ec0
>>>>>code segment = base 0x0, limit 0xfffff, type 0x1b
>>>>> = DPL 0, pres 1, def32 1, gran 1
>>>>>processor eflags = resume, IOPL = 0
>>>>>current process = Idle
>>>>>interrupt mask = net tty bio cam
>>>>>trap number = 9
>>>>>panic: general protection fault
>>>>
>>>>...
>>>>
>>>>
>>>>>#17 0xc023d6fb in trap (frame={tf_fs = 16, tf_es = 134938640, tf_ds =
>>>>>-982253552, tf_edi = -971835344, tf_esi = 32,
>>>>> tf_ebp = -813850944, tf_isp = -813850992, tf_ebx = -1070885216,
>>>>>tf_edx = -812732416, tf_ecx = -831483840,
>>>>> tf_eax = 336283586, tf_trapno = 9, tf_err = 32, tf_eip =
>>>>>-1071395093, tf_cs = 8, tf_eflags = 65670, tf_esp = -1072211888,
>>>>> tf_ss = -831471360}) at /usr/src/sys/i386/i386/trap.c:636
>>>>>#18 0xc023ceeb in sw1a ()
>>>>>#19 0xc0174ff1 in tsleep (ident=0xce70c100, priority=288,
>>>>>wmesg=0xc02530a5 "wait", timo=0) at /usr/src/sys/kern/kern_synch.c:479
>>>>
>>>>
>>>>#18 is the underlying problem.  sw1a() is in /sys/i386/i386/swtch.s
>>>>and you might like to disassemble the code around 0xc023ceeb to see
>>>>exactly where it is dying.  GPF is a catch-all category so it's
>>>>difficult to know exactly why you're getting it without knowing the
>>>>actual instruction it dies on.
>>>
>>>This is beyond my skills. :-) Does the disassemble say anything usefull?
>>>
>>>(kgdb) disassemble 0xc023ceeb
>>
>>...
>>
>>>0xc023cecf : mov $0xc0298550,%edi
>>>0xc023ced4 : mov 0xc0298558,%ebx
>>>0xc023ceda : mov 0x0(%edi),%eax
>>>0xc023cedd : mov %eax,0x0(%ebx)
>>>0xc023cee0 : mov 0x4(%edi),%eax
>>>0xc023cee3 : mov %eax,0x4(%ebx)
>>>0xc023cee6 : mov $0x20,%esi
>>>0xc023ceeb : ltr %si
>>
>>
>>It's dying trying to switch tasks.  %edi isn't _common_tssd so it's a
>>private TSS.  This is a bit beyond my skills to debug remotely - I
>>don't suppose you have a iA32 system programming manual handy?
>
>I have the manuals found at
>http://developer.intel.com/design/pentium4/manuals/ . Chapter 6 in
>volume 3 seems to be the home work for me. :-)

Looks right from the cover.  I'll check at work.  (I'm using an old
486 book at home - it's the 'Multi-tasking' chapter in that).  You
might need to read the "System Architecture" chapter for some
background as well.

>> You
>>could try printing the 8 bytes following %edi in frame #18
>>(0xc612f830)
>
>(kgdb) x/8xb 0xc612f830
>0xc612f830: 0x10 0x02 0x00 0x00 0xc2 0x47 0x0b 0x14

This should be the new TSS descriptor.  As longwords, this gives
0x00000210 0x140b47c2, which doesn't make sense - though the latter
word correctly matches tf_eax.

AVL:   0
Busy:  ?
Base:  0x14c20000 - this isn't valid.  Kernel addresses are 0xcXXXXXXX
D:     0
DPL:   2
G:     0
Limit: 0xb0210 - This is excessive
P:     0 - this must be 1
Type:  7 - describes it as a 80286 Trap Gate.  It should be 9.

It would make more sense if it was byte-swapped.

I'm not sure where to go from here.  I'll do some more thinking when I
get home this evening.

Peter
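
P.S. For anyone who wants to check the field breakdown above by machine
rather than by hand, here is a minimal sketch in Python.  It assumes the
standard 8-byte i386 descriptor layout (limit and base split across two
little-endian longwords).  The function name decode_descriptor is made
up for this illustration and is not anything in the FreeBSD tree.

```python
import struct


def decode_descriptor(raw: bytes) -> dict:
    """Split an 8-byte i386 segment/system descriptor into its fields.

    Hypothetical helper for illustration only; the bit positions follow
    the standard IA-32 descriptor format.
    """
    if len(raw) != 8:
        raise ValueError("a descriptor is exactly 8 bytes")
    # The descriptor is stored as two little-endian 32-bit longwords.
    lo, hi = struct.unpack("<II", raw)
    return {
        # limit[15:0] is in the low longword, limit[19:16] in the high one
        "limit": (lo & 0xFFFF) | (hi & 0x000F0000),
        # base[15:0], base[23:16] and base[31:24] are scattered
        "base": (lo >> 16) | ((hi & 0xFF) << 16) | (hi & 0xFF000000),
        "type": (hi >> 8) & 0xF,   # 9 = available 32-bit TSS
        "s": (hi >> 12) & 1,       # 0 = system descriptor
        "dpl": (hi >> 13) & 3,
        "p": (hi >> 15) & 1,       # present bit
        "avl": (hi >> 20) & 1,
        "d": (hi >> 22) & 1,
        "g": (hi >> 23) & 1,       # granularity
    }


# The 8 bytes from the "x/8xb 0xc612f830" output quoted above.
raw = bytes([0x10, 0x02, 0x00, 0x00, 0xc2, 0x47, 0x0b, 0x14])
fields = decode_descriptor(raw)
for name, value in fields.items():
    print(f"{name:5} = {value:#x}")
```

Fed the bytes from the kgdb dump, this gives the same base, limit, DPL,
P and type values as the hand decode, so the descriptor really is
garbage rather than a slip in the arithmetic.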