Date: Fri, 20 May 2005 16:51:52 -0700
From: Sean McNeil <sean@mcneil.com>
To: Doug White <dwhite@gumbysoft.com>
Cc: amd64@freebsd.org
Subject: Re: help with GPF on 5.4-STABLE
Message-ID: <F380AA4D-2307-4062-AAFC-DEB6BDA389E4@mcneil.com>
In-Reply-To: <20050520135046.T8229@carver.gumbysoft.com>
References: <1116566651.1588.17.camel@server.mcneil.com> <20050520135046.T8229@carver.gumbysoft.com>

Doug,

Thanks for helping me look into this.

On May 20, 2005, at 1:59 PM, Doug White wrote:

> Lets prune this down:
>
> On Thu, 19 May 2005, Sean McNeil wrote:
>
>> I'm not sure what information to provide from my crash dump. I tried to
>> burn a CD with my
>>
>> 'TOSHIBA ' 'CD/DVDW SD-R5372' 'TU31' Removable CD-ROM
>>
>> via. nautilus CD burner and I get a kernel panic:
>>
>> May 19 19:41:23 server kernel: Fatal trap 9: general protection fault while in kernel mode
>> May 19 19:41:23 server kernel: instruction pointer = 0x8:0xffffffff801f4d99May 19 19:41:23 server kernel: stack pointer = 0x10:0xffffffffb1d7ab80
>> May 19 19:41:23 server kernel: frame pointer = 0x10:0xffffff0000c3b000
>> May 19 19:41:23 server kernel: code segment = base 0x0, limit 0xfffff, type 0x1b
>> May 19 19:41:23 server kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
>> May 19 19:41:23 server kernel: processor eflags = interrupt enabled, resume, IOPL = 0
>> May 19 19:41:23 server kernel: current process = 5 (thread taskq)
>> May 19 19:41:23 server kernel: trap number = 9
>> May 19 19:41:23 server kernel: panic: general protection fault
>>
>> What can I do to get the proper info to the developers? using kgdb, I
>> checked the threads (pids) and stack.
>>

There appears to be a missing return on the lines above. I think it
caused you to read the SP for the IP.

> kern.timeout.c line 530 is
>
> 530 mtx_unlock_spin(&callout_lock);

I don't think this is the problem. I think it is happening inside an
interrupt handler while the thread was at this point.

> I'm not sure what in there would generate a GPF.
> Load up a debugging version of the kernel that generated this error into
> gdb (add "makeoptions DEBUG=-g" to your kernel config & rebuild if you
> don't have one, and you don't need to load in the crashdump), and enter
>
> disass 0xffffffffb1d7ab80

Looking at 0xffffffff801f4d99 (as that is the IP and above is the SP), I
see:

(gdb) l *0xffffffff801f4d99
0xffffffff801f4d99 is in ata_completed (/usr/src/sys/dev/ata/ata-queue.c:401).
396
397         ATA_DEBUG_RQ(request, "completed callback/wakeup");
398
399         /* get results back to the initiator */
400         if (request->callback)
401             (request->callback)(request);
402         else
403             sema_post(&request->done);
404
405         ata_start(ch);

0xffffffff801f4d87 <ata_completed+103>: mov    0x58(%rbx),%rax
0xffffffff801f4d8b <ata_completed+107>: test   %rax,%rax
0xffffffff801f4d8e <ata_completed+110>: data16
0xffffffff801f4d8f <ata_completed+111>: nop
0xffffffff801f4d90 <ata_completed+112>: je     0xffffffff801f4eb5 <ata_completed+405>
0xffffffff801f4d96 <ata_completed+118>: mov    %rbx,%rdi
0xffffffff801f4d99 <ata_completed+121>: callq  *%eax

There is an eax register in 64-bit mode? When I do an info reg in kgdb
I don't see one.

> It'll disassemble whatever function it is in. Search the addresses on the
> left for the matching line and paste it and a handful to both sides into
> your reply. That will help us narrow things down by seeing what
> instruction faulted and searching for conditions that cause that fault.

It would appear that the atapicam layer is somehow setting (or not
clearing) the request callback field of a structure. Or, perhaps, there
is a reference to the request structure that is happening after the
atapicam layer thinks that it is finished and free'd the memory. Does
that sound reasonable?
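
Source lines 400-403 and the matching disassembly above follow one
pattern: load the callback pointer, test it against zero, and call
through it if it is non-zero. A minimal sketch of that pattern, using
hypothetical stand-in names (fake_request, fake_completed and
mark_called are illustrations, not ATA driver code, and a plain int
stands in for the semaphore):

```c
#include <stddef.h>

/* Hypothetical stand-in for the driver's request structure. */
struct fake_request {
	void (*callback)(struct fake_request *);
	int done;	/* stands in for the semaphore posted in the else branch */
	int called;	/* set by the example callback */
};

/* Example callback: just records that it ran. */
static void
mark_called(struct fake_request *req)
{
	req->called = 1;
}

/* Same shape as lines 400-403 of the listing: NULL test, then call. */
static void
fake_completed(struct fake_request *req)
{
	if (req->callback != NULL)
		req->callback(req);	/* any non-zero value is called */
	else
		req->done = 1;
}
```

The NULL test only filters out a zero pointer. A stale or overwritten
non-zero value passes the test and is called, which is what a request
that was freed and reused underneath the caller could look like.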

Looking at the frame structure, it looks like rax == eax:

(kgdb) p/x frame
$2 = {tf_rdi = 0xffffff007abafe18, tf_rsi = 0x1, tf_rdx = 0x50,
  tf_rcx = 0x20, tf_r8 = 0xffffff007b7518b8, tf_r9 = 0xffffff007b75c2c0,
  tf_rax = 0x50070802106a0, tf_rbx = 0xffffff007abafe18,
  tf_rbp = 0xffffff0000c3b000, tf_r10 = 0xffffffff806c8c38, tf_r11 = 0x0,
  tf_r12 = 0x4, tf_r13 = 0x1, tf_r14 = 0xffffff0000b54608, tf_r15 = 0x1,
  tf_trapno = 0x9, tf_addr = 0x0, tf_flags = 0xffffffff8032355a,
  tf_err = 0x0, tf_rip = 0xffffffff801f4d99, tf_cs = 0x8,
  tf_rflags = 0x10206, tf_rsp = 0xffffffffb1d7ab90, tf_ss = 0x10}

which would make some sense, as tf_rax looks bogus.

What else can I do?

Thanks,
Sean
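
For reference, a small sketch of one way to test whether a value such
as tf_rax can be a valid amd64 address at all. It assumes 48-bit
virtual addresses, where bits 63..47 must all be equal (a "canonical"
address); the helper name is_canonical and the VA_BITS constant are
illustrative and not taken from the kernel. In 64-bit mode a branch to
a non-canonical target raises a general protection fault, so a value
that fails this check would be consistent with trap 9 at the callq.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48 implemented virtual address bits (4-level paging). */
#define VA_BITS 48

/*
 * Hypothetical helper: an address is canonical when bits 63..47 are
 * all zero or all one, i.e. bit 47 is sign-extended to the top.
 */
static bool
is_canonical(uint64_t addr)
{
	uint64_t top = addr >> (VA_BITS - 1);	/* the upper 17 bits */

	return top == 0 || top == 0x1ffff;
}
```

With the values from the dump, is_canonical(0xffffffff801f4d99) holds
for the instruction pointer, while is_canonical(0x50070802106a0) does
not, because bits 48 and 50 are set and bit 47 is clear.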