From: Sean McNeil
Date: Fri, 20 May 2005 16:51:52 -0700
To: Doug White
cc: amd64@freebsd.org
Subject: Re: help with GPF on 5.4-STABLE
In-Reply-To: <20050520135046.T8229@carver.gumbysoft.com>
References: <1116566651.1588.17.camel@server.mcneil.com> <20050520135046.T8229@carver.gumbysoft.com>
List-Id: Porting FreeBSD to the AMD64 platform

Doug,

Thanks for helping me look into this.

On May 20, 2005, at 1:59 PM, Doug White wrote:

> Lets prune this down:
>
> On Thu, 19 May 2005, Sean McNeil wrote:
>
>> I'm not sure what information to provide from my crash dump.
>> I tried to burn a CD with my
>>
>> 'TOSHIBA ' 'CD/DVDW SD-R5372' 'TU31' Removable CD-ROM
>>
>> via the nautilus CD burner and I get a kernel panic:
>>
>> May 19 19:41:23 server kernel: Fatal trap 9: general protection fault while in kernel mode
>> May 19 19:41:23 server kernel: instruction pointer = 0x8:0xffffffff801f4d99May 19 19:41:23 server kernel: stack pointer = 0x10:0xffffffffb1d7ab80
>> May 19 19:41:23 server kernel: frame pointer = 0x10:0xffffff0000c3b000
>> May 19 19:41:23 server kernel: code segment = base 0x0, limit 0xfffff, type 0x1b
>> May 19 19:41:23 server kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
>> May 19 19:41:23 server kernel: processor eflags = interrupt enabled, resume, IOPL = 0
>> May 19 19:41:23 server kernel: current process = 5 (thread taskq)
>> May 19 19:41:23 server kernel: trap number = 9
>> May 19 19:41:23 server kernel: panic: general protection fault
>>
>> What can I do to get the proper info to the developers? Using kgdb, I
>> checked the threads (pids) and stack.
>>

There appears to be a missing return on the lines above. I think it
caused you to read the SP for the IP.

> kern_timeout.c line 530 is
>
> 530 mtx_unlock_spin(&callout_lock);

I don't think this is the problem. I think it is happening inside an
interrupt handler while the thread was at this point.

> I'm not sure what in there would generate a GPF. Load up a debugging
> version of the kernel that generated this error into gdb (add
> "makeoptions DEBUG=-g" to your kernel config & rebuild if you don't
> have one, and you don't need to load in the crashdump), and enter
>
> disass 0xffffffffb1d7ab80

Looking at 0xffffffff801f4d99 (as that is the IP and the address above
is the SP), I see:

(gdb) l *0xffffffff801f4d99
0xffffffff801f4d99 is in ata_completed (/usr/src/sys/dev/ata/ata-queue.c:401).
396
397         ATA_DEBUG_RQ(request, "completed callback/wakeup");
398
399         /* get results back to the initiator */
400         if (request->callback)
401             (request->callback)(request);
402         else
403             sema_post(&request->done);
404
405         ata_start(ch);

0xffffffff801f4d87 :    mov    0x58(%rbx),%rax
0xffffffff801f4d8b :    test   %rax,%rax
0xffffffff801f4d8e :    data16
0xffffffff801f4d8f :    nop
0xffffffff801f4d90 :    je     0xffffffff801f4eb5
0xffffffff801f4d96 :    mov    %rbx,%rdi
0xffffffff801f4d99 :    callq  *%eax

Is there an eax register in 64-bit mode? When I do an info reg in kgdb
I don't see one.

> It'll disassemble whatever function it is in. Search the addresses
> on the left for the matching line and paste it and a handful to both
> sides into your reply. That will help us narrow things down by seeing
> what instruction faulted and searching for conditions that cause that
> fault.

It would appear that the atapicam layer is somehow setting (or not
clearing) the callback field of the request structure. Or, perhaps,
something still references the request structure after the atapicam
layer thinks it is finished with it and has freed the memory. Does that
sound reasonable?

Looking at the frame structure, it looks like rax == eax:

(kgdb) p/x frame
$2 = {tf_rdi = 0xffffff007abafe18, tf_rsi = 0x1, tf_rdx = 0x50,
  tf_rcx = 0x20, tf_r8 = 0xffffff007b7518b8, tf_r9 = 0xffffff007b75c2c0,
  tf_rax = 0x50070802106a0, tf_rbx = 0xffffff007abafe18,
  tf_rbp = 0xffffff0000c3b000, tf_r10 = 0xffffffff806c8c38, tf_r11 = 0x0,
  tf_r12 = 0x4, tf_r13 = 0x1, tf_r14 = 0xffffff0000b54608, tf_r15 = 0x1,
  tf_trapno = 0x9, tf_addr = 0x0, tf_flags = 0xffffffff8032355a,
  tf_err = 0x0, tf_rip = 0xffffffff801f4d99, tf_cs = 0x8,
  tf_rflags = 0x10206, tf_rsp = 0xffffffffb1d7ab90, tf_ss = 0x10}

which would make some sense, as tf_rax looks bogus.

What else can I do?

Thanks,
Sean
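
A note on the eax/rax question and on why tf_rax "looks bogus": the
sketch below is one way to check both by hand. It assumes 48-bit
virtual addresses (the usual case on AMD64 hardware of this period),
and the helper names low32 and is_canonical48 are made up for
illustration; they are not FreeBSD or gdb functions. The idea is that
%eax is simply the low 32 bits of %rax, and that a 64-bit address is
"canonical" only when bits 63..47 are all zeros or all ones. As far as
the architecture manuals describe it, an indirect call through a
non-canonical target raises a general protection fault, which would be
consistent with trap 9 at the callq instruction.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the value %eax would show for a given %rax,
 * i.e. the low 32 bits of the 64-bit register. */
static uint32_t low32(uint64_t rax)
{
    return (uint32_t)rax;
}

/* Hypothetical helper: true if addr is canonical under the ASSUMED
 * 48-bit virtual address width, meaning bits 63..47 are all equal. */
static bool is_canonical48(uint64_t addr)
{
    uint64_t top = addr >> 47;          /* the upper 17 bits */
    return top == 0 || top == 0x1ffff;  /* all zeros or all ones */
}
```

Applied to the values in the trap frame, is_canonical48(0x50070802106a0)
is false while is_canonical48(0xffffffff801f4d99) is true, so tf_rax is
not a usable call target even though tf_rip is a normal kernel address.
The "callq *%eax" text is probably only how this gdb prints the
instruction; the register actually used would be the full %rax.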