Date: Sat, 10 Jul 2004 19:22:35 +0100
From: Bruce M Simpson <bms@spc.org>
To: freebsd-mobile@freebsd.org
Cc: Dan Langille <dan@langille.org>
Subject: T41 CDRW page fault saga
Message-ID: <20040710182235.GA838@empiric.dek.spc.org>
In-Reply-To: <20040709013152.GR15368@empiric.dek.spc.org>
References: <40EDB001.4311.E98BBCCC@localhost> <20040709013152.GR15368@empiric.dek.spc.org>

On Fri, Jul 09, 2004 at 02:31:52AM +0100, Bruce M Simpson wrote:
> If we can establish that the problem is isolated to a specific ATA
> controller revision, we may be getting somewhere....

I've got more data from the local user's affected machine. We had to
manually transcribe the messages as I don't have enough firewire kit
around to do dcons.

This is the kernel I'm using:

FreeBSD empiric.dek.spc.org 5.2-CURRENT FreeBSD 5.2-CURRENT #1: Tue Jul 6 23:17:47 BST 2004 bms@kimchi.dek.spc.org:/usr/src/sys/i386/compile/EMPIRIC i386

There isn't a panic per se. The page fault only manifests itself on the
affected T41 when the CDRW module is inserted; if it's removed during
boot, all is well.

We managed to pull a backtrace. It's clear this happens only during
mountroot, and it could be a trashed stack. The addresses, of course, are
specific to my production -CURRENT kernel (I usually build kernel.debug).
I couldn't get a crash dump: it kept complaining of not having enough room
on my dumpdev, although I know for a fact I have enough blocks to cover
physical memory, which is 512MB on this box.

This message occurs immediately after mountroot is attempted (it finds the
root filesystem correctly) and after the ATAPI_IDENTIFY messages which
others have reported (inspection of the ata driver suggests these messages
are benign, but green@ has since posted patches which address the
'device atapicam' case):

---8<---8<---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x1ff01ff
fault code              = supervisor read, page not present
instruction pointer     = 0x08:0x1ff01ff
stack pointer           = 0x10:0xd3e9cb30
frame pointer           = 0x10:0xd3e9cb54
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1 (swapper)
kernel: type 12 trap, code = 0
Stopped at 0x1ff01ff
---8<---8<---

(On entry into DDB: eip = 0xc05d4808, esp = 0xd3e9c97c, fp = 0xd3e9c980)

We managed to get a backtrace using "show thr" as follows (we didn't
transcribe the stack parameters, just the backtrace):

---8<---8<---
kernload at 0x1ff01ff
devfs_allocv at devfs_allocv+0x13c
devfs_root at devfs_root+0x23
devfs_nmount at devfs_nmount+0xaf
getdiskbyname at getdiskbyname+0xb1
setrootbyname at setrootbyname+0xb
vfs_mountroot_try at vfs_mountroot_try+0xcf
vfs_mountroot at vfs_mountroot+0x6b
start_init at start_init+0x53
fork_exit
fork_trampoline
---8<---8<---

I'll try to pin down the exact opcode/line in devfs_allocv() where the
call stack appears to be getting screwed up.

Hopefully this helps the continuing efforts to debug this problem.

Regards,
BMS
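
For anyone comparing their own transcriptions against this one, here is a rough sketch of the sanity check behind the "trashed stack" guess. It only restates numbers from the trap message above. The helper names (parse_trap, offset_of) are made up for illustration, and KERNBASE = 0xc0000000 is an assumption about the i386 kernel load address, not something read off the affected machine.

```python
# Sketch: check whether the fault address equals the instruction pointer,
# i.e. whether the CPU faulted while fetching an instruction from a bogus
# address. Values are copied from the transcribed trap message.

TRAP_TEXT = """\
fault virtual address   = 0x1ff01ff
fault code              = supervisor read, page not present
instruction pointer     = 0x08:0x1ff01ff
stack pointer           = 0x10:0xd3e9cb30
frame pointer           = 0x10:0xd3e9cb54
"""

# Assumption: default i386 kernel virtual base; adjust if built otherwise.
KERNBASE = 0xC0000000


def parse_trap(text):
    """Split 'name = value' lines into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            fields[key.strip()] = value.strip()
    return fields


def offset_of(seg_off):
    """Return the offset part of a 'selector:offset' pair as an int."""
    return int(seg_off.split(":")[-1], 16)


fields = parse_trap(TRAP_TEXT)
fault_va = int(fields["fault virtual address"], 16)
eip = offset_of(fields["instruction pointer"])

# True when the faulting access was the instruction fetch itself.
fetch_fault = fault_va == eip

# False here: the address is far below where kernel text would live.
in_kernel_text = eip >= KERNBASE

# Split the bogus address into its two 16-bit halves.
high_half, low_half = eip >> 16, eip & 0xFFFF

print(fetch_fault, in_kernel_text, hex(high_half), hex(low_half))
```

The fault address and the instruction pointer match, and the address sits well below the assumed kernel base, so control was transferred to something that is not kernel code. That fits an overwritten return address or function pointer, but it doesn't prove which.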