From owner-freebsd-hackers Wed Nov 15 3:23:30 2000
Delivered-To: freebsd-hackers@freebsd.org
Received: from mass.osd.bsdi.com (adsl-63-206-90-77.dsl.snfc21.pacbell.net [63.206.90.77])
	by hub.freebsd.org (Postfix) with ESMTP id E849737B4CF
	for ; Wed, 15 Nov 2000 03:23:26 -0800 (PST)
Received: from mass.osd.bsdi.com (localhost [127.0.0.1])
	by mass.osd.bsdi.com (8.11.0/8.11.1) with ESMTP id eAFBToF02993;
	Wed, 15 Nov 2000 03:29:50 -0800 (PST)
	(envelope-from msmith@mass.osd.bsdi.com)
Message-Id: <200011151129.eAFBToF02993@mass.osd.bsdi.com>
X-Mailer: exmh version 2.1.1 10/15/1999
To: Richard Hodges
Cc: freebsd-hackers@FreeBSD.ORG
Subject: Re: page fault question
In-reply-to: Your message of "Tue, 14 Nov 2000 10:58:30 PST."
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Wed, 15 Nov 2000 03:29:50 -0800
From: Mike Smith
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> I have been having a great time :-) debugging a device driver,
> and have run into a really fun way to panic.  With one type
> of traffic, [something] happens and the kernel drops into
> DDB, just the way I want.

8)

> Well, actually DDB seems to get trapped in some kind of loop
> that spews messages faster than a human can read them.  When
> I finally got a piece of a clue, I booted with serial console
> and captured the following (also an endless loop):
>
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0x8
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xc014ed6b
> stack pointer           = 0x10:0xc02b1360
> frame pointer           = 0x10:0xc02b1388
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = Idle
> interrupt mask          = net tty bio cam
> kernel: type 12 trap, code=0
> Stopped at
>
> The PC seems to have died in the DDB, that's odd (or maybe not?)
> ts7# nm /kernel | grep c014ed
> c014ed38 T linker_ddb_search_symbol
> c014edbc T linker_ddb_symbol_values

This is pretty normal; ddb is a little fragile sometimes.  You want to go
back and look at the very first trap; it will probably be different and
will be the *real* trap.  All the rest are just ddb exploding.

> Now looking back at the panic message, it looks like the stack has
> pushed into the "frame pointer".  Is this an actual problem, or
> just some side effect of the page fault?

The frame pointer is a pointer into the stack, so no, it's not a problem.

> Should I start spending my time looking for kernel stack hogs in
> the device driver?  I can very easily add code to log ESP & EBP;
> would that be productive?

Typically stack overruns lead to double faults (because there's no stack
on which to handle the fault) and a spontaneous reboot.  This just sounds
like there's something about your first trap that kills DDB (e.g. an
invalid instruction pointer, etc.)

> Is there a maximum size for a softc?  Maybe I'm accidentally ignoring
> some "code of the west" and am getting punished for it?  (It wouldn't
> be the first time).

Softc structures should never be allocated on the stack; they're
malloc'ed by the newbus infrastructure, so you should be OK there.

Hope this helps; let us know if the first trap isn't any more
illuminating.  You might also try using remote gdb instead of ddb.

Regards,
Mike

-- 
... every activity meets with opposition, everyone who acts has his
rivals and unfortunately opponents also.  But not because people want
to be opponents, rather because the tasks and relationships force
people to take different points of view.  [Dr. Fritz Todt]
           V I C T O R Y   N O T   V E N G E A N C E
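
Addendum: the nm lookup quoted in this message (matching the faulting
instruction pointer against symbol start addresses) can be done
mechanically.  The following is only a minimal sketch in Python, not
anything ddb or the kernel does itself.  The two symbol lines and the
address 0xc014ed6b are copied from the message; the helper names
parse_nm and resolve are made up for illustration.

```python
# Symbol lines copied from the quoted `nm /kernel | grep c014ed` output.
NM_OUTPUT = """\
c014ed38 T linker_ddb_search_symbol
c014edbc T linker_ddb_symbol_values
"""


def parse_nm(text):
    """Turn `address type name` lines into a sorted list of (address, name)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, _sym_type, name = parts
        symbols.append((int(addr, 16), name))
    return sorted(symbols)


def resolve(addr, symbols):
    """Return (name, offset) for the nearest symbol at or below addr, or None."""
    best = None
    for sym_addr, name in symbols:
        if sym_addr > addr:
            break
        best = (name, addr - sym_addr)
    return best


symbols = parse_nm(NM_OUTPUT)
hit = resolve(0xC014ED6B, symbols)  # instruction pointer from the trap message
if hit is not None:
    print(f"{hit[0]}+{hit[1]:#x}")  # prints linker_ddb_search_symbol+0x33
```

With the full `nm /kernel` output in place of the two-line excerpt, the
same lookup works for the instruction pointer of the first trap, which is
the one worth chasing.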