Message-ID: <471406ED.7000307@FreeBSD.org>
Date: Tue, 16 Oct 2007 02:33:49 +0200
From: Kris Kennaway
To: Esa Karkkainen , stable@freebsd.org
Subject: Re: Reproducable, possibly NFS related, fatal double fault in 6.2-R-p7

Esa Karkkainen wrote:
> On Sun, Oct 14, 2007 at 02:37:23PM +0200, Kris Kennaway wrote:
>> Esa Karkkainen wrote:
>>> I get "Fatal double fault" error when writing to a filesystem
>>> mounted from NFS server.
>
> I got an offlist reply in which he suggested that the problem might be
> in nve driver.
>
> I installed an additional Intel NIC; the relevant lines from dmesg are
> as follows:
>
> fxp0: port 0xb000-0xb03f mem 0xe7200000-0xe7200fff,0xe7000000-0xe70fffff irq 11 at device 6.0 on pci1
> miibus1: on fxp0
> inphy0: on miibus1
> inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
>
> Since I started using fxp0, I can dump(8) all the necessary filesystems
> to the NFS mount without a panic.
>
> When I used nve0, dump(8) or cp(1) managed to write less than a megabyte
> to the NFS mount and then the machine panicked.
>
> It didn't matter whether I made dump(8) write to the NFS mount or to a
> local filesystem and then copied the file to the NFS mount; the end
> result was a panic.
>
>>> Both NFS server and client are running 6.2-RELEASE-p7.
>
> Both machines have been updated to -p8.
>
>>> # kgdb kernel.debug /home/crash/vmcore.2
>>> Fatal double fault:
>>> eip = 0xc063242a
>> Can you look up these IPs in the kernel symbol table (see the developers
>> handbook)? This might give at least one clue, although I'm not sure it
>> is relevant.
>
> I'm sorry, but I need to learn a lot more about gdb and debugging in
> general before I can find that information. IIRC I have written about
> ten or twenty lines of C in this millennium.

Well, it's explained in explicit detail in that document. C code is not
involved.

> I do have matching kernel.debug and vmcore files, but kernel modules etc.
> were removed before I built the new kernel and world.

OK, most likely too late then.

>> You might also update to RELENG_6, I think there was at least one bug
>> fixed that might have caused such a thing.
>
> At the moment I don't have any stability problems with this machine, but
> I can upgrade to RELENG_6 before RELENG_6_3 is branched if that is
> necessary.
>
>> Also try to rule out memory failure etc.
>
> This machine has two 512MB DDR333 DIMMs.
>
> I installed sysutils/memtest and ran three instances simultaneously; the
> first two allocated 326 MB each and the last one allocated 150 MB of
> memory, so that I'd start to swap. No errors.

Well, as you say, such a limited test doesn't mean much. Anyway, it may
well have been nve, so see how you go without it.

kris
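
For readers following the request above to look up the faulting eip in the kernel symbol table, here is a minimal Python sketch of the idea, not the procedure from the handbook. It assumes the symbol table is available as sorted text in the usual "address type name" form (for example, the output of `nm -n` on a matching kernel.debug). The symbol names and addresses in SAMPLE are invented for illustration and do not come from any real kernel.

```python
import bisect


def parse_nm(text):
    """Parse "address type name" lines into a sorted list of (addr, name)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank lines and undefined symbols (no address)
        addr, _kind, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return (name, offset) for the last symbol at or below addr, else None."""
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None  # address lies below the first known symbol
    start, name = symbols[i]
    return name, addr - start


# Hypothetical symbol table; these names and addresses are made up.
SAMPLE = """\
c0100000 T example_func_a
c0100400 T example_func_b
c0100900 T example_func_c
"""

if __name__ == "__main__":
    table = parse_nm(SAMPLE)
    name, offset = resolve(table, 0xC010042A)
    print(f"{name}+{offset:#x}")
```

With a real symbol listing in place of SAMPLE, passing the eip from the panic message to resolve() would name the function containing the faulting instruction, provided the listing comes from the same kernel build that produced the crash.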