From: Esa Karkkainen
To: stable@freebsd.org
Date: Mon, 15 Oct 2007 23:32:02 +0300
Message-ID: <20071015203202.GA17964@pp.htv.fi>
In-Reply-To: <47120D83.1010703@FreeBSD.org>
References: <20071004165755.GA1049@pp.htv.fi> <47120D83.1010703@FreeBSD.org>
Subject: Re: Reproducable, possibly NFS related, fatal double fault in 6.2-R-p7

On Sun, Oct 14, 2007 at 02:37:23PM +0200, Kris Kennaway wrote:
> Esa Karkkainen wrote:
> > I get "Fatal double fault" error when writing to a filesystem
> >mounted from NFS server.

I got an off-list reply suggesting that the problem might be in the
nve driver.

I installed an additional Intel NIC; the relevant lines from dmesg are
as follows:

fxp0: port 0xb000-0xb03f mem 0xe7200000-0xe7200fff,0xe7000000-0xe70fffff irq 11 at device 6.0 on pci1
miibus1: on fxp0
inphy0: on miibus1
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto

Since I started using fxp0, I can dump(8) all the necessary filesystems
to the NFS mount without a panic.

When I used nve0, dump(8) or cp(1) managed to write less than a
megabyte to the NFS mount before the machine panicked.

It didn't matter whether I made dump(8) write directly to the NFS mount
or to a local filesystem and then copied the file to the NFS mount; the
end result was a panic either way.

> > Both NFS server and client are running 6.2-RELEASE-p7.

Both machines have been updated to -p8.

> ># kgdb kernel.debug /home/crash/vmcore.2
> >Fatal double fault:
> >eip = 0xc063242a
>
> Can you look up these IPs in the kernel symbol table (see the developers
> handbook)? This might give at least one clue, although I'm not sure it
> is relevant.

I'm sorry, but I need to learn a lot more about gdb and debugging in
general before I can find that information. IIRC I have written about
ten or twenty lines of C in this millennium.

I do have matching kernel.debug and vmcore files, but the kernel
modules etc. were removed before I built the new kernel and world.

> You might also update to RELENG_6, I think there was at least one bug
> fixed that might have caused such a thing.

At the moment I don't have any stability problems with this machine,
but I can upgrade to RELENG_6 before RELENG_6_3 is branched if that is
necessary.

> Also try to rule out memory failure etc.

This machine has two 512MB DDR333 DIMMs.
I installed sysutils/memtest and ran three instances simultaneously:
the first two allocated 326 MB each and the last one allocated 150 MB
of memory, so that the machine would start to swap.

No errors. I know these tests are not conclusive, but I don't think the
DIMMs are faulty.

-- 
"In the beginning the Universe was created. This has made a lot of
people very angry and been widely regarded as a bad move."
        -- Douglas Adams 1952 - 2001