Date: Mon, 22 Nov 2004 10:43:43 +0000 (GMT)
From: Robert Watson
To: Ganbold
cc: tomaz.borstnar@over.net
cc: cguttesen@yahoo.dk
cc: freebsd-current@freebsd.org
cc: Scott Long
cc: mhunter@ack.Berkeley.EDU
Subject: Re: Page fault in FreeBSD 5.3 on IBM e325, Dual AMD64 2.2GHz, 4GB RAM, ServeRAID 6M - debug logs

On Mon, 22 Nov 2004, Ganbold wrote:

> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 01
> fault virtual address   = 0x18
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xffffffff80277fc0
> stack pointer           = 0x10:0xffffffffb36ab830
> frame pointer           = 0x10:0xffffffffb36ab890
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 44 (swi1: net)
> [thread 100044]
> Stopped at      m_copym+0x190:  incl    %ecx
<...>
> --------------------------------------------------------------------------------------------------------
>
> It seems to me the problem is related to network stack and threading.
> Am I right? How to solve this problem?

I've seen reports of this problem with and without debug.mpsafenet=1,
which suggests it is a network stack bug but not specific to locking.
I've also seen reports that disabling TCP SACK will make the problem go
away, which would be good to confirm.  I spent the weekend building up
some more expertise in TCP and reading a lot of TCP code, and hope to
look at this problem in more detail today.  You may want to try turning
off TCP sack using net.inet.tcp.sack.enable=0 in sysctl.conf (or
loader.conf).

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Principal Research Scientist, McAfee Research
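
For reference, the SACK workaround suggested above might look like the
following in practice.  This is a minimal sketch, assuming a stock
FreeBSD 5.3 system where /etc/sysctl.conf is applied at boot; the
comments are illustrative additions and not part of the original message.

```
# /etc/sysctl.conf
# Disable TCP selective acknowledgements (SACK) to test whether the
# m_copym page fault still occurs.  Applied on the next boot.
net.inet.tcp.sack.enable=0

# To apply the same setting to the running system without rebooting,
# run as root:
#   sysctl net.inet.tcp.sack.enable=0
```

If the panic no longer reproduces with this setting, that would support
the reports that the fault is related to the SACK code rather than to
locking.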