Date: Wed, 24 Nov 2004 11:12:53 +0000 (GMT)
From: Robert Watson
To: Ganbold
cc: tomaz.borstnar@over.net
cc: cguttesen@yahoo.dk
cc: scottl@freebsd.org
cc: mhunter@ack.Berkeley.EDU
cc: freebsd-current@freebsd.org
Subject: Re: Page fault in FreeBSD 5.3 on IBM e325, Dual AMD64 2.2GHz, 4GB RAM, ServeRAID 6M - problem goes away without TCP sack

On Wed, 24 Nov 2004, Ganbold wrote:

> At 06:43 PM 11/22/2004, you wrote:
>
> > > It seems to me the problem is related to network stack and threading.
> > > Am I right? How to solve this problem?
> >
> >I've seen reports of this problem with and without debug.mpsafenet=1,
> >which suggests it is a network stack bug but not specific to locking. I've
> >also seen reports that disabling TCP SACK will make the problem go away,
> >which would be good to confirm. I spent the weekend building up some more
> >expertise in TCP and reading a lot of TCP code, and hope to look at this
> >problem in more detail today. You may want to try turning off TCP sack
> >using net.inet.tcp.sack.enable=0 in sysctl.conf (or loader.conf).
>
> I turned off TCP sack using net.inet.tcp.sack.enable=0 in sysctl.conf
> and it seems like the problem goes away. It is working for more than 20
> hours without any crash. Robert, did you find anything in network stack
> code? I'm just curious.

I have not yet identified a (the?) bug, although it seems likely that it
has to do with the internal TCP state relating to the SACK blocks. What
we need right now, I think, is a core dump, preferably on an i386 kernel
since our debugging tools work best with that. My recollection, though,
is that you're using an amd64 kernel, but there have been reports from
others that likely are on i386.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Principal Research Scientist, McAfee Research
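
For reference, the workaround discussed in this thread can be sketched as a sysctl.conf entry. This is a minimal illustration, assuming the stock /etc/sysctl.conf location; only the variable name and value come from the message itself, and the comments are added here for explanation.

```
# /etc/sysctl.conf (assumed path)
# Workaround from this thread: turn off TCP SACK so that the
# SACK-related TCP state suspected of causing the page fault is not used.
net.inet.tcp.sack.enable=0
```

The same setting can presumably be applied to a running system with `sysctl net.inet.tcp.sack.enable=0`. It is a workaround only, since the underlying bug had not yet been identified at the time of writing.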