From owner-freebsd-current@FreeBSD.ORG  Mon Nov 22 10:45:29 2004
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 04A2716A4CE; Mon, 22 Nov 2004 10:45:29 +0000 (GMT)
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 98B3E43D5E; Mon, 22 Nov 2004 10:45:28 +0000 (GMT)
	(envelope-from robert@fledge.watson.org)
Received: from fledge.watson.org (localhost [127.0.0.1])
	by fledge.watson.org (8.13.1/8.13.1) with ESMTP id iAMAhjsw024954;
	Mon, 22 Nov 2004 05:43:45 -0500 (EST)
	(envelope-from robert@fledge.watson.org)
Received: from localhost (robert@localhost)iAMAhhFj024951;
	Mon, 22 Nov 2004 10:43:44 GMT
	(envelope-from robert@fledge.watson.org)
Date: Mon, 22 Nov 2004 10:43:43 +0000 (GMT)
From: Robert Watson <rwatson@freebsd.org>
X-Sender: robert@fledge.watson.org
To: Ganbold <ganbold@micom.mng.net>
In-Reply-To: <6.2.0.14.2.20041122151958.0303be20@202.179.0.80>
Message-ID: <Pine.NEB.3.96L.1041122104130.19086Q-100000@fledge.watson.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: tomaz.borstnar@over.net
cc: cguttesen@yahoo.dk
cc: freebsd-current@freebsd.org
cc: Scott Long <scottl@freebsd.org>
cc: mhunter@ack.Berkeley.EDU
Subject: Re: Page fault in FreeBSD 5.3 on IBM e325, Dual AMD64 2.2GHz,  4GB
	RAM, ServeRAID 6M - debug logs
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
X-List-Received-Date: Mon, 22 Nov 2004 10:45:29 -0000


On Mon, 22 Nov 2004, Ganbold wrote:

> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 01
> fault virtual address	= 0x18
> fault code		= supervisor read, page not present
> instruction pointer	= 0x8:0xffffffff80277fc0
> stack pointer	        = 0x10:0xffffffffb36ab830
> frame pointer	        = 0x10:0xffffffffb36ab890
> code segment		= base 0x0, limit 0xfffff, type 0x1b
> 			= DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags	= interrupt enabled, resume, IOPL = 0
> current process		= 44 (swi1: net)
> [thread 100044]
> Stopped at      m_copym+0x190:  incl    %ecx
<...>
> --------------------------------------------------------------------------------------------------------
> 
> It seems to me the problem is related to the network stack and threading.
> Am I right? How can I solve this problem?

I've seen reports of this problem both with and without debug.mpsafenet=1,
which suggests it is a network stack bug but not one specific to locking.
I've also seen reports that disabling TCP SACK makes the problem go away,
which would be good to confirm.  I spent the weekend building up some more
expertise in TCP and reading a lot of TCP code, and hope to look at this
problem in more detail today.  In the meantime, you may want to try turning
off TCP SACK by setting net.inet.tcp.sack.enable=0 in sysctl.conf (or
loader.conf).
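
As a rough sketch of that workaround (assuming the knob is writable at
runtime on your kernel, which is worth checking before relying on it), the
setting can be inspected and changed from a root shell:

    # show the knob's description and its current value
    sysctl -d net.inet.tcp.sack.enable
    sysctl net.inet.tcp.sack.enable
    # turn SACK off on the running system
    sysctl net.inet.tcp.sack.enable=0

To keep the setting across reboots, the same assignment would go into
/etc/sysctl.conf on a line of its own:

    net.inet.tcp.sack.enable=0

If the panic stops recurring with SACK disabled, that would help confirm
the reports mentioned above.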

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Principal Research Scientist, McAfee Research