From owner-freebsd-current@FreeBSD.ORG  Tue Aug  4 15:18:07 2009
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id F1A7A106564A
	for <freebsd-current@freebsd.org>; Tue,  4 Aug 2009 15:18:07 +0000 (UTC)
	(envelope-from lstewart@freebsd.org)
Received: from lauren.room52.net (lauren.room52.net [210.50.193.198])
	by mx1.freebsd.org (Postfix) with ESMTP id 82A858FC19
	for <freebsd-current@freebsd.org>; Tue,  4 Aug 2009 15:18:07 +0000 (UTC)
	(envelope-from lstewart@freebsd.org)
Received: from lstewart-laptop.caia.swin.edu.au (c149.al.cl.cam.ac.uk
	[128.232.110.149]) (authenticated bits=0)
	by lauren.room52.net (8.14.3/8.14.3) with ESMTP id n74FHfdu011887
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Wed, 5 Aug 2009 01:17:57 +1000 (EST)
	(envelope-from lstewart@freebsd.org)
Message-ID: <4A785110.9060705@freebsd.org>
Date: Tue, 04 Aug 2009 16:17:36 +0100
From: Lawrence Stewart <lstewart@freebsd.org>
User-Agent: Thunderbird 2.0.0.22 (X11/20090722)
MIME-Version: 1.0
To: Kamigishi Rei <spambox@haruhiism.net>
References: <4A6F0A35.7050809@haruhiism.net> <4A724BA1.7050303@haruhiism.net>
In-Reply-To: <4A724BA1.7050303@haruhiism.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00,SPF_SOFTFAIL
	autolearn=disabled version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on lauren.room52.net
Cc: FreeBSD Current <freebsd-current@freebsd.org>
Subject: Re: [follow-up] FreeBSD/amd64 r195146 to r195848,
 fatal trap 12 under network load
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 04 Aug 2009 15:18:08 -0000

Kamigishi Rei wrote:
> Kamigishi Rei wrote:
>> Revisions mentioned are those which were tested by me; r195849+ has 
>> the corruption padded somewhere else so it might produce a panic with 
>> a different set of options. For reference, my test kernel uses a 
>> GENERIC config from May 09 snapshot without WITNESS and with 
>> IPFIREWALL, IPFIREWALL_DEFAULT_TO_ACCEPT and DEVICE_POLLING enabled.
> r195981 (latest checkout) traps with the *GENERIC* kernel (with WITNESS 
> enabled). Same backtrace, same cause, and, again, UP systems are not 
> affected.
> Apparently, my diagnostics patch from the previous message pads the 
> corruption somewhere, so I can't use it to check lo_witness or other 
> fields of nws_mtx at the time when mtx_lock gets corrupted.
> 
> Trap can be triggered with "ping -f -s 65507 localhost", iperf (just 
> "iperf -c localhost" works for me), or by generating some high-speed 
> network throughput (even a mysql query over localhost will do as we have 
> a race here). Running ping will mostly trigger the trap inside 
> swi_net(); iperf - inside netisr_queue_internal().
> 
> I would be grateful if someone could provide me with some information on 
> how to debug this further. Currently, I suspect that something is wrong 
> with the handling of modspace (an incorrect dereference somewhere, or 
> something like that).

For the benefit of the list, we've finally got this reproduced on a 
netperf cluster node after much gnashing of teeth. Stay tuned for updates.

Cheers,
Lawrence