From owner-freebsd-net@FreeBSD.ORG  Sun Oct 11 20:30:58 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 49B0E1065672;
	Sun, 11 Oct 2009 20:30:58 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 246F68FC0A;
	Sun, 11 Oct 2009 20:30:58 +0000 (UTC)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id C582546B09;
	Sun, 11 Oct 2009 16:30:57 -0400 (EDT)
Date: Sun, 11 Oct 2009 21:30:57 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Harsha Srinath <inpcb.harsha@gmail.com>
In-Reply-To: <e1b1c5880910111226o65e0d1a9va975f4cd837271bb@mail.gmail.com>
Message-ID: <alpine.BSF.2.00.0910112126050.48605@fledge.watson.org>
References: <e1b1c5880910111226o65e0d1a9va975f4cd837271bb@mail.gmail.com>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-current@freebsd.org, net@freebsd.org
Subject: Re: Page fault in IFNET_WLOCK_ASSERT [if.c and pccbb.c]
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 11 Oct 2009 20:30:58 -0000


On Sun, 11 Oct 2009, Harsha Srinath wrote:

> I'm running an updated HEAD kernel and got a page fault in 
> ifindex_alloc_locked() in if.c. I figured that the problem was caused by the 
> (pluggable) network card of my laptop and found that during the 
> initialization of the interface, cb_event_thread() takes the giant lock and 
> up the call chain in if_alloc(), we call IFNET_WLOCK() and assert on the RW 
> locks in ifindex_alloc_locked(). It is in the asset macro 
> IFNET_WLOCK_ASSERT() I see the page fault.
>
> I looked up some recent related changes and noticed the following comment in 
> one of the check-ins in- 
> http://svn.freebsd.org/viewvc/base/head/sys/net/if.c
>
> "Break out allocation of new ifindex values from if_alloc() and if_vmove(), 
> and centralize in a single function ifindex_alloc(). Assert the IFNET_WLOCK, 
> and add missing IFNET_WLOCK in if_alloc(). This does not close all known 
> races in this code."
>
> So I think I have hit one of those fault conditions.
>
> Apparently the giant lock code was removed and added back recently - 
> http://svn.freebsd.org/viewvc/base/head/sys/dev/pccbb/pccbb.c
>
> I believe that the root cause is that ifnet_rw is a non sleepable exclusive 
> RW lock and we have taken the exclusive sleep mutex Giant before that.
>
> Any pointers and suggestions are welcome.

Hi Harsha--

Giant is a bit special in that the long-term sleep code in the kernel knows to 
drop it when sleeping, and re-acquire when waking up.  So, unlike all other 
mutexes, it should be OK to hold it in this case, as Giant will simply get 
dropped if the kernel has to sleep waiting on a sleepable lock.  This is 
because, historically in FreeBSD 3.x/4.x, the kernel was protected by a single 
spinlock, which would get released whenever the kernel stopped executing, such 
as during an I/O sleep.  On the whole, Giant has disappeared from the modern 
kernel, but where it is used, it retains those curious historic properties.

To break things down a bit further, IFNET_WLOCK is, itself, a bit special: 
notice that in FreeBSD 8, it's actually two locks, a sleep lock, and a mutex, 
which must both be acquired exclusively to ensure mutual exclusion. 
if_alloc() and associated calls are also sleepable because they perform 
potentially sleeping memory allocation (M_WAITOK), so it's an invariant of any 
code calling interface allocation that it must be able to tolerate a sleep.

Do you have a copy of the stack trace and fault information handy?  In my 
experience, a NULL pointer deref or other page fault in the locking code for a 
global lock is almost always corrupted thread state, perhaps due to tripping 
over another thread having locked a corrupted/freed/uninitialized lock.  We 
might be able to track that down by tracing other threads that were in 
execution at the time of the panic.

Robert