From owner-freebsd-arch@FreeBSD.ORG  Sun May 28 23:04:49 2006
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
X-Original-To: freebsd-arch@FreeBSD.org
Delivered-To: freebsd-arch@FreeBSD.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id AFC3C16AF5B;
	Sun, 28 May 2006 23:01:01 +0000 (UTC)
	(envelope-from skip.ford@verizon.net)
Received: from vms046pub.verizon.net (vms046pub.verizon.net [206.46.252.46])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 6A5FC43D46;
	Sun, 28 May 2006 23:01:01 +0000 (GMT)
	(envelope-from skip.ford@verizon.net)
Received: from pool-70-17-33-65.pskn.east.verizon.net ([70.17.33.65])
	by vms046.mailsrvcs.net
	(Sun Java System Messaging Server 6.2-4.02 (built Sep  9 2005))
	with ESMTPA id <0IZZ00MIDZXOYM86@vms046.mailsrvcs.net>; Sun,
	28 May 2006 18:01:01 -0500 (CDT)
Date: Sun, 28 May 2006 19:00:59 -0400
From: Skip Ford <skip.ford@verizon.net>
To: Robert Watson <rwatson@FreeBSD.org>
Mail-followup-to: Robert Watson <rwatson@FreeBSD.org>, freebsd-atm@FreeBSD.org,
	freebsd-arch@FreeBSD.org
Message-id: <20060528230058.GA836@lucy.pool-70-17-33-65.pskn.east.verizon.net>
MIME-version: 1.0
Content-type: text/plain; charset=us-ascii
Content-disposition: inline
User-Agent: Mutt/1.4.2.1i
Cc: freebsd-atm@FreeBSD.org, freebsd-arch@FreeBSD.org
Subject: Locking netatm
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussion related to FreeBSD architecture <freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 28 May 2006 23:04:55 -0000

In my attempt to make netatm MPSAFE, I've run into just a couple
situations I'm not sure how to handle.  I've more-or-less replaced
the splnet() calls with a single subsystem lock.  Nearly all of
them were protecting structures within the netatm subsystem.  But
I'm not sure how to handle the existing splimp() calls.

There seem to be a few different reasons why splimp() calls were made.
First, for timing.  Three netatm functions use splimp() to protect
access to the list of atm_timeq structures (struct atm_time *).
Here is the usage from one of those three functions.  I've removed
much of the code to simplify what it does:

	s = splimp();

	FOREACH(atm_timeq);
		...

	... possibly modify structures within atm_timeq ...
	... modify the struct atm_time * passed to this function ...

	(void) splx(s);

So my question is, were network interrupts disabled when mucking
with the atm_timeq list because a generated interrupt can modify
structures within the list?  This use is probably very
netatm-specific.  I'm still studying the timeout code to
understand what it's doing.

A second situation where network interrupts were disabled was for
netatm memory allocation for devices:

in atm_dev_alloc()

	s = splimp();

	FOREACH(atm_mem_head)
		...
	malloc (...)

	(void) splx(s);

and in atm_dev_free()

	s = splimp();

	FOREACH(atm_mem_head)
		...
	free (...);

	(void) splx(s);

I'm not sure how these should be protected.  Presumably, we don't
want to receive interrupts until the netatm memory for the
device is allocated.  Would a global subsystem lock protect these
calls?  I can protect atm_mem_head, so maybe that'd be enough?

Another use is to protect calls to other subsystems.  For
example:

within atm_nif_attach(struct atm_nif *nip)

	ifp = nip->nif_ifp;

	s = splimp();

	if_attach(ifp);
	bpfattach(ifp, DLT_ATM_CLIP, T_ATM_LLC_MAX_LEN);

	(void) splx(s);
}

and within atm_nif_detach(struct atm_nif *nip)

	ifp = nip->nif_ifp;

	s = splimp();

	bpfdetach(ifp);
	if_detach(ifp);
	if_free(ifp);

	(void) splx(s);

Holding a new netatm subsystem lock won't protect those calls so
I'm not sure how to handle those.  Other non-netatm code in the
tree seems to not do any locking at all around those calls.

These are really the only uses I've yet to convert so if someone
can provide some pointers, I'd appreciate it.  I'm pretty new to
FreeBSD locking, either the old way or the new way.  I'm still
studying the code, including other network stacks and the netatm
stack itself, but a pointer or two would be appreciated.  I feel
like it's mostly converted, though I've done no testing at all
yet.  Once I finish removing splimp(), I can test with the single
subsystem lock, then move on to finer-grained locking where
necessary.

-- 
Skip