From owner-freebsd-net@FreeBSD.ORG Fri Jan 6 20:39:57 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9BB51106564A; Fri, 6 Jan 2012 20:39:57 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 604FD8FC17; Fri, 6 Jan 2012 20:39:57 +0000 (UTC) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [96.47.65.170]) by cyrus.watson.org (Postfix) with ESMTPSA id 12DC546B2E; Fri, 6 Jan 2012 15:39:57 -0500 (EST) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 94F12B971; Fri, 6 Jan 2012 15:39:56 -0500 (EST) From: John Baldwin To: freebsd-net@freebsd.org Date: Fri, 6 Jan 2012 15:39:55 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p10; KDE/4.5.5; amd64; ; ) References: <201112221130.01823.jhb@freebsd.org> <201112291527.26763.jhb@freebsd.org> In-Reply-To: <201112291527.26763.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201201061539.55984.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Fri, 06 Jan 2012 15:39:56 -0500 (EST) Cc: "Bjoern A. Zeeb" , Robert Watson Subject: Re: Transitioning if_addr_lock to an rwlock X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Jan 2012 20:39:57 -0000 On Thursday, December 29, 2011 3:27:26 pm John Baldwin wrote: > On Thursday, December 22, 2011 11:30:01 am John Baldwin wrote: > > Another bit of lock contention I ran into between a device driver doing slow > > MAC filter updates and the receive path is IF_ADDR_LOCK(). NIC device drivers > > typically hold this lock while iterating the list of multicast addresses to > > program their MAC filter. OTOH, ip_input() uses this lock to check to see if > > an incoming packet is a broadcast packet or not. So even with the pcbinfo > > contention from my previous patch addressed, I still ran into a problem with > > IF_ADDR_LOCK(). We already have some partial support for making this lock be > > an rwlock (the APIs that drivers now use implies an rlock), so I finished the > > transition and checked various non-driver users to see which ones could use a > > read lock (most uses can). The current patch I have for this is on 8, but if > > folks think this is a good idea I can work on porting it to HEAD. For HEAD > > the strategy I would use would be to split this up into 2 phases: > > > > 1) Add IF_ADDR locking macros to differentiate read locks vs write locks along > > with appropriate assertion macros. Update current users of the locking > > macros to use either read or write locks explicitly. To preserve KPI, > > the existing LOCK/UNLOCK macros would map to write lock operations. In > > the initial patch, the locking would still be implemented via a mtx. > > > > 2) Replace the mutex with an rwlock and update the locking macros as > > appropriate. > > > > Phase 1 should definitely be MFC'able. Phase 2 may or may not be. Robert had > > the foresight to change drivers to use explicit function wrappers around > > IF_ADDR_LOCK, and sizeof(struct mtx) == sizeof(struct rwlock), so if we > > changed the lock type the KBI for existing device drivers would all be fine. > > Most of the remaining uses of the locking macros are in parts of the kernel > > that aren't loadable (such as inet and inet6). We can look at the places that > > to do change and if any of them are in kld's then it would be up to re@ to > > decide if 2) was actually safe to merge. However, even if Phase 2 cannot be > > MFC'd, having phase 1 makes it easier for downstream consumers to apply Phase > > 2 locally if they need it. > > I've gone ahead with this approach. I have three separate patches that should > implement Phase 1. All of them can be found at > http://www.FreeBSD.org/~jhb/patches/ > > - if_addr_dev.patch This fixes a few new device drivers that were using > the locking macros directly rather than the wrapper > functions Robert added. I've already sent this > directly to the relevant driver maintainers for their > review. > - if_addr_macros.patch This adds new locking macros to support read locks vs > write locks. However, they all still map to mutex > operations. > - if_addr_uses.patch This changes callers of the existing macros to use > either read or write locks. This is the patch that > could use the most review. Now that all of this is in the tree, here is the small patch to cut the locks over to rwlocks rather than mutexes: Index: kern/subr_witness.c =================================================================== --- kern/subr_witness.c (revision 229726) +++ kern/subr_witness.c (working copy) @@ -520,7 +520,7 @@ { "udpinp", &lock_class_rw }, { "in_multi_mtx", &lock_class_mtx_sleep }, { "igmp_mtx", &lock_class_mtx_sleep }, - { "if_addr_mtx", &lock_class_mtx_sleep }, + { "if_addr_lock", &lock_class_rw }, { NULL, NULL }, /* * IPv6 multicast: @@ -529,7 +529,7 @@ { "udpinp", &lock_class_rw }, { "in6_multi_mtx", &lock_class_mtx_sleep }, { "mld_mtx", &lock_class_mtx_sleep }, - { "if_addr_mtx", &lock_class_mtx_sleep }, + { "if_addr_lock", &lock_class_rw }, { NULL, NULL }, /* * UNIX Domain Sockets Index: net/if_var.h =================================================================== --- net/if_var.h (revision 229726) +++ net/if_var.h (working copy) @@ -189,7 +189,7 @@ int if_afdata_initialized; struct rwlock if_afdata_lock; struct task if_linktask; /* task for link change events */ - struct mtx if_addr_mtx; /* mutex to protect address lists */ + struct rwlock if_addr_lock; /* lock to protect address lists */ LIST_ENTRY(ifnet) if_clones; /* interfaces of a cloner */ TAILQ_HEAD(, ifg_list) if_groups; /* linked list of groups per if */ @@ -246,15 +246,14 @@ /* * Locks for address lists on the network interface. */ -#define IF_ADDR_LOCK_INIT(if) mtx_init(&(if)->if_addr_mtx, \ - "if_addr_mtx", NULL, MTX_DEF) -#define IF_ADDR_LOCK_DESTROY(if) mtx_destroy(&(if)->if_addr_mtx) -#define IF_ADDR_WLOCK(if) mtx_lock(&(if)->if_addr_mtx) -#define IF_ADDR_WUNLOCK(if) mtx_unlock(&(if)->if_addr_mtx) -#define IF_ADDR_RLOCK(if) mtx_lock(&(if)->if_addr_mtx) -#define IF_ADDR_RUNLOCK(if) mtx_unlock(&(if)->if_addr_mtx) -#define IF_ADDR_LOCK_ASSERT(if) mtx_assert(&(if)->if_addr_mtx, MA_OWNED) -#define IF_ADDR_WLOCK_ASSERT(if) mtx_assert(&(if)->if_addr_mtx, MA_OWNED) +#define IF_ADDR_LOCK_INIT(if) rw_init(&(if)->if_addr_lock, "if_addr_lock") +#define IF_ADDR_LOCK_DESTROY(if) rw_destroy(&(if)->if_addr_lock) +#define IF_ADDR_WLOCK(if) rw_wlock(&(if)->if_addr_lock) +#define IF_ADDR_WUNLOCK(if) rw_wunlock(&(if)->if_addr_lock) +#define IF_ADDR_RLOCK(if) rw_rlock(&(if)->if_addr_lock) +#define IF_ADDR_RUNLOCK(if) rw_runlock(&(if)->if_addr_lock) +#define IF_ADDR_LOCK_ASSERT(if) rw_assert(&(if)->if_addr_lock, RA_LOCKED) +#define IF_ADDR_WLOCK_ASSERT(if) rw_assert(&(if)->if_addr_lock, RA_WLOCKED) /* XXX: Compat. */ #define IF_ADDR_LOCK(if) IF_ADDR_WLOCK(if) #define IF_ADDR_UNLOCK(if) IF_ADDR_WUNLOCK(if) -- John Baldwin