Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 29 Dec 2011 21:13:10 +0400
From:      Gleb Smirnoff <glebius@FreeBSD.org>
To:        John Baldwin <jhb@FreeBSD.org>
Cc:        Robert Watson <rwatson@FreeBSD.org>, net@FreeBSD.org
Subject:   Re: Transitioning if_addr_lock to an rwlock
Message-ID:  <20111229171310.GB12721@glebius.int.ru>
In-Reply-To: <201112291136.01889.jhb@freebsd.org>
References:  <201112221130.01823.jhb@freebsd.org> <20111227041728.GI8035@FreeBSD.org> <201112291136.01889.jhb@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, Dec 29, 2011 at 11:36:01AM -0500, John Baldwin wrote:
J> On Monday, December 26, 2011 11:17:28 pm Gleb Smirnoff wrote:
J> > On Thu, Dec 22, 2011 at 11:30:01AM -0500, John Baldwin wrote:
J> > J> You can find the patch for 8.x at 
J> > J> http://www.freebsd.org/~jhb/patches/if_addr_rwlock.patch
J> > 
J> > Just my two pennies: for head/ patching if ip_carp.c should
J> > be straightforward:
J> > 
J> > 1) Using W in carp_alloc_if() and carp_free_if().
J> > 2) Using R everywhere else.
J> 
J> Yes, this is what I did, but with an extra XXX:
J> 
J> @@ -1512,10 +1512,11 @@ carp_alloc_if(struct ifnet *ifp)
J>  	cif->cif_ifp = ifp;
J>  	TAILQ_INIT(&cif->cif_vrs);
J>  
J> -	IF_ADDR_LOCK(ifp);
J> +	/* XXX: Race, shouldn't this be checking for concurrent calls? */
J> +	IF_ADDR_WLOCK(ifp);
J>  	ifp->if_carp = cif;
J>  	if_ref(ifp);
J> -	IF_ADDR_UNLOCK(ifp);
J> +	IF_ADDR_WUNLOCK(ifp);
J>  
J>  	return (cif);
J>  
J> @@ -1534,10 +1535,10 @@ carp_free_if(struct carp_if *cif)
J>  	KASSERT(TAILQ_EMPTY(&cif->cif_vrs), ("%s: softc list not empty",
J>  	    __func__));
J>  
J> -	IF_ADDR_LOCK(ifp);
J> +	IF_ADDR_WLOCK(ifp);
J>  	ifp->if_carp = NULL;
J>  	if_rele(ifp);
J> -	IF_ADDR_UNLOCK(ifp);
J> +	IF_ADDR_WUNLOCK(ifp);
J>  
J>  	CIF_LOCK_DESTROY(cif);
J>  
J> 
J> Specifically, if two threads both call carp_alloc() at the same time and both 
J> see a value of NULL for ifp->if_carp (and I really do not like side effects 
J> like assignments in conditional expressions of if() and while()), then both
J> threads can call carp_if_alloc().  Instead, carp_if_alloc() should be doing 
J> something like this:
J> 
J> 	IF_ADDR_LOCK(ifp);
J> 	if (ifp->if_carp != NULL) {
J> 		CIF_LOCK_DESTROY(cif);
J> 		free(cif, M_CARP);
J> 		cif = ifp->if_carp;
J> 	} else
J> 		ifp->if_carp = cif;
J> 	IF_ADDR_UNLOCK(ifp);
J> 
J> 	return (cif);
J> 
J> Similarly, you have a race in the SIOCSVH ioctl handling code.  You check
J> for a duplicate carp device for a specific (ifnet, vhid), tuple, but 
J> carp_alloc() doesn't do a recheck when adding the new softc to the cif_vrs
J> list.  Rather, what it should do is move that loop into carp_alloc() while
J> holding the CIF_LOCK() and free the already-created softc and fail 
J> carp_alloc() if it finds a duplicate.
J> 
J> You also have a race between concurrent carp_alloc() and carp_destroy().  
J> Specifically, carp_alloc() might find a cif and be in the process of building 
J> a new carp softc when a carp_destroy() tears down the cif.  The right way to 
J> fix this is to add a reference count to the cif and have carp_alloc_if() 
J> always bump the reference count.  carp_destroy() would need to drop the 
J> reference count, but under IF_ADDR_LOCK() and only do a carp_free_if() if the 
J> count drops to zero.  You'd have to grab IF_ADDR_LOCK() in the caller and let
J> carp_free_if() unlock it internally.

I know all these races. When hacking on new CARP, I understood
that trying to avoid them would make code much more hairy. Also,
races between two threads that change configuration of networking
exist not in CARP only.

So I decided to make a sleepable serializer of the ioctl, like I
suggested on mail thread about if_cloners, and then you noticed that
sx_lock() is already invented and works better :)

So the question is still open, I think that we need some clear and
generic approach like serializing ifioctl(), instead of making zillions
of re-checking and rollbacking when we temporary drop some nerworking
lock for malloc(M_WAITOK).

As temporary measure we can use this serializing lock not in ifioctl()
but in more specific branches of ioctl() call tree, for example in
carp_ioctl().

-- 
Totus tuus, Glebius.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20111229171310.GB12721>