From owner-freebsd-net@FreeBSD.ORG Tue Apr 22 17:40:01 2008 Return-Path: Delivered-To: freebsd-net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D1F67106564A; Tue, 22 Apr 2008 17:40:01 +0000 (UTC) (envelope-from brooks@lor.one-eyed-alien.net) Received: from lor.one-eyed-alien.net (cl-162.ewr-01.us.sixxs.net [IPv6:2001:4830:1200:a1::2]) by mx1.freebsd.org (Postfix) with ESMTP id 6D87F8FC20; Tue, 22 Apr 2008 17:40:01 +0000 (UTC) (envelope-from brooks@lor.one-eyed-alien.net) Received: from lor.one-eyed-alien.net (localhost [127.0.0.1]) by lor.one-eyed-alien.net (8.14.2/8.14.2) with ESMTP id m3MHeDxH033078; Tue, 22 Apr 2008 12:40:13 -0500 (CDT) (envelope-from brooks@lor.one-eyed-alien.net) Received: (from brooks@localhost) by lor.one-eyed-alien.net (8.14.2/8.14.2/Submit) id m3MHeDDv033077; Tue, 22 Apr 2008 12:40:13 -0500 (CDT) (envelope-from brooks) Date: Tue, 22 Apr 2008 12:40:13 -0500 From: Brooks Davis To: vijay singh Message-ID: <20080422174013.GA31959@lor.one-eyed-alien.net> References: <340035.20572.qm@web33502.mail.mud.yahoo.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="UugvWAfsgieZRqgk" Content-Disposition: inline In-Reply-To: <340035.20572.qm@web33502.mail.mud.yahoo.com> User-Agent: Mutt/1.5.17 (2007-11-01) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-3.0 (lor.one-eyed-alien.net [127.0.0.1]); Tue, 22 Apr 2008 12:40:13 -0500 (CDT) Cc: freebsd-net@FreeBSD.org, Brooks Davis , Robert Watson Subject: Re: Regarding if_alloc() X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Apr 2008 17:40:01 -0000 --UugvWAfsgieZRqgk Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Apr 22, 2008 at 10:38:31AM -0700, vijay singh wrote: >=20 >=20 > ----- Original Message ---- > From: Robert Watson > To: Brooks Davis > Cc: vijay singh ; freebsd-net@freebsd.org > Sent: Sunday, April 20, 2008 2:27:15 AM > Subject: Re: Regarding if_alloc() >=20 >=20 > On Fri, 18 Apr 2008, Brooks Davis wrote: >=20 > > On Thu, Apr 17, 2008 at 06:35:23PM -0700, vijay singh wrote: > >> Hi all. How do we avoid a race in populating the ifindex_table? Id thi= s is=20 > >> a TODO, as it seems from the code below, would it be acceptable if I w= rote=20 > >> a patch and reused the ifnet_lock [IFNET_WLOCK, IFNET_WUNLOCK]? > > > > Locking if_index management with ifnet_lock should be ok. Ideally we s= hould=20 > > probably be using ALLOC_UNR(9) to manage if_indexes instead of this rat= her=20 > > expensive loop. > > > > Be aware, that if_index generation is least of the issues in this area.= The=20 > > if_grow() call is much riskier since it changes the value of the global= =20 > > ifnet pointer which I'm not sure we can afford to lock. It would be wo= rth=20 > > experimenting with rmlocks to see what the impact if of locking would b= e. > > > > I'm serious tempted to kill if_grow in favor of some sort of if_index_m= ax=20 > > tunable. >=20 > I've seen a number of reports of panics that may well be traceable to rac= es in=20 > if_index handling, and have looked a bit at possible fixes. Quite a few = uses=20 > of if_index are inherently racy, as they rely on stability of the index v= alue,=20 > which of course can't be guaranteed with removable interfaces. >=20 > I think a reasonable interim fix would be to protect all use of the byind= ex=20 > arrays using the ifnet lock, but remember that the read methods, not just= the=20 > write methods, need protection, and as such should move from being macros= in=20 > if_var.h to functions in if.c. if_grow is probably OK if this is done ri= ght,=20 > but it will need to be set up to drop the lock, grow, re-aquire, and=20 > re-validate assumptions (i.e., repeat the search for a free index and loo= p if=20 > it fails to find one). Once the read methods are using the lock also, we= =20 > should seriously consider converting it to an rwlock. We can probably al= so=20 > un-publicize at least one of the byindex lookup routines (the dev lookup,= =20 > which is needed only in if.c). >=20 > This would prevent races on modifying and evaluating the index array, but= not=20 > on disappearing cdevs and ifnets, which are a separate problem, and one t= hat=20 > probably is exercised significanty less. The reports I've seen appear to= have=20 > only to do with pulling the array out from other consumers while in use. >=20 >=20 > >> >=20 > Robert, I am working on the patch, but I had a few questions. If we > drop the lock while if_grow() is running, how do we prevent a second > caller, blocked in if_alloc() to start scanning the ifindex_table, and > end up deciding to grow it again. In other words, we need to stall > the ifindex_table scan in if_alloc() till if_grow() has achieved > finished. Since if_grow() will be called relatively infrequently, > should we consider holding the lock through the routine? Your comments > and suggestions are welcome. You can't hold locks over malloc calls. -- Brooks --UugvWAfsgieZRqgk Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.8 (FreeBSD) iD8DBQFIDiL8XY6L6fI4GtQRApcjAKDTm1GAm9tEPH6EL/jolHaUEim7OwCguqWB UeYxLu9bgMC7YRaV8g772yo= =kqLE -----END PGP SIGNATURE----- --UugvWAfsgieZRqgk--