From owner-svn-src-all@FreeBSD.ORG Fri Dec 23 03:59:50 2011 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 615F5106564A; Fri, 23 Dec 2011 03:59:50 +0000 (UTC) (envelope-from adrian@FreeBSD.org) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:4f8:fff6::2c]) by mx1.freebsd.org (Postfix) with ESMTP id 4C1568FC14; Fri, 23 Dec 2011 03:59:50 +0000 (UTC) Received: from svn.freebsd.org (localhost [127.0.0.1]) by svn.freebsd.org (8.14.4/8.14.4) with ESMTP id pBN3xoA9047396; Fri, 23 Dec 2011 03:59:50 GMT (envelope-from adrian@svn.freebsd.org) Received: (from adrian@localhost) by svn.freebsd.org (8.14.4/8.14.4/Submit) id pBN3xoFh047394; Fri, 23 Dec 2011 03:59:50 GMT (envelope-from adrian@svn.freebsd.org) Message-Id: <201112230359.pBN3xoFh047394@svn.freebsd.org> From: Adrian Chadd Date: Fri, 23 Dec 2011 03:59:50 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org X-SVN-Group: head MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cc: Subject: svn commit: r228832 - head/sys/dev/ath X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Dec 2011 03:59:50 -0000 Author: adrian Date: Fri Dec 23 03:59:49 2011 New Revision: 228832 URL: http://svn.freebsd.org/changeset/base/228832 Log: Rework this ugly mess that tries to handle reset serialisation. Some users were reporting concurrent resets _were_ occuring - ie, either two ath_reset()s ran at the same time (likely one on each CPU) or ath_reset() versus ath_chan_change(). Instead, this now tries to grab the serialisation semaphore and will pause() for a while if it fails. It will always eventually succeed though and will log an error if it hits the recursion situation. All of this stuff needs to die a horrible death at some point and be replaced with a properly serialising method of programming this stuff (eg using the net80211 taskqueue for all of this stuff.) The trouble is figuring out how to handle the concurrent ioctl() based things without introducing more LORs (which is another reason why I haven't just wrapped all of this stuff in large, long-lived locks, a-la what Linux can get away with.) MFC after: Absolutely, positively never. Modified: head/sys/dev/ath/if_ath.c Modified: head/sys/dev/ath/if_ath.c ============================================================================== --- head/sys/dev/ath/if_ath.c Fri Dec 23 02:57:18 2011 (r228831) +++ head/sys/dev/ath/if_ath.c Fri Dec 23 03:59:49 2011 (r228832) @@ -1898,6 +1898,70 @@ ath_txrx_start(struct ath_softc *sc) taskqueue_unblock(sc->sc_tq); } +/* + * Grab the reset lock, and wait around until noone else + * is trying to do anything with it. + * + * This is totally horrible but we can't hold this lock for + * long enough to do TX/RX or we end up with net80211/ip stack + * LORs and eventual deadlock. + * + * "dowait" signals whether to spin, waiting for the reset + * lock count to reach 0. This should (for now) only be used + * during the reset path, as the rest of the code may not + * be locking-reentrant enough to behave correctly. + * + * Another, cleaner way should be found to serialise all of + * these operations. + */ +#define MAX_RESET_ITERATIONS 10 +static int +ath_reset_grablock(struct ath_softc *sc, int dowait) +{ + int w = 0; + int i = MAX_RESET_ITERATIONS; + + ATH_PCU_LOCK_ASSERT(sc); + do { + if (sc->sc_inreset_cnt == 0) { + w = 1; + break; + } + if (dowait == 0) { + w = 0; + break; + } + ATH_PCU_UNLOCK(sc); + pause("ath_reset_grablock", 1); + i--; + ATH_PCU_LOCK(sc); + } while (i > 0); + + /* + * We always increment the refcounter, regardless + * of whether we succeeded to get it in an exclusive + * way. + */ + sc->sc_inreset_cnt++; + + if (i <= 0) + device_printf(sc->sc_dev, + "%s: didn't finish after %d iterations\n", + __func__, MAX_RESET_ITERATIONS); + + if (w == 0) + device_printf(sc->sc_dev, + "%s: warning, recursive reset path!\n", + __func__); + + return w; +} +#undef MAX_RESET_ITERATIONS + +/* + * XXX TODO: write ath_reset_releaselock + */ + static void ath_stop(struct ifnet *ifp) { @@ -1926,18 +1990,15 @@ ath_reset(struct ifnet *ifp, ATH_RESET_T DPRINTF(sc, ATH_DEBUG_RESET, "%s: called\n", __func__); - /* XXX ensure ATH_LOCK isn't held; ath_rx_proc can't be locked */ + /* Ensure ATH_LOCK isn't held; ath_rx_proc can't be locked */ ATH_PCU_UNLOCK_ASSERT(sc); ATH_UNLOCK_ASSERT(sc); ATH_PCU_LOCK(sc); - /* XXX if we're already inside a reset, print out a big warning */ - if (sc->sc_inreset_cnt > 0) { - device_printf(sc->sc_dev, - "%s: concurrent ath_reset()! Danger!\n", + if (ath_reset_grablock(sc, 1) == 0) { + device_printf(sc->sc_dev, "%s: concurrent reset! Danger!\n", __func__); } - sc->sc_inreset_cnt++; ath_hal_intrset(ah, 0); /* disable interrupts */ ATH_PCU_UNLOCK(sc); @@ -5250,10 +5311,10 @@ ath_chan_set(struct ath_softc *sc, struc /* Treat this as an interface reset */ ATH_PCU_LOCK(sc); - if (sc->sc_inreset_cnt > 0) - device_printf(sc->sc_dev, "%s: danger! concurrent reset!\n", + if (ath_reset_grablock(sc, 1) == 0) { + device_printf(sc->sc_dev, "%s: concurrent reset! Danger!\n", __func__); - sc->sc_inreset_cnt++; + } if (chan != sc->sc_curchan) { dointr = 1; /* XXX only do this if inreset_cnt is 1? */