From owner-freebsd-bugs@FreeBSD.ORG Sun Jan 22 19:50:10 2012
Resent-Date: Sun, 22 Jan 2012 19:50:10 GMT
Resent-Message-Id: <201201221950.q0MJoAhg000731@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Adrian Chadd
Message-Id: <201201221945.q0MJjUjI046358@red.freebsd.org>
Date: Sun, 22 Jan 2012 19:45:30 GMT
From: Adrian Chadd
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.1
Subject: misc/164382: [ath] crash when down/deleting a vap - inside ieee80211_input_mimo_all()

>Number: 164382
>Category: misc
>Synopsis: [ath] crash when down/deleting a vap - inside ieee80211_input_mimo_all()
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sun Jan 22 19:50:10 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator: Adrian Chadd
>Release: 9.0-RC2, with -HEAD ath/net80211
>Organization:
FreeBSD
>Environment:
FreeBSD marilyn 9.0-RC3-p1 FreeBSD 9.0-RC3-p1 #4: Sat Jan 21 20:56:40 PST 2012 root@marilyn:/usr/src/sys/i386/compile/MARILYN i386
>Description:
I saw a crash inside the net80211 stack when either deleting or down'ing a vap.

Unread portion of the kernel message buffer:
KDB: stack backtrace:
#0 0xc0727697 at kdb_backtrace+0x47
#1 0xc073b675 at _witness_debugger+0x25
#2 0xc073cb8e at witness_warn+0x1fe
#3 0xc095e465 at trap+0x195
#4 0xc09478ac at calltrap+0x6
#5 0xc77e2bf1 at ieee80211_free_node_debug+0xb1
#6 0xc77ce017 at ieee80211_input_mimo_all+0xe7
#7 0xc77cdf22 at ieee80211_input_all+0x32
#8 0xc784dcc5 at ath_rx_proc+0xc45
#9 0xc784d071 at ath_rx_tasklet+0x101
#10 0xc073446b at taskqueue_run_locked+0xeb
#11 0xc0734ec7 at taskqueue_thread_loop+0x67
#12 0xc06c76b8 at fork_exit+0xb8
#13 0xc0947924 at fork_trampoline+0x8

The debugging indicated something rather amusing at this point.

ath0: ath_node_alloc: an 0xc7adf000
ieee80211_ref_node: 0xc7adf000: ieee80211_reset_bss /usr/home/adrian/work/freebsd/ath/head/src/sys/modules/wlan/../../net80211/ieee80211_node.c:434
wlan0: Ethernet address: 00:03:7f:11:a3:f3
ath0: ath_init: if_flags 0x8803
ath0: ath_stop_locked: invalid 0 if_flags 0x8803
ath0: ath_newstate: INIT -> INIT
ath0: ath_newstate: RX filter 0x6497 bssid 00:00:00:00:00:00 aid 0x0
ath0: ath_newstate: INIT -> SCAN
ath0: ath_newstate: RX filter 0x6497 bssid 00:00:00:00:00:00 aid 0x0
ath0: ath_node_alloc: an 0xc7aea000

Now, at this point, there are two sets of messages which overlap, indicating that they ran concurrently:

ieee80211_ref_node: 0xc7aea000: ieee80211_create_ibss /usr/home/adrian/work/freebsd/ath/head/src/sys/modules/wlan/../../net802
ieee80211_ref_node: 0xc7adf000: ieee80211_input_mimo_all /usr/home/adrian/work/freebsd/ath/head/src/sys/modules/wlan/../../net
11/ieee80211_node.c:412
80211/ieee80211_input.c:143
ath0: ath_node_free: ni 0xc7adf000

And bang:

Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex ath0_node_lock (ath0_node_lock) r = 0 (0xc79316c0) locked @ /usr/home/adrian/work/freebsd/ath/head/src/sys/modules/wlan/../../net80211/ieee80211_node.c:1702

I'm gathering here that the delete was ongoing whilst traffic was being processed via ath_rx_tasklet(), and the underlying vap was either deleted or the vap->iv_bss node was changed.

There seems to be a larger class of bugs where the vap->iv_bss node is changed in parallel with some other process (e.g. beacon free/alloc) without suitable locking.
>How-To-Repeat:
It's difficult to reproduce. I reproduced it in a lab environment with lots of busy air. I guess anything that triggers constant incoming traffic and keeps the RX queue deep is going to make triggering this bug more likely.

What needs to happen:

* ath_rx_tasklet() needs to take a while to run;
* the ifconfig process (and net80211 taskqueue) needs to be scheduled on another CPU, so it can run _in parallel_ with the ath taskqueue (which ath_rx_tasklet() runs in);
* somehow you have to get a vap down/delete in during this RX.
>Fix:
I think the RX path should be properly aborted during a vap down/delete. This doesn't just mean stopping the hardware (which is what ath_stop_locked() currently does) but also waiting for the ath_rx_tasklet() and the TX completion tasklet to complete.
>Release-Note:
>Audit-Trail:
>Unformatted:
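
The ordering suggested under >Fix: can be sketched in user space. This is only an illustration, assuming POSIX threads as a stand-in for the ath taskqueue. Every demo_* name is hypothetical, and none of this is ath or net80211 code. In the kernel, the "wait for the tasklet" step would presumably map onto something like taskqueue_drain(9) for the RX and TX completion tasks; that mapping is an assumption, not a tested patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the vap->iv_bss node; not the net80211 struct. */
struct demo_node {
	unsigned long rx_frames;
};

/* Hypothetical stand-in for the driver softc. */
struct demo_softc {
	pthread_t rx_thread;	/* plays the role of the ath taskqueue thread */
	atomic_bool stopping;	/* set by teardown to abort the RX path */
	struct demo_node *bss;	/* shared node that the RX path dereferences */
};

/* Stand-in for ath_rx_tasklet(): keeps touching sc->bss until told to stop. */
static void *
demo_rx_loop(void *arg)
{
	struct demo_softc *sc = arg;

	while (!atomic_load(&sc->stopping))
		sc->bss->rx_frames++;
	return (NULL);
}

/* Allocate the node and start the RX worker. Returns 0 on success. */
static int
demo_start(struct demo_softc *sc)
{
	sc->bss = calloc(1, sizeof(*sc->bss));
	if (sc->bss == NULL)
		return (-1);
	atomic_init(&sc->stopping, false);
	if (pthread_create(&sc->rx_thread, NULL, demo_rx_loop, sc) != 0) {
		free(sc->bss);
		sc->bss = NULL;
		return (-1);
	}
	return (0);
}

/* Teardown in the order proposed in the report: stop, drain, then free. */
static void
demo_stop(struct demo_softc *sc)
{
	/* 1. Tell the RX path to stop picking up new work. */
	atomic_store(&sc->stopping, true);
	/* 2. Wait for any in-flight RX pass to finish (the drain step). */
	pthread_join(sc->rx_thread, NULL);
	/* 3. Only now is nobody else able to touch the node. */
	assert(sc->bss != NULL);
	free(sc->bss);
	sc->bss = NULL;
}

/* Runs one start/stop cycle; returns 0 if teardown completed cleanly. */
int
demo_teardown_ok(void)
{
	struct demo_softc sc;

	if (demo_start(&sc) != 0)
		return (-1);
	demo_stop(&sc);
	return (sc.bss == NULL ? 0 : -1);
}
```

Swapping steps 2 and 3 in demo_stop() gives the same shape as the crash above: the worker can still be inside its loop, dereferencing the node, while the teardown path frees it.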