Date: Sat, 27 Apr 2013 19:52:16 -0700
From: Adrian Chadd <adrian@freebsd.org>
To: freebsd-wireless@freebsd.org
Subject: net80211 and lock-ordering issues..
Message-ID: <CAJ-VmomP-j_zF8jPJfeCnMP5sLRZFh3ee5qkib_doh1crt-A_Q@mail.gmail.com>

Hi,

My recent TX path locking work in net80211 has had time to bake a bit, and I've logged all the LORs I've seen in STA/AP modes. Unfortunately there are a lot.

I'm going to try to fix the ones I've seen so far. I'm worried, though, that there are some fundamental design issues here which are going to take a _long_ time to fix up. I'd like some design suggestions, please.

I'm tempted to move the ic and vap TX paths over to run out of taskqueues (as I originally planned) and just bite the overhead for now. It simplifies this locking a lot and will make it much easier to tidy up the rest of the code.

E.g., if a buffer can't be sent and it's freed as part of the TX path:

Apr 25 01:48:38 lucy kernel: lock order reversal:
Apr 25 01:48:38 lucy kernel: 1st 0xcf2096e0 ath0 TX lock (ath0 TX lock) @ sys/dev/ath/if_ath_misc.h:135
Apr 25 01:48:38 lucy kernel: 2nd 0xcf22b6f4 ath0_node_lock (ath0_node_lock) @ /usr/home/adrian/work/freebsd/net80211_tx/head/src/sys/modules/wlan/../../net80211/ieee80211_node.c:1768
Apr 25 01:48:38 lucy kernel: KDB: stack backtrace:
Apr 25 01:48:38 lucy kernel: #0 0xc0734d1f at kdb_backtrace+0x4f
Apr 25 01:48:38 lucy kernel: #1 0xc074aa85 at _witness_debugger+0x25
Apr 25 01:48:38 lucy kernel: #2 0xc074bd7f at witness_checkorder+0x86f
Apr 25 01:48:38 lucy kernel: #3 0xc06e9de4 at _mtx_lock_flags+0xc4
Apr 25 01:48:38 lucy kernel: #4 0xcf19e0d6 at ieee80211_free_node_debug+0xa6
Apr 25 01:48:38 lucy kernel: #5 0xd0b2aadd at ath_start+0x53d
Apr 25 01:48:38 lucy kernel: #6 0xd0b299fb at ath_tx_kick+0x3b
Apr 25 01:48:38 lucy kernel: #7 0xd0b2a19f at ath_start_queue+0x8f
Apr 25 01:48:38 lucy kernel: #8 0xc07b6c42 at if_start+0x12
Apr 25 01:48:38 lucy kernel: #9 0xc07b750f at if_transmit+0x13f
Apr 25 01:48:38 lucy kernel: #10 0xcf185d5c at ieee80211_parent_transmit+0x4c
Apr 25 01:48:38 lucy kernel: #11 0xcf1a0ffa at ieee80211_start_pkt+0x74a
Apr 25 01:48:38 lucy kernel: #12 0xcf1a1515 at ieee80211_start+0x335
Apr 25 01:48:38 lucy kernel: #13 0xc07b6c42 at if_start+0x12
Apr 25 01:48:38 lucy kernel: #14 0xc07b750f at if_transmit+0x13f
Apr 25 01:48:38 lucy kernel: #15 0xc1104d41 at bridge_enqueue+0x31
Apr 25 01:48:38 lucy kernel: #16 0xc11079db at bridge_forward+0x2eb
Apr 25 01:48:38 lucy kernel: #17 0xc1107f77 at bridge_input+0x527

My power save work has also introduced some, e.g. if the TIM bit is fiddled with:

Apr 25 02:58:44 lucy kernel: lock order reversal:
Apr 25 02:58:44 lucy kernel: 1st 0xcf2096e0 ath0 TX lock (ath0 TX lock) @ /usr/home/adrian/work/freebsd/net80211_tx/head/src/sys/modules/ath/../../dev/ath/if_ath.c:3796
Apr 25 02:58:44 lucy kernel: 2nd 0xcf22a014 ath0_com_lock (ath0_com_lock) @ /usr/home/adrian/work/freebsd/net80211_tx/head/src/sys/modules/wlan/../../net80211/ieee80211_power.c:297
Apr 25 02:58:44 lucy kernel: KDB: stack backtrace:
Apr 25 02:58:44 lucy kernel: #0 0xc0734d1f at kdb_backtrace+0x4f
Apr 25 02:58:44 lucy kernel: #1 0xc074aa85 at _witness_debugger+0x25
Apr 25 02:58:44 lucy kernel: #2 0xc074bd7f at witness_checkorder+0x86f
Apr 25 02:58:44 lucy kernel: #3 0xc06e9de4 at _mtx_lock_flags+0xc4
Apr 25 02:58:44 lucy kernel: #4 0xcf1a8e18 at ieee80211_set_tim+0xd8
Apr 25 02:58:44 lucy kernel: #5 0xd0b32b5d at ath_tx_update_tim+0x20d
Apr 25 02:58:44 lucy kernel: #6 0xd0b2d0c4 at ath_tx_default_comp+0xe4
Apr 25 02:58:44 lucy kernel: #7 0xd0b430ea at ath_tx_aggr_retry_unaggr+0x25a
Apr 25 02:58:44 lucy kernel: #8 0xd0b44b99 at ath_tx_aggr_comp_unaggr+0x519
Apr 25 02:58:44 lucy kernel: #9 0xd0b44d6b at ath_tx_aggr_comp+0x4b
Apr 25 02:58:44 lucy kernel: #10 0xd0b2d313 at ath_tx_process_buf_completion+0x123
Apr 25 02:58:44 lucy kernel: #11 0xd0b2d905 at ath_tx_processq+0x5e5
Apr 25 02:58:44 lucy kernel: #12 0xd0b2df20 at ath_tx_proc_q0123+0x170
Apr 25 02:58:44 lucy kernel: #13 0xc07435bb at taskqueue_run_locked+0xeb
Apr 25 02:58:44 lucy kernel: #14 0xc0744027 at taskqueue_thread_loop+0x67
Apr 25 02:58:44 lucy kernel: #15 0xc06cb5c2 at fork_exit+0x112
Apr 25 02:58:44 lucy kernel: #16 0xc09715e4 at fork_trampoline+0x8

And some existing ones:

Apr 25 03:04:05 lucy kernel: lock order reversal:
Apr 25 03:04:05 lucy kernel: 1st 0xcf22b6f4 ath0_node_lock (ath0_node_lock) @ /usr/home/adrian/work/freebsd/net80211_tx/head/src/sys/modules/wlan/../../net80211/ieee80211_ioctl.c:1341
Apr 25 03:04:05 lucy kernel: 2nd 0xcf22a038 ath0_tx_lock (ath0_tx_lock) @ /usr/home/adrian/work/freebsd/net80211_tx/head/src/sys/modules/wlan/../../net80211/ieee80211_output.c:719
Apr 25 03:04:05 lucy kernel: KDB: stack backtrace:
Apr 25 03:04:05 lucy kernel: #0 0xc0734d1f at kdb_backtrace+0x4f
Apr 25 03:04:05 lucy kernel: #1 0xc074aa85 at _witness_debugger+0x25
Apr 25 03:04:05 lucy kernel: #2 0xc074bd7f at witness_checkorder+0x86f
Apr 25 03:04:05 lucy kernel: #3 0xc06e9de4 at _mtx_lock_flags+0xc4
Apr 25 03:04:05 lucy kernel: #4 0xcf1a20be at ieee80211_mgmt_output+0x24e
Apr 25 03:04:05 lucy kernel: #5 0xcf1a6043 at ieee80211_send_mgmt+0xd03
Apr 25 03:04:05 lucy kernel: #6 0xcf18c55f at domlme+0x8f
Apr 25 03:04:05 lucy kernel: #7 0xcf18c64e at setmlme_dropsta+0xae
Apr 25 03:04:05 lucy kernel: #8 0xcf18c7ab at setmlme_common+0xeb
Apr 25 03:04:05 lucy kernel: #9 0xcf18cf62 at ieee80211_ioctl_setmlme+0x112
Apr 25 03:04:05 lucy kernel: #10 0xcf18fc14 at ieee80211_ioctl_set80211+0x7f4
Apr 25 03:04:05 lucy kernel: #11 0xcf19130e at ieee80211_ioctl+0x25e
Apr 25 03:04:05 lucy kernel: #12 0xc07d66a9 at in_control+0x1e9
Apr 25 03:04:05 lucy kernel: #13 0xc07bd93f at ifioctl+0x1a7f
Apr 25 03:04:05 lucy kernel: #14 0xc0754655 at soo_ioctl+0x415
Apr 25 03:04:05 lucy kernel: #15 0xc074e6ad at kern_ioctl+0x21d
Apr 25 03:04:05 lucy kernel: #16 0xc074e834 at sys_ioctl+0x134
Apr 25 03:04:05 lucy kernel: #17 0xc0987e80 at syscall+0x380

There are also some LORs with the bridging code:

Apr 25 17:31:59 lucy kernel: lock order reversal:
Apr 25 17:31:59 lucy kernel: 1st 0xcecb56f4 ath0_node_lock (ath0_node_lock) @ /usr/home/adrian/work/freebsd/net80211_tx/head/src/sys/modules/wlan/../../net80211/ieee80211_node.c:1416
Apr 25 17:31:59 lucy kernel: 2nd 0xcef0460c if_bridge (if_bridge) @ /usr/home/adrian/work/freebsd/stable/9/sys/modules/if_bridge/../../net/if_bridge.c:2211
Apr 25 17:31:59 lucy kernel: KDB: stack backtrace:
Apr 25 17:31:59 lucy kernel: #0 0xc0734d1f at kdb_backtrace+0x4f
Apr 25 17:31:59 lucy kernel: #1 0xc074aa85 at _witness_debugger+0x25
Apr 25 17:31:59 lucy kernel: #2 0xc074bd7f at witness_checkorder+0x86f
Apr 25 17:31:59 lucy kernel: #3 0xc06e9de4 at _mtx_lock_flags+0xc4
Apr 25 17:31:59 lucy kernel: #4 0xc1107ab4 at bridge_input+0x64
Apr 25 17:31:59 lucy kernel: #5 0xc07bff64 at ether_nh_input+0x324
Apr 25 17:31:59 lucy kernel: #6 0xc07c355c at netisr_dispatch_src+0xcc
Apr 25 17:31:59 lucy kernel: #7 0xc07c37d0 at netisr_dispatch+0x20
Apr 25 17:31:59 lucy kernel: #8 0xc07bf645 at ether_input+0x35
Apr 25 17:31:59 lucy kernel: #9 0xcf1c2c3d at hostap_deliver_data+0x2cd
Apr 25 17:31:59 lucy kernel: #10 0xcf1c38ca at hostap_input+0xc6a
Apr 25 17:31:59 lucy kernel: #11 0xcf1b6214 at ampdu_dispatch+0x44
Apr 25 17:31:59 lucy kernel: #12 0xcf1b62a4 at ampdu_rx_flush+0x84
Apr 25 17:31:59 lucy kernel: #13 0xcf1b707b at ieee80211_ht_node_age+0xab
Apr 25 17:31:59 lucy kernel: #14 0xcf19c7ef at node_age+0x9f
Apr 25 17:31:59 lucy kernel: #15 0xcf19ec8a at ieee80211_timeout_stations+0x21a
Apr 25 17:31:59 lucy kernel: #16 0xcf19eff9 at ieee80211_node_timeout+0x39
Apr 25 17:31:59 lucy kernel: #17 0xc0713a59 at softclock+0x369
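
To make the taskqueue idea concrete, here is a rough userland sketch of the deferral pattern, not actual net80211, ath(4) or taskqueue(9) code. Every name in it (node_lock, drv_tx_lock, txq, upper_send, tx_enqueue, tx_drain) is hypothetical, and pthread mutexes stand in for kernel mutexes. The assumption is that the upper layer, while holding its own lock, only queues the frame; a deferred context later drains the queue and takes the driver TX lock with no upper-layer lock held, so the two locks are never nested.

```c
/* Userland sketch only: hypothetical names, not net80211 or taskqueue(9). */
#include <pthread.h>
#include <stddef.h>

#define QLEN 8

struct txq {
	pthread_mutex_t lock;	/* protects only the ring below */
	int pkts[QLEN];
	size_t head, tail;	/* queued count is tail - head */
};

/* Stand-ins for an upper-layer lock and a driver TX lock. */
static pthread_mutex_t node_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t drv_tx_lock = PTHREAD_MUTEX_INITIALIZER;
static struct txq q = { .lock = PTHREAD_MUTEX_INITIALIZER };
static int sent;		/* sum of "transmitted" payloads */

/* Queue a frame; never calls into the driver. Returns 1 on success. */
static int
tx_enqueue(int pkt)
{
	int ok = 0;

	pthread_mutex_lock(&q.lock);
	if (q.tail - q.head < QLEN) {
		q.pkts[q.tail++ % QLEN] = pkt;
		ok = 1;
	}
	pthread_mutex_unlock(&q.lock);
	return (ok);
}

/* Upper-layer send path: holds node_lock, but only enqueues. */
static int
upper_send(int pkt)
{
	int ok;

	pthread_mutex_lock(&node_lock);
	ok = tx_enqueue(pkt);
	pthread_mutex_unlock(&node_lock);
	return (ok);
}

/*
 * Deferred context (what a taskqueue thread would run). It takes
 * drv_tx_lock only after dropping q.lock and never holds node_lock.
 * Returns the number of frames handed to the "driver".
 */
static int
tx_drain(void)
{
	int n = 0;

	for (;;) {
		int pkt;

		pthread_mutex_lock(&q.lock);
		if (q.head == q.tail) {
			pthread_mutex_unlock(&q.lock);
			break;
		}
		pkt = q.pkts[q.head++ % QLEN];
		pthread_mutex_unlock(&q.lock);

		pthread_mutex_lock(&drv_tx_lock);
		sent += pkt;	/* pretend to hand the frame to hardware */
		pthread_mutex_unlock(&drv_tx_lock);
		n++;
	}
	return (n);
}
```

In this sketch node_lock and drv_tx_lock are never held at the same time, so there is no ordering between them to get wrong. The cost is the extra queueing and the latency of waiting for the deferred context to run, which is the overhead mentioned above.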