From owner-svn-src-head@FreeBSD.ORG Mon Jan 2 03:09:59 2012
Message-ID: <4F012004.8010508@freebsd.org>
Date: Mon, 02 Jan 2012 14:09:56 +1100
From: Lawrence Stewart
To: Adrian Chadd
Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org, rwatson@freebsd.org
References: <201112300857.pBU8vxfP004914@svn.freebsd.org> <4EFEB38F.8010709@freebsd.org>
In-Reply-To: <4EFEB38F.8010709@freebsd.org>
Subject: Re: svn commit: r228986 - in head: share/man/man4 sys/net

On 12/31/11 18:02, Lawrence Stewart wrote:
> On 12/31/11 11:13, Adrian Chadd wrote:
>> This just broke wlan. Please consider fixing this patch :)
>>
>>
>> Adrian
>>
>> *** Interface: wlan0: start
>> can't re-use a leaf (wlan0)!
>> panic: bpfattach tscfgoid
>> KDB: enter: panic
>> [ thread pid 166 tid 100048 ]
>> Stopped at kdb_enter+0x4c: lui at,0x8048
>> db>
>> db> bt
>> Tracing pid 166 tid 100048 td 0x80c37900
>> db_trace_thread+30 (?,?,?,?) ra 80074cc0 sp c40075f0 sz 24
>> 80074bac+114 (0,?,ffffffff,?) ra 8007427c sp c4007608 sz 32
>> 80073ef4+388 (?,?,?,?) ra 80074400 sp c4007628 sz 168
>> db_command_loop+70 (?,?,?,?) ra 80076ac4 sp c40076d0 sz 24
>> 800769d0+f4 (?,?,?,?) ra 801b7560 sp c40076e8 sz 424
>> kdb_trap+110 (?,?,?,?) ra 8035c7ec sp c4007890 sz 48
>> trap+bf4 (?,?,?,?) ra 80354560 sp c40078c0 sz 184
>> MipsKernGenException+134 (0,4,803c089c,113) ra 801b72d8 sp c4007978 sz 200
>> kdb_enter+4c (?,?,?,?) ra 8017f88c sp c4007a40 sz 24
>> panic+11c (?,0,80706500,0) ra 802304b8 sp c4007a58 sz 40
>> bpfattach2+cc (?,?,?,?) ra 802305e0 sp c4007a80 sz 64
>> bpfattach+10 (?,?,?,?) ra 802448ec sp c4007ac0 sz 24
>> ether_ifattach+d0 (?,?,?,?) ra 8025f0ac sp c4007ad8 sz 32
>> ieee80211_vap_attach+104 (?,?,?,?) ra 8007ffe8 sp c4007af8 sz 96
>> 8007f960+688 (?,?,0,?) ra 8026a39c sp c4007b58 sz 88
>> 8026a22c+170 (?,?,?,?) ra 802430dc sp c4007bb0 sz 88
>> ifc_simple_create+80 (?,?,?,?) ra 80242b04 sp c4007c08 sz 64
>> 80242ab0+54 (?,?,?,?) ra 80242d78 sp c4007c48 sz 40
>> if_clone_create+a8 (?,?,?,?) ra 8023bdd4 sp c4007c70 sz 40
>> ifioctl+3a4 (?,?,80c7f460,80c37900) ra 801d8070 sp c4007c98 sz 144
>> soo_ioctl+3b0 (?,?,?,?) ra 801d2804 sp c4007d28 sz 40
>> kern_ioctl+248 (?,?,?,?) ra 801d29ac sp c4007d50 sz 64
>> sys_ioctl+130 (?,?,?,?) ra 8035c3ec sp c4007d90 sz 56
>> trap+7f4 (?,?,?,?) ra 8035475c sp c4007dc8 sz 184
>> MipsUserGenException+10c (?,?,?,4061d590) ra 0 sp c4007e80 sz 0
>> pid 166
>> db> reset
>
> I've managed to reproduce this on my wifi enabled laptop with r228986
> MFCed to a 9-stable kernel. I can't reproduce the panic with regular
> interfaces, pseudo interfaces (e.g. lo0 and pflog0) or vlans (which also
> have a relationship with an underlying physical interface) i.e.
> this is VAP/net80211 specific as far as I can tell.
>
> The problem is that bpfattach2() is being called twice with the same
> interface name i.e. "wlan0".
>
> I added a debug printf and call to kdb_backtrace() after the
> SYSCTL_ADD_PROC() call in bpfattach2() to see the code paths which are
> calling into the function. Here's what I see when the kernel runs on my
> laptop (I've added inline comments between "## ##"):
>
> Added tscfg OID for interface wlan0
> KDB: stack backtrace:
> #0 0xffffffff808680ce at kdb_backtrace+0x5e
> #1 0xffffffff808da5b4 at bpfattach2+0xb4
> #2 0xffffffff808fbe06 at ieee80211_vap_setup+0x266
> #3 0xffffffff80740205 at wpi_vap_create+0x95
> #4 0xffffffff809047fb at wlan_clone_create+0x16b
> #5 0xffffffff808e6079 at ifc_simple_create+0x89
> #6 0xffffffff808e5cc5 at if_clone_createif+0x65
> #7 0xffffffff808e4546 at ifioctl+0x306
> #8 0xffffffff80879755 at kern_ioctl+0x115
> #9 0xffffffff8087998d at sys_ioctl+0xfd
> #10 0xffffffff80b17d60 at amd64_syscall+0x450
> #11 0xffffffff80b03497 at Xfast_syscall+0xf7
>
> ## Here it has successfully added the net.bpf.tscfg.wlan0 sysctl entry
> for wlan0 ##
>
> can't re-use a leaf (wlan0)!
>
> ## Here SYSCTL_ADD_PROC() failed because leaf name is already used ##
>
> panic: bpfattach tscfgoid
> cpuid = 1
> KDB: stack backtrace:
> #0 0xffffffff808680ce at kdb_backtrace+0x5e
> #1 0xffffffff80832c87 at panic+0x187
> #2 0xffffffff808da6f6 at bpfattach2+0x1f6
> #3 0xffffffff808e72ee at ether_ifattach+0xae
> #4 0xffffffff808fd0a5 at ieee80211_vap_attach+0xb5
> #5 0xffffffff8074023c at wpi_vap_create+0xcc
> #6 0xffffffff809047fb at wlan_clone_create+0x16b
> #7 0xffffffff808e6079 at ifc_simple_create+0x89
> #8 0xffffffff808e5cc5 at if_clone_createif+0x65
> #9 0xffffffff808e4546 at ifioctl+0x306
> #10 0xffffffff80879755 at kern_ioctl+0x115
> #11 0xffffffff8087998d at sys_ioctl+0xfd
> #12 0xffffffff80b17d60 at amd64_syscall+0x450
> #13 0xffffffff80b03497 at Xfast_syscall+0xf7
>
>
>
> So after a bit of digging, ieee80211_vap_setup() calls
> ieee80211_radiotap_vattach(), which explicitly calls bpfattach2(), and
> then a subsequent call to ieee80211_vap_attach() indirectly calls into
> bpfattach2() via the call to ether_ifattach().
>
> This smells like a net80211 bug to me. I'm guessing, but I would suspect
> ieee80211_vap_setup() shouldn't call ieee80211_radiotap_vattach() and
> should let ieee80211_vap_attach() handle the BPF attachment via the call
> to ether_ifattach().
>
> Thoughts?

This turns out not to be a bug in net80211, but rather an oversight on
my part. I didn't consider the case where multiple DLTs can be attached
to BPF with the same interface name. I reverted r228986 and have
reworked the patch to instead store the per-interface time stamp
configuration OID pointer and config variable in the ifnet.

In testing my new patch, I found another problem. BPF maintains a list
(bpf_iflist) of bpf_if structs, one per ifnet/dlt combo. Each call to
bpfattach2() prepends a new bpf_if struct to the list.
bpfdetach() is expected to reclaim all bpf_if structs which reference
the specified ifnet from the list when it is called, but currently only
removes the first it finds. As far as I can tell, the implementation of
bpfdetach() leaks bpf_if references, and has done so since it was
introduced in r58273.

See here for the function added in r58273:

http://svnweb.freebsd.org/base/head/sys/net/bpf.c?revision=58273&view=markup#l1311

I would like to commit something like the following as a fix:

http://people.freebsd.org/~lstewart/patches/misc/bpfdetach_bpfif_leakfix_10.x.r229165.patch

With the above patch and my revised r228986 patch both applied, I can no
longer produce panics on my laptop.

Robert, as you committed r58273, I'm hoping you might have some thoughts
on this (even if it's almost 12 years later ;). Am I missing something
subtle, or is my analysis sensible? Assuming my analysis and patch are
sane, would you mind giving it a quick review so I can commit it?

Adrian, would you mind applying both the above bpf leakfix patch
followed by the revised sysclocksnap/tscfg patch below and testing to
make sure your kernel no longer explodes on boot?

http://people.freebsd.org/~lstewart/patches/misc/bpf_sysclocksnap_tscfg_postleakfix_10.x.r229165.patch

Cheers,
Lawrence
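
For readers following the bpfdetach() discussion, here is a minimal
userland sketch of the difference between removing only the first
matching list entry and removing every entry that references an
interface. It is not the patch linked above and not the kernel code: the
struct layouts and the iflist_* helper names are hypothetical stand-ins,
and a plain singly linked list is assumed in place of the kernel's list
macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's struct ifnet and struct bpf_if. */
struct ifnet {
	const char *name;
};

struct bpf_if {
	struct bpf_if *next;	/* next entry in the list */
	struct ifnet *ifp;	/* interface this entry references */
	unsigned int dlt;	/* data link type of this attachment */
};

/* Prepend one (ifp, dlt) entry, mirroring the prepend-on-attach behaviour. */
static struct bpf_if *
iflist_attach(struct bpf_if *head, struct ifnet *ifp, unsigned int dlt)
{
	struct bpf_if *bp = malloc(sizeof(*bp));

	if (bp == NULL)
		return (head);	/* allocation failed: list is unchanged */
	bp->next = head;
	bp->ifp = ifp;
	bp->dlt = dlt;
	return (bp);
}

/* Leaky variant: stops after the first entry that references ifp. */
static int
iflist_detach_first(struct bpf_if **headp, struct ifnet *ifp)
{
	struct bpf_if **pp, *bp;

	assert(headp != NULL);
	for (pp = headp; (bp = *pp) != NULL; pp = &bp->next) {
		if (bp->ifp == ifp) {
			*pp = bp->next;
			free(bp);
			return (1);
		}
	}
	return (0);
}

/* Fixed variant: unlinks and frees every entry that references ifp. */
static int
iflist_detach_all(struct bpf_if **headp, struct ifnet *ifp)
{
	struct bpf_if **pp, *bp;
	int removed = 0;

	assert(headp != NULL);
	pp = headp;
	while ((bp = *pp) != NULL) {
		if (bp->ifp == ifp) {
			*pp = bp->next;	/* unlink; pp stays where it is */
			free(bp);
			removed++;
		} else {
			pp = &bp->next;	/* keep this entry, move on */
		}
	}
	return (removed);
}

/* Count the entries that still reference ifp. */
static int
iflist_count(const struct bpf_if *head, const struct ifnet *ifp)
{
	int n = 0;

	for (; head != NULL; head = head->next)
		if (head->ifp == ifp)
			n++;
	return (n);
}
```

With an interface attached twice under different DLTs (as happens for
wlan0 via the radiotap and Ethernet attach paths), iflist_detach_first()
leaves one entry behind pointing at the interface, while
iflist_detach_all() leaves none and does not disturb entries belonging
to other interfaces.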