From owner-freebsd-current@FreeBSD.ORG Thu Jul 23 00:29:58 2009
Return-Path:
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 93CC11065672;
	Thu, 23 Jul 2009 00:29:58 +0000 (UTC)
	(envelope-from sam@errno.com)
Received: from ebb.errno.com (ebb.errno.com [69.12.149.25])
	by mx1.freebsd.org (Postfix) with ESMTP id 45E928FC08;
	Thu, 23 Jul 2009 00:29:58 +0000 (UTC)
	(envelope-from sam@errno.com)
Received: from ice.local ([10.0.0.115]) (authenticated bits=0)
	by ebb.errno.com (8.13.6/8.12.6) with ESMTP id n6N0Tv8J026845
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Wed, 22 Jul 2009 17:29:57 -0700 (PDT)
	(envelope-from sam@errno.com)
Message-ID: <4A67AF05.7060504@errno.com>
Date: Wed, 22 Jul 2009 17:29:57 -0700
From: Sam Leffler
User-Agent: Thunderbird 2.0.0.22 (Macintosh/20090605)
MIME-Version: 1.0
To: Giorgos Keramidas
References: <87eis8g3b9.fsf@kobe.laptop>
In-Reply-To: <87eis8g3b9.fsf@kobe.laptop>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-DCC-x.dcc-servers-Metrics: ebb.errno.com; whitelist
Cc: freebsd-current@freebsd.org, Andrew Thompson
Subject: Re: lagg0 and tcpdump problem
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 23 Jul 2009 00:29:58 -0000

Giorgos Keramidas wrote:
> When I run tcpdump on lagg0 (with em0 and iwn0 as laggports), tcpdump
> seems to work fine, but typing ^C kills the wireless interface too.
>
> My /var/log/messages shows at the time:
>
> Jul 22 17:59:29 kobe kernel: --- syscall (6, FreeBSD ELF32, close), eip = 0x28393313, esp = 0xbfbfe78c, ebp = 0xbfbfe7a8 ---
> Jul 22 17:59:29 kobe kernel: taskqueue_drain with the following non-sleepable locks held:
> Jul 22 17:59:29 kobe kernel: exclusive rw if_lagg rwlock (if_lagg rwlock) r = 0 (0xcb651704) locked @ /usr/src/sys/modules/if_lagg/../../net/if_lagg.c:953
> Jul 22 17:59:29 kobe kernel: exclusive sleep mutex bpf global lock (bpf global lock) r = 0 (0xc0bc1e90) locked @ /usr/src/sys/net/bpf.c:605
> Jul 22 17:59:29 kobe kernel: KDB: stack backtrace:
> Jul 22 17:59:29 kobe kernel: db_trace_self_wrapper(c09a567c,fba428ec,c06be155,c09b0bcd,25d,...) at db_trace_self_wrapper+0x26
> Jul 22 17:59:29 kobe kernel: kdb_backtrace(c09b0bcd,25d,ffffffff,c0b90604,fba42924,...) at kdb_backtrace+0x29
> Jul 22 17:59:29 kobe kernel: _witness_debugger(c09a7af7,fba42938,4,1,0,...) at _witness_debugger+0x25
> Jul 22 17:59:29 kobe kernel: witness_warn(5,0,c0961ec8,137,c673c85c,...) at witness_warn+0x1fd
> Jul 22 17:59:29 kobe kernel: taskqueue_drain(c673c840,c678c0b8,d5821000,fba42994,c075a5fc,...) at taskqueue_drain+0xa9
> Jul 22 17:59:29 kobe kernel: ieee80211_waitfor_parent(c678c000,0,c09b6c55,caf,c678c000,...) at ieee80211_waitfor_parent+0x7b
> Jul 22 17:59:29 kobe kernel: ieee80211_ioctl(d2633400,80206910,fba429b4,c6515748,8903,...) at ieee80211_ioctl+0x1ac
> Jul 22 17:59:29 kobe kernel: if_setflag(d2633438,0,fba42a18,c06bdf9c,100,...) at if_setflag+0x10a
> Jul 22 17:59:29 kobe kernel: ifpromisc(d2633400,0,c7e96a43,431,1,...) at ifpromisc+0x33
> Jul 22 17:59:29 kobe kernel: lagg_setflags(cb651704,c7e96a43,3b9,c09b09ad,c650e380,...) at lagg_setflags+0x84
> Jul 22 17:59:29 kobe kernel: lagg_ioctl(c7057800,80206910,fba42aec,fba42b1c,8903,...) at lagg_ioctl+0x50c
> Jul 22 17:59:29 kobe kernel: if_setflag(c7057838,0,c09a0c3d,df,0,...) at if_setflag+0x10a
> Jul 22 17:59:29 kobe kernel: ifpromisc(c7057800,0,c09b0bcd,236,c6551a4c,...) at ifpromisc+0x33
> Jul 22 17:59:29 kobe kernel: bpf_detachd(c0bc1e90,0,c09b0bcd,25d,d93e57a0,...) at bpf_detachd+0x249
> Jul 22 17:59:29 kobe kernel: bpf_dtor(ca46a100,0,c099768a,9e,c7789460,...) at bpf_dtor+0xb0
> Jul 22 17:59:29 kobe kernel: devfs_destroy_cdevpriv(d93e57a0,0,c099768a,a8,fba42be4,...) at devfs_destroy_cdevpriv+0xac
> Jul 22 17:59:29 kobe kernel: devfs_fpdrop(c7789460,cd581b40,3,0,c7789460,...) at devfs_fpdrop+0x68
> Jul 22 17:59:29 kobe kernel: _fdrop(c7789460,cd581b40,fba42c18,c06bdf9c,0,cd581be4,c0b90600,c0a0afa0,c099cf22,cc5efa2c,45b,c099cf22,fba42c40,c0684440,cc5efa2c,8,c099cf22,45b) at _fdrop+0x53
> Jul 22 17:59:29 kobe kernel: closef(c7789460,cd581b40,45b,440,cc5efa2c,...) at closef+0x290
> Jul 22 17:59:29 kobe kernel: kern_close(cd581b40,3,fba42d2c,c0932863,cd581b40,...) at kern_close+0x117
> Jul 22 17:59:29 kobe kernel: close(cd581b40,fba42cf8,4,c099ed18,c0a01b68,...) at close+0x1a
> Jul 22 17:59:29 kobe kernel: syscall(fba42d38) at syscall+0x2a3
> Jul 22 17:59:29 kobe kernel: Xint0x80_syscall() at Xint0x80_syscall+0x20
> Jul 22 17:59:29 kobe kernel: --- syscall (6, FreeBSD ELF32, close), eip = 0x28393313, esp = 0xbfbfe78c, ebp = 0xbfbfe7a8 ---
>

This is a known issue; bpf is holding a mutex over calls to the driver
that may block (in this case the taskqueue_drain calls in net80211).
It's unlikely to be resolved for 8.0 (too risky).

> Then typing ^C stops tcpdump but the log shows:
>
> Jul 22 17:59:29 kobe kernel: wlan0: promiscuous mode disabled
> Jul 22 17:59:29 kobe kernel: em0: promiscuous mode disabled
> Jul 22 17:59:29 kobe kernel: iwn0: error, INTR=82000000 STATUS=0x40010000
> Jul 22 17:59:29 kobe kernel: lagg0: promiscuous mode disabled
> Jul 22 17:59:30 kobe kernel: iwn0: iwn_transfer_firmware: timeout waiting for first alive notice, error 35
> Jul 22 17:59:30 kobe kernel: iwn0: iwn_init_locked: could not load firmware, error 35
> Jul 22 17:59:30 kobe kernel: wlan0: link state changed to DOWN
> Jul 22 17:59:30 kobe kernel: lagg0: link state changed to DOWN
>
> At this point wlan0 is without carrier, and stays that way until I
> unplumb wlan0 and lagg0 and re-create them.
>
> It seems that at this part of if_lagg.c we are locking the lagg softc,
> but then we call lagg_setflags() -> lagg_setflag():
>
>     953         LAGG_WLOCK(sc);
>     954         SLIST_FOREACH(lp, &sc->sc_ports, lp_entries) {
>     955                 lagg_setflags(lp, 1);
>     956         }
>     957         LAGG_WUNLOCK(sc);
>
> but this vectors into the wlan code near if_lagg.c:line 1088. Does it
> make sense to drop the exclusive lagg lock around the code to the port
> flag changing code or would this introduce a silly race?
>
> %%%
> --- a/sys/net/if_lagg.c  Wed Jul 15 15:29:17 2009 +0300
> +++ b/sys/net/if_lagg.c  Wed Jul 22 18:10:29 2009 +0300
> @@ -1085,7 +1085,9 @@
>          * in accord with actual ports flags.
>          */
>         if (status != (lp->lp_ifflags & flag)) {
> +               LAGG_WUNLOCK(sc);
>                 error = (*func)(ifp, status);
> +               LAGG_WLOCK(sc);
>                 if (error)
>                         return (error);
>                 lp->lp_ifflags &= ~flag;
> %%%

Sounds like iwn isn't reacting well to the calls coming in from lagg.
wlandebug state should provide some insight. I've used lagg+iwn+em on a
t61p with no obvious issues but never tried to run tcpdump on the lagg port.

	Sam
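
On the race question in the quoted patch: dropping the lock around a call
that may sleep means the port list can change before the lock is retaken,
so the caller would have to revalidate before touching the port again. The
following is only a rough single-threaded userspace sketch of that idea,
not code from if_lagg.c. Every name in it (struct softc, the gen counter,
racing_setflag, and so on) is hypothetical, the "lock" is a plain flag
rather than an rwlock, and the generation counter is one possible
revalidation scheme, not something the lagg driver is known to use.

```c
#include <assert.h>

/* Hypothetical single-threaded model; none of these names come from if_lagg.c. */
#define MAX_PORTS 8

struct port {
	int id;
	int flags;		/* flag state last pushed to this port */
};

struct softc {
	int locked;		/* stand-in for the lagg rwlock */
	int nports;
	unsigned gen;		/* bumped on every port list change */
	struct port ports[MAX_PORTS];
};

typedef int (*setflag_fn)(struct softc *, int id, int on);

static void sc_lock(struct softc *sc)   { assert(!sc->locked); sc->locked = 1; }
static void sc_unlock(struct softc *sc) { assert(sc->locked); sc->locked = 0; }

void
softc_init(struct softc *sc, int n)
{
	assert(n >= 0 && n <= MAX_PORTS);
	sc->locked = 0;
	sc->nports = n;
	sc->gen = 0;
	for (int i = 0; i < n; i++) {
		sc->ports[i].id = i;
		sc->ports[i].flags = 0;
	}
}

/* Remove a port by id and bump the generation. Caller must hold the lock. */
void
port_remove_locked(struct softc *sc, int id)
{
	assert(sc->locked);
	for (int i = 0; i < sc->nports; i++) {
		if (sc->ports[i].id == id) {
			sc->ports[i] = sc->ports[sc->nports - 1];
			sc->nports--;
			sc->gen++;
			return;
		}
	}
}

/*
 * Model of a driver call that may sleep, so it must run unlocked.
 * While port 0 is being handled, "another thread" removes port 2.
 */
int
racing_setflag(struct softc *sc, int id, int on)
{
	(void)on;
	assert(!sc->locked);	/* loosely analogous to the WITNESS complaint */
	if (id == 0) {
		sc_lock(sc);
		port_remove_locked(sc, 2);
		sc_unlock(sc);
	}
	return (0);
}

/*
 * Push `on` to every port, dropping the lock around func(). If the list
 * changed while unlocked, look the port up again by id and restart the scan.
 * Returns the number of func() calls made, or -1 on error.
 */
int
set_flags_all(struct softc *sc, int on, setflag_fn func)
{
	int calls = 0;

	sc_lock(sc);
restart:
	for (int i = 0; i < sc->nports; i++) {
		if (sc->ports[i].flags == on)
			continue;
		int id = sc->ports[i].id;
		unsigned gen = sc->gen;

		sc_unlock(sc);
		int error = func(sc, id, on);	/* may block */
		calls++;
		sc_lock(sc);
		if (error) {
			sc_unlock(sc);
			return (-1);
		}
		if (gen != sc->gen) {
			/* Index i may be stale: update by id if still present. */
			for (int j = 0; j < sc->nports; j++)
				if (sc->ports[j].id == id)
					sc->ports[j].flags = on;
			goto restart;
		}
		sc->ports[i].flags = on;
	}
	sc_unlock(sc);
	return (calls);
}
```

In this model, starting with three ports and calling
set_flags_all(&sc, 1, racing_setflag) notices that port 2 disappeared
during the first callback and rescans instead of walking a stale index.
Whether restarting like this is acceptable in the real driver is a
separate question the sketch does not answer.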