From owner-svn-src-all@FreeBSD.ORG Fri Jun 26 05:50:35 2009 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5E0DC1065675; Fri, 26 Jun 2009 05:50:35 +0000 (UTC) (envelope-from sam@freebsd.org) Received: from ebb.errno.com (ebb.errno.com [69.12.149.25]) by mx1.freebsd.org (Postfix) with ESMTP id 1C4648FC08; Fri, 26 Jun 2009 05:50:35 +0000 (UTC) (envelope-from sam@freebsd.org) Received: from Macintosh-4.local ([10.0.0.200]) (authenticated bits=0) by ebb.errno.com (8.13.6/8.12.6) with ESMTP id n5Q5oQgH043442 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 25 Jun 2009 22:50:26 -0700 (PDT) (envelope-from sam@freebsd.org) Message-ID: <4A4461A2.4070005@freebsd.org> Date: Thu, 25 Jun 2009 22:50:26 -0700 From: Sam Leffler Organization: FreeBSD Project User-Agent: Thunderbird 2.0.0.22 (Macintosh/20090605) MIME-Version: 1.0 To: Andrew Gallatin References: <200906242109.n5OL9uVb029380@svn.freebsd.org> <4A42983C.6050307@freebsd.org> <4A42D629.8060806@cs.duke.edu> In-Reply-To: <4A42D629.8060806@cs.duke.edu> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-DCC-x.dcc-servers-Metrics: ebb.errno.com; whitelist Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org Subject: Re: svn commit: r194909 - head/sys/dev/mxge X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Jun 2009 05:50:35 -0000 Andrew Gallatin wrote: > Sam Leffler wrote: > >> There's something else wrong. This is just covering up the real bug. > > I'm pretty sure the "real bug" is in bpf, but I'm not sure its a bug, > and I suspect there are probably other, similar, bugs lurking when > you try to tear down a busy interface. FWIW my point was that we need to stop adding "hacks" to drivers to workaround issues like this. There should be a standard way drivers are written to handle detach that avoids races. > > What I was doing was: > > - point a packet generator offering 1.5Mpps at the NIC > > - in a tight shell loop, do > > while (1) > tcpdump -ni mxge0 host 172.31.0.1 > end > > - in another shell loop: > > while (1) > ifconfig mxge0 192.168.1.22 up > sleep 1 > kldunload if_mxge > end > > Before the commit, with the old order: > > lock() > close() > unlock() > ether_ifdetach() > > I'd see either an exhaustion of mbufs because tcpdump snuck in after > I'd closed the device and re-opened it on me (so I never closed it > again, resulting in leaked mbufs), or a panic. > > I then moved the ether_ifdetach() to the new position: > ether_ifdetach() > lock() > close() > unlock() > > This worked great until I started the packet generator, > then it crashed. The stack I saw (which I don't have > saved, so this is from memory) when I had ether_ifdetach() > first was: > > > panic: mtx_lock() (don't remember exact text) > bpf_mtap() > ether_input() > mxge_rx_done_small() > mxge_clean_rx_done() > mxge_intr() > <...> > > When I looked at the ifp in kgdb, I noticed that all the operations > (if_input(), if_output(), etc) pointed to ifdead_* > The machine I'm using for this is a MacPro, and I can't get ddb > to work on the USB based console, so I'm working purely from dumps. > I don't know how to get a stack of another process in kgdb on > amd64, so that's all the information I have. > > My assumption is that my interrupt thread was running when > ether_ifdetach() called bpfdetach(), and was starting bpf_mtap() > while bpfdetach() was destroying the bpf_if. There doesn't > seem to be anything to prevent bpfdetach() from racing with > bpf_mtap(). > > By calling my close() routine (with a dying flag so nothing can > sneak in before detach), I'm assured that my NIC is quiescent, > and cannot be calling into the stack while the interface is being > torn down. I'd prefer to leave my commit as-is because: > > 1) it works, and fixes a bug > 2) it can be MFC'ed as is > 3) it just feels wrong to be blasting packets up into the stack > while detaching. With this NIC, the best way to make it > quiescent is to call close(). There's an interrupt handshake > done with the NIC to ensure its is quiescent, so doing something > like disabling its interrupt could leave the things in a weird state. I'm not asking you to revert your commit; what I want is a generic solution for all drivers so we can eliminate stuff like this. I don't care about MFC'ing at the moment; there have been issues with ifnet detach for a long time but with recent changes to the ifnet layer in HEAD I think it's time to work out issues like this. bpf has several problems (at the top of my list is how it holds a private mtx over ifpromisc calls which causes heartburn for drivers that need to block when processing such requests). It sounds like bpf needs to push tap'd packets through a method ifnet can "deaden" and/or bpf needs to explicitly handle this case. At that point you can eliminate your dying flag (as can some other drivers that I've seen recently grow similar hacks). Sam