Date: Wed, 24 Jun 2009 21:43:05 -0400 From: Andrew Gallatin <gallatin@cs.duke.edu> To: Sam Leffler <sam@FreeBSD.org> Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org Subject: Re: svn commit: r194909 - head/sys/dev/mxge Message-ID: <4A42D629.8060806@cs.duke.edu> In-Reply-To: <4A42983C.6050307@freebsd.org> References: <200906242109.n5OL9uVb029380@svn.freebsd.org> <4A42983C.6050307@freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
Sam Leffler wrote: > There's something else wrong. This is just covering up the real bug. I'm pretty sure the "real bug" is in bpf, but I'm not sure its a bug, and I suspect there are probably other, similar, bugs lurking when you try to tear down a busy interface. What I was doing was: - point a packet generator offering 1.5Mpps at the NIC - in a tight shell loop, do while (1) tcpdump -ni mxge0 host 172.31.0.1 end - in another shell loop: while (1) ifconfig mxge0 192.168.1.22 up sleep 1 kldunload if_mxge end Before the commit, with the old order: lock() close() unlock() ether_ifdetach() I'd see either an exhaustion of mbufs because tcpdump snuck in after I'd closed the device and re-opened it on me (so I never closed it again, resulting in leaked mbufs), or a panic. I then moved the ether_ifdetach() to the new position: ether_ifdetach() lock() close() unlock() This worked great until I started the packet generator, then it crashed. The stack I saw (which I don't have saved, so this is from memory) when I had ether_ifdetach() first was: panic: mtx_lock() (don't remember exact text) bpf_mtap() ether_input() mxge_rx_done_small() mxge_clean_rx_done() mxge_intr() <...> When I looked at the ifp in kgdb, I noticed that all the operations (if_input(), if_output(), etc) pointed to ifdead_* The machine I'm using for this is a MacPro, and I can't get ddb to work on the USB based console, so I'm working purely from dumps. I don't know how to get a stack of another process in kgdb on amd64, so that's all the information I have. My assumption is that my interrupt thread was running when ether_ifdetach() called bpfdetach(), and was starting bpf_mtap() while bpfdetach() was destroying the bpf_if. There doesn't seem to be anything to prevent bpfdetach() from racing with bpf_mtap(). By calling my close() routine (with a dying flag so nothing can sneak in before detach), I'm assured that my NIC is quiescent, and cannot be calling into the stack while the interface is being torn down. I'd prefer to leave my commit as-is because: 1) it works, and fixes a bug 2) it can be MFC'ed as is 3) it just feels wrong to be blasting packets up into the stack while detaching. With this NIC, the best way to make it quiescent is to call close(). There's an interrupt handshake done with the NIC to ensure its is quiescent, so doing something like disabling its interrupt could leave the things in a weird state. Drew
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4A42D629.8060806>