Date: Thu, 17 Jan 2013 08:48:25 -0800
From: Adrian Chadd <adrian@freebsd.org>
To: Barney Cordoba <barney_cordoba@yahoo.com>
Cc: freebsd-net@freebsd.org, Luigi Rizzo <rizzo@iet.unipi.it>
Subject: Re: two problems in dev/e1000/if_lem.c::lem_handle_rxtx()
Message-ID: <CAJ-VmonmUrFXwUZ0A=yYYzoJbjCvHKuo%2BTEtTwFP5D7RGnKmKA@mail.gmail.com>
In-Reply-To: <1358438932.56236.YahooMailClassic@web121601.mail.ne1.yahoo.com>
References: <20130117025502.GA57613@onelab2.iet.unipi.it> <1358438932.56236.YahooMailClassic@web121601.mail.ne1.yahoo.com>

There's also the subtle race condition in TX and RX handling that
re-queuing the taskqueue gets around.

Which is:

* The hardware is constantly receiving frames, right until you blow
  the FIFO away by filling it up;
* The RX thread receives a bunch of frames;
* .. and processes them;
* .. once it's done processing, the hardware may have read some more
  frames in the meantime;
* .. and the hardware may have generated a mitigated interrupt which
  you're ignoring, since you're processing frames;
* So if your architecture isn't 100% paranoid, you may end up having
  to wait for the next interrupt to handle what's currently in the
  queue.

Now if things are done correctly:

* The hardware generates a mitigated interrupt;
* The mask register has that bit disabled, so you don't end up
  receiving it;
* You finish your RX queue processing, and there's more stuff that's
  appeared in the FIFO (which is why the hardware has generated
  another mitigated interrupt);
* You unmask the interrupt;
* .. and the hardware immediately sends you the MSI or signals an
  interrupt;
* .. thus you re-enter the RX processing thread almost(!) immediately.

However, as the poster(s) have said, the interrupt mask/unmask in the
Intel driver(s) may not be 100% correct, so you're going to end up
with situations where interrupts are missed.

The reason this wasn't a big deal in the deep/distant past is that we
didn't use to have kernel preemption, or multiple kernel threads
running, or an overly aggressive scheduler trying to parallelise
things as much as possible. A lot of net80211/ath bugs have popped out
of the woodwork specifically because of the above changes to the
kernel. They were bugs before, but people didn't hit them.

Adrian
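
The two sequences above can be sketched as a small user-space C simulation. This is only an illustration under stated assumptions, not code from if_lem.c or any real driver: struct fake_nic, hw_receive(), hw_try_deliver(), rx_drain() and run_scenario() are all hypothetical names. The "lossy" case models one way a pending mitigated interrupt could be lost (the handler clearing the cause bit before unmasking), and the "requeue" flag models the workaround of re-running the RX task whenever frames are still queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a NIC: one RX cause bit and one mask bit. */
struct fake_nic {
    int  fifo;          /* frames waiting in the RX FIFO */
    bool cause_pending; /* hardware has raised a (mitigated) RX interrupt */
    bool masked;        /* true = delivery to the host is suppressed */
};

/* Hardware side: frames arrive and the interrupt cause is latched. */
static void hw_receive(struct fake_nic *n, int frames)
{
    n->fifo += frames;
    n->cause_pending = true;
}

/* Hardware side: deliver the interrupt only if a cause is latched
 * and the mask allows it. Delivery consumes the latched cause. */
static bool hw_try_deliver(struct fake_nic *n)
{
    if (n->cause_pending && !n->masked) {
        n->cause_pending = false;
        return true;
    }
    return false;
}

/* Driver side: process everything currently in the FIFO. */
static void rx_drain(struct fake_nic *n)
{
    assert(n->fifo >= 0);
    n->fifo = 0;
}

/*
 * Run one scenario and return the number of frames left stranded in
 * the FIFO once the driver goes idle.
 *   lose_pending:    the handler wipes the latched cause before it
 *                    unmasks (the "not 100% correct" case).
 *   requeue_if_more: the handler re-runs itself while frames remain,
 *                    without waiting for an interrupt.
 */
static int run_scenario(bool lose_pending, bool requeue_if_more)
{
    struct fake_nic nic = { 0, false, false };
    bool late_burst_sent = false;

    hw_receive(&nic, 4);                 /* initial burst */
    for (;;) {
        bool woken = hw_try_deliver(&nic);
        bool requeued = requeue_if_more && nic.fifo > 0;
        if (!woken && !requeued)
            break;                       /* nothing wakes the RX task */

        nic.masked = true;               /* mask while processing */
        rx_drain(&nic);
        if (!late_burst_sent) {
            hw_receive(&nic, 3);         /* frames land after the drain */
            late_burst_sent = true;
        }
        if (lose_pending)
            nic.cause_pending = false;   /* pending interrupt is lost */
        nic.masked = false;              /* unmask */
    }
    return nic.fifo;
}
```

In this model run_scenario(false, false) leaves nothing behind, because the cause latched while masked fires as soon as the mask is lifted. run_scenario(true, false) leaves the late burst sitting in the FIFO until some later interrupt, and run_scenario(true, true) picks it up anyway because the task re-runs itself.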