From: John Baldwin
To: freebsd-net@freebsd.org
Cc: Barney Cordoba, Adrian Chadd, Luigi Rizzo
Subject: Re: two problems in dev/e1000/if_lem.c::lem_handle_rxtx()
Date: Fri, 18 Jan 2013 11:49:42 -0500
Message-Id: <201301181149.42277.jhb@freebsd.org>

On Friday, January 18, 2013 9:30:40 am Barney Cordoba wrote:
>
> --- On Thu, 1/17/13, Adrian Chadd wrote:
>
> > From: Adrian Chadd
> > Subject: Re: two problems in dev/e1000/if_lem.c::lem_handle_rxtx()
> > To: "Barney Cordoba"
> > Cc: "Luigi Rizzo", freebsd-net@freebsd.org
> > Date: Thursday, January 17, 2013, 11:48 AM
> >
> > There's also the subtle race
> > condition in TX and RX handling that re-queuing the taskqueue gets around.
> >
> > Which is:
> >
> > * The hardware is constantly receiving frames, right until you blow
> >   the FIFO away by filling it up;
> > * The RX thread receives a bunch of frames;
> > * .. and processes them;
> > * .. once it's done processing, the hardware may have read some more
> >   frames in the meantime;
> > * .. and the hardware may have generated a mitigated interrupt which
> >   you're ignoring, since you're processing frames;
> > * So if your architecture isn't 100% paranoid, you may end up having
> >   to wait for the next interrupt to handle what's currently in the
> >   queue.
> >
> > Now if things are done correctly:
> >
> > * The hardware generates a mitigated interrupt;
> > * The mask register has that bit disabled, so you don't end up
> >   receiving it;
> > * You finish your RX queue processing, and there's more stuff that's
> >   appeared in the FIFO (hence why the hardware has generated another
> >   mitigated interrupt);
> > * You unmask the interrupt;
> > * .. and the hardware immediately sends you the MSI or signals an
> >   interrupt;
> > * .. thus you re-enter the RX processing thread almost(!) immediately.
> >
> > However as the poster(s) have said, the interrupt mask/unmask in the
> > Intel driver(s) may not be 100% correct, so you're going to end up
> > with situations where interrupts are missed.
> >
> > The reason why this wasn't a big deal in the deep/distant past is
> > because we didn't used to have kernel preemption, or multiple kernel
> > threads running, or an overly aggressive scheduler trying to
> > parallelise things as much as possible. A lot of net80211/ath bugs
> > have popped out of the woodwork specifically because of the above
> > changes to the kernel. They were bugs before, but people didn't hit
> > them.
> >
>
> I don't see the distinction between the rx thread getting re-scheduled
> "immediately" vs introducing another thread. In fact you increase missed
> interrupts by this method. The entire point of interrupt moderation is
> to tune the intervals where a driver is processed.
>
> You might as well just not have a work limit and process until you're done.
> The idea that "gee, I've been taking up too much cpu, I'd better yield"
> to just queue a task and continue soon after doesn't make much sense to
> me.

If there are multiple threads with the same priority, then batching the work
up into chunks allows the scheduler to round-robin among them. However, when
a task requeues itself, that doesn't actually work, since the taskqueue
thread will see the requeued task before it yields the CPU. Alternatively,
if you force all the relevant interrupt handlers to use the same thread pool
and, instead of requeueing a separate task, you requeue your handler in the
ithread pool, then you can get the desired round-robin behavior. (I have
changes to the ithread stuff that get us part of the way there, in that
handlers can reschedule themselves and much of the plumbing is in place for
shared thread pools among different interrupts.)

-- 
John Baldwin
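
The scheduling point in the reply above can be illustrated with a toy model. This is a rough sketch, not code from the thread or from the FreeBSD kernel: the function names, the interface names "em0" and "em1", the backlog sizes, and the work limit are all made up for illustration, and the model assumes a single CPU with no preemption, where a thread gives up the CPU only once its own queue is empty.

```python
from collections import deque


def run_private_queues(backlog, work_limit):
    """Toy model: every handler has its own taskqueue thread.

    Assumption: a thread keeps the CPU until its own queue is empty,
    so a task that requeues itself is picked up again right away.
    """
    trace = []
    for name, pending in backlog.items():  # threads run one after another
        queue = deque([name])              # task enqueued by the interrupt
        while queue:                       # thread drains its queue first
            queue.popleft()
            pending -= min(work_limit, pending)
            trace.append(name)
            if pending > 0:
                queue.append(name)         # task requeues itself
    return trace


def run_shared_pool(backlog, work_limit):
    """Toy model: all handlers share one FIFO queue.

    A handler with work left requeues itself at the tail, behind the
    other handlers that are already waiting.
    """
    remaining = dict(backlog)
    queue = deque(remaining)               # one entry per handler
    trace = []
    while queue:
        name = queue.popleft()
        remaining[name] -= min(work_limit, remaining[name])
        trace.append(name)
        if remaining[name] > 0:
            queue.append(name)             # back of the shared queue
    return trace


# Hypothetical workload: two interfaces, 300 packets each, limit of 100.
backlog = {"em0": 300, "em1": 300}
private_trace = run_private_queues(backlog, 100)
shared_trace = run_shared_pool(backlog, 100)
print(private_trace)
print(shared_trace)
```

Under these assumptions, the private-queue model runs all of one handler's chunks back to back, so the work limit buys no interleaving, while the shared-queue model alternates between the two handlers. A real scheduler with preemption and time slices would behave less starkly than this.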