From owner-freebsd-net@FreeBSD.ORG Mon Jan 28 21:27:30 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id DD7088DB for ; Mon, 28 Jan 2013 21:27:30 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 94106D2C for ; Mon, 28 Jan 2013 21:27:30 +0000 (UTC) Received: from pakbsde14.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id B5713B946; Mon, 28 Jan 2013 16:27:29 -0500 (EST) From: John Baldwin To: freebsd-net@freebsd.org Subject: Re: two problems in dev/e1000/if_lem.c::lem_handle_rxtx() Date: Mon, 28 Jan 2013 16:25:15 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p22; KDE/4.5.5; amd64; ; ) References: In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201301281625.16157.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 28 Jan 2013 16:27:29 -0500 (EST) Cc: Luigi Rizzo , Jack Vogel , head@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Jan 2013 21:27:30 -0000 On Tuesday, January 15, 2013 8:23:24 pm Luigi Rizzo wrote: > Hi, > i found a couple of problems in > dev/e1000/if_lem.c::lem_handle_rxtx() , > (compare with dev/e1000/if_em.c::em_handle_que() for better understanding): > > 1. in if_em.c::em_handle_que(), when em_rxeof() exceeds the > rx_process_limit, the task is rescheduled so it can complete the work. > Conversely, in if_lem.c::lem_handle_rxtx() the lem_rxeof() is > only run once, and if there are more pending packets the only > chance to drain them is to receive (many) more interrupts. > > This is a relatively serious problem, because the receiver has > a hard time recovering. > > I'd like to commit a fix to this same as it is done in e1000. This seems sensible. > 2. in if_em.c::em_handle_que(), interrupts are reenabled unconditionally, > whereas lem_handle_rxtx() only enables them if IFF_DRV_RUNNING is set. > > I cannot really tell what is the correct way here, so I'd like > to put a comment there unless there is a good suggestion on > what to do. > > Accesses to the intr register are race-prone anyways > (disabled in fastintr, enabled in the rxtx task without > holding any lock, and generally accessed under EM_CORE_LOCK > in other places), and presumably enabling/disabling the > interrupts around activations of the taks is just an > optimization (and on a VM, it is actually a pessimization > due to the huge cost of VM exits). Actually, this is quite important. The reason being that you don't want the interrupt handler and the task both running at the same time (ever). If you do they will both process RX packets and deliver them in different order to the stack wreaking havoc on TCP. In the case of what lem is doing, it is not racy (except with ifconfig up/down which is handled by checking for IFF_DRV_RUNNING changing after reacquiring the RX lock) as the algorith, is that once the fast interrupt handler masks the interrupt, that code path cannot run again until either the threaded handler or the task re-enables interrupts. -- John Baldwin