From: "Alexander V. Chernikov"
Date: Fri, 19 Oct 2012 19:23:03 +0400
To: John Baldwin
Cc: freebsd-net@freebsd.org, Luigi Rizzo, Jack Vogel, net@freebsd.org
Subject: Re: ixgbe & if_igb RX ring locking
Message-ID: <50817057.3090200@FreeBSD.org>
In-Reply-To: <201210171006.51214.jhb@freebsd.org>

On 17.10.2012 18:06, John Baldwin wrote:
> On Monday, October 15, 2012 9:04:27 am John Baldwin wrote:
>> On Monday, October 15, 2012 10:10:40 am Alexander V. Chernikov wrote:
>>> On 13.10.2012 23:24, Jack Vogel wrote:
>>>> On Sat, Oct 13, 2012 at 11:22 AM, Luigi Rizzo wrote:
>>>
>>>>>
>>>>> one option could be (same as it is done in the timer
>>>>> routine in dummynet) to build a list of all the packets
>>>>> that need to be sent to if_input(), and then call
>>>>> if_input with the entire list outside the lock.
>>>>>
>>>>> It would be even easier if we modify the various *_input()
>>>>> routines to handle a list of mbufs instead of just one.
>>>
>>> Bulk processing is generally a good idea we probably should implement.
>>> Probably starting from driver queue ending with marked mbufs
>>> (OURS/forward/legacy processing (appletalk and similar))?
>>>
>>> This can minimize an impact for all
>>> locks on RX side:
>>> L2
>>> * rx PFIL hook
>>> L3 (both IPv4 and IPv6)
>>> * global IF_ADDR_RLOCK (currently commented out)
>>> * Per-interface ADDR_RLOCK
>>> * PFIL hook
>>>
>>> From the first glance, there can be problems with:
>>> * Increased latency (we should have some kind of rx_process_limit), but
>>> still
>>> * reader locks being acquired for much longer amount of time
>>>
>>>>>
>>>>> cheers
>>>>> luigi
>>>>>
>>>>> Very interesting idea Luigi, will have to get that some thought.
>>>>
>>>> Jack
>>>
>>> Returning to original post topic:
>>>
>>> Given
>>> 1) we are currently binding ixgbe ithreads to CPU cores
>>> 2) RX queue lock is used by (indirectly) in only 2 places:
>>> a) ISR routine (msix or legacy irq)
>>> b) taskqueue routine which is scheduled if some packets remains in RX
>>> queue and rx_process_limit ended OR we need something to TX
>>>
>>> 3) in practice taskqueue routine is a nightmare for many people since
>>> there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after
>>> some traffic burst happens: once it is called it starts to schedule
>>> itself more and more replacing original ISR routine. Additionally,
>>> increasing rx_process_limit does not help since taskqueue is called with
>>> the same limit. Finally, currently netisr taskq threads are not bound to
>>> any CPU which makes the process even more uncontrollable.
>>
>> I think part of the problem here is that the taskqueue in ixgbe(4) is
>> bogusly rescheduled for TX handling. Instead, ixgbe_msix_que() should
>> just start transmitting packets directly.
>>
>> I fixed this in igb(4) here:
>>
>> http://svnweb.freebsd.org/base?view=revision&revision=233708
>>
>> You can try this for ixgbe(4). It also comments out a spurious taskqueue
>> reschedule from the watchdog handler that might also lower the taskqueue
>> usage. You can try changing that #if 0 to an #if 1 to test just the txeof
>> changes:
>
> Is anyone able to test this btw to see if it improves things on ixgbe at all?
> (I don't have any ixgbe hardware.)

Yes. I'll try to do this next week (since the ixgbe driver from at least 9-S
fails to detect a twinax cable which works in 8-S).

-- 
WBR, Alexander
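
For reference, the idea quoted earlier in this thread (collect the packets
under the RX lock, then hand them to if_input() only after the lock is
dropped, bounded by something like rx_process_limit) can be sketched in plain
userspace C. This is only an illustration under stated assumptions: struct
pkt, struct rx_ring, ring_put(), ring_drain() and demo_input() are
hypothetical names, a pthread mutex stands in for the RX ring lock, and none
of it is actual ixgbe(4) or igb(4) code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf: an id plus a link to the next packet. */
struct pkt {
    int         id;
    struct pkt *next;
};

/* Hypothetical stand-in for an RX ring protected by its own lock. */
struct rx_ring {
    pthread_mutex_t lock;      /* plays the role of the RX ring lock */
    int             lock_held; /* demo bookkeeping, single-threaded use only */
    struct pkt     *head;      /* packets waiting in the ring, FIFO order */
    struct pkt     *tail;
};

/* Callback type standing in for if_input(). */
typedef void (*input_fn)(struct rx_ring *, struct pkt *);

static void ring_init(struct rx_ring *r)
{
    pthread_mutex_init(&r->lock, NULL);
    r->lock_held = 0;
    r->head = r->tail = NULL;
}

static void ring_lock(struct rx_ring *r)
{
    pthread_mutex_lock(&r->lock);
    r->lock_held = 1;
}

static void ring_unlock(struct rx_ring *r)
{
    r->lock_held = 0;
    pthread_mutex_unlock(&r->lock);
}

/* Append one packet to the ring (what the hardware would do for us). */
static void ring_put(struct rx_ring *r, struct pkt *p)
{
    p->next = NULL;
    ring_lock(r);
    if (r->tail != NULL)
        r->tail->next = p;
    else
        r->head = p;
    r->tail = p;
    ring_unlock(r);
}

/*
 * Detach up to 'limit' packets onto a private list while holding the lock,
 * then call 'input' on each of them after the lock has been released.
 * Returns the number of packets delivered.
 */
static int ring_drain(struct rx_ring *r, int limit, input_fn input)
{
    struct pkt *batch = NULL, **lastp = &batch, *p;
    int n = 0;

    assert(r != NULL && input != NULL);

    ring_lock(r);
    while (n < limit && (p = r->head) != NULL) {
        r->head = p->next;
        if (r->head == NULL)
            r->tail = NULL;
        p->next = NULL;
        *lastp = p;          /* append to the private batch, keeping order */
        lastp = &p->next;
        n++;
    }
    ring_unlock(r);

    /* Delivery happens here, with the ring lock no longer held. */
    while ((p = batch) != NULL) {
        batch = p->next;
        p->next = NULL;
        input(r, p);
    }
    return n;
}

/* Demo receiver: records delivery order and whether the lock was held. */
static int seen_ids[16];
static int seen_count;
static int seen_with_lock_held;

static void demo_input(struct rx_ring *r, struct pkt *p)
{
    if (r->lock_held)
        seen_with_lock_held++;
    if (seen_count < 16)
        seen_ids[seen_count++] = p->id;
}
```

A caller would queue packets with ring_put() and then call
ring_drain(&ring, limit, demo_input) repeatedly until it returns 0. The
trade-off is the one already raised above: a larger limit amortizes the lock
better but adds latency for the packets sitting in the batch.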