From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 15:48:03 2012
Date: Fri, 19 Oct 2012 08:48:01 -0700
Subject: Re: ixgbe & if_igb RX ring locking
From: Jack Vogel
To: "Alexander V. Chernikov"
Cc: freebsd-net@freebsd.org, Luigi Rizzo, John Baldwin, net@freebsd.org
In-Reply-To: <50817057.3090200@FreeBSD.org>
References: <5079A9A1.4070403@FreeBSD.org> <507C1960.6050500@FreeBSD.org>
 <201210150904.27567.jhb@freebsd.org> <201210171006.51214.jhb@freebsd.org>
 <50817057.3090200@FreeBSD.org>

On Fri, Oct 19, 2012 at 8:23 AM, Alexander V. Chernikov
<melifaro@freebsd.org> wrote:

> On 17.10.2012 18:06, John Baldwin wrote:
>
>> On Monday, October 15, 2012 9:04:27 am John Baldwin wrote:
>>
>>> On Monday, October 15, 2012 10:10:40 am Alexander V. Chernikov wrote:
>>>
>>>> On 13.10.2012 23:24, Jack Vogel wrote:
>>>>
>>>>> On Sat, Oct 13, 2012 at 11:22 AM, Luigi Rizzo wrote:
>>>>>
>>>>>> one option could be (same as it is done in the timer
>>>>>> routine in dummynet) to build a list of all the packets
>>>>>> that need to be sent to if_input(), and then call
>>>>>> if_input with the entire list outside the lock.
>>>>>>
>>>>>> It would be even easier if we modify the various *_input()
>>>>>> routines to handle a list of mbufs instead of just one.
>>>>>>
>>>>>
>>>> Bulk processing is generally a good idea we probably should implement.
>>>> Probably starting from driver queue ending with marked mbufs
>>>> (OURS/forward/legacy processing (appletalk and similar))?
>>>>
>>>> This can minimize an impact for all locks on RX side:
>>>> L2
>>>> * rx PFIL hook
>>>> L3 (both IPv4 and IPv6)
>>>> * global IF_ADDR_RLOCK (currently commented out)
>>>> * Per-interface ADDR_RLOCK
>>>> * PFIL hook
>>>>
>>>> From the first glance, there can be problems with:
>>>> * Increased latency (we should have some kind of rx_process_limit),
>>>>   but still
>>>> * reader locks being acquired for much longer amount of time
>>>>
>>>>
>>>>>> cheers
>>>>>> luigi
>>>>>>
>>>>>> Very interesting idea Luigi, will have to give that some thought.
>>>>>>
>>>>>
>>>>> Jack
>>>>>
>>>>
>>>> Returning to original post topic:
>>>>
>>>> Given
>>>> 1) we are currently binding ixgbe ithreads to CPU cores
>>>> 2) RX queue lock is used (indirectly) in only 2 places:
>>>> a) ISR routine (msix or legacy irq)
>>>> b) taskqueue routine which is scheduled if some packets remain in RX
>>>> queue and rx_process_limit ended OR we need something to TX
>>>>
>>>> 3) in practice taskqueue routine is a nightmare for many people since
>>>> there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after
>>>> some traffic burst happens: once it is called it starts to schedule
>>>> itself more and more replacing original ISR routine. Additionally,
>>>> increasing rx_process_limit does not help since taskqueue is called with
>>>> the same limit. Finally, currently netisr taskq threads are not bound to
>>>> any CPU which makes the process even more uncontrollable.
>>>>
>>>
>>> I think part of the problem here is that the taskqueue in ixgbe(4) is
>>> bogusly rescheduled for TX handling. Instead, ixgbe_msix_que() should
>>> just start transmitting packets directly.
>>>
>>> I fixed this in igb(4) here:
>>>
>>> http://svnweb.freebsd.org/base?view=revision&revision=233708
>>>
>>> You can try this for ixgbe(4). It also comments out a spurious taskqueue
>>> reschedule from the watchdog handler that might also lower the taskqueue
>>> usage. You can try changing that #if 0 to an #if 1 to test just the
>>> txeof changes:
>>>
>>
>> Is anyone able to test this btw to see if it improves things on ixgbe at
>> all?
>> (I don't have any ixgbe hardware.)
>>
> Yes. I'll try to do this next week (since ixgbe driver from at least 9-S
> fails to detect twinax cable which works in 8-S....)).
>

If you have a major problem like this, you might want to put it in a bug
report, or at least an email with that specific topic, rather than bury it
in an unrelated thread in a parenthetical remark :(

This is the first I've heard of this. Did you check the code on HEAD to see
if it also has the issue?

Jack
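
For readers following the thread, Luigi's suggestion quoted above (collect packets while the RX lock is held, then hand the whole batch to the input path after the lock is dropped) can be sketched roughly in userland C. This is only an illustration under stated assumptions, not driver code: struct pkt, struct rx_ring, ring_enqueue(), ring_drain() and deliver_batch() are hypothetical names, a pthread mutex stands in for the RX ring lock, a byte counter stands in for if_input(), and the limit argument plays the role of rx_process_limit.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf: a singly linked packet. */
struct pkt {
    struct pkt *next;
    int         len;
};

/* Hypothetical stand-in for a driver RX ring protected by a lock. */
struct rx_ring {
    pthread_mutex_t lock;
    struct pkt     *head;   /* oldest packet waiting in the ring */
    struct pkt     *tail;   /* newest packet waiting in the ring */
};

static void
ring_init(struct rx_ring *r)
{
    pthread_mutex_init(&r->lock, NULL);
    r->head = r->tail = NULL;
}

/* Producer side: append one packet to the ring under the lock. */
static void
ring_enqueue(struct rx_ring *r, struct pkt *p)
{
    assert(p != NULL);
    p->next = NULL;
    pthread_mutex_lock(&r->lock);
    if (r->tail != NULL)
        r->tail->next = p;
    else
        r->head = p;
    r->tail = p;
    pthread_mutex_unlock(&r->lock);
}

/*
 * Detach up to 'limit' packets from the ring while holding the lock
 * and return them as a private list. Only pointer manipulation
 * happens under the lock; no packet is processed here.
 */
static struct pkt *
ring_drain(struct rx_ring *r, int limit)
{
    struct pkt *batch = NULL, *last = NULL;
    int n = 0;

    pthread_mutex_lock(&r->lock);
    while (r->head != NULL && n < limit) {
        struct pkt *p = r->head;

        r->head = p->next;
        if (r->head == NULL)
            r->tail = NULL;
        p->next = NULL;
        if (last != NULL)
            last->next = p;
        else
            batch = p;
        last = p;
        n++;
    }
    pthread_mutex_unlock(&r->lock);
    return batch;
}

/*
 * Stand-in for handing the batch to the stack. It is called with no
 * ring lock held, adds each packet length to *bytes, and returns the
 * number of packets delivered.
 */
static int
deliver_batch(struct pkt *batch, int *bytes)
{
    int n = 0;

    for (struct pkt *p = batch; p != NULL; p = p->next) {
        *bytes += p->len;
        n++;
    }
    return n;
}
```

A caller would invoke ring_drain() and then pass the returned list to deliver_batch(). The ring lock is then held only long enough to unlink packets, and the limit bounds how much work piles up per pass, which is the latency concern Alexander raises above.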