Date: Wed, 17 Oct 2012 10:06:51 -0400
From: John Baldwin <jhb@freebsd.org>
To: freebsd-net@freebsd.org
Cc: "Alexander V. Chernikov" <melifaro@freebsd.org>, Luigi Rizzo <rizzo@iet.unipi.it>, Jack Vogel <jfvogel@gmail.com>, net@freebsd.org
Subject: Re: ixgbe & if_igb RX ring locking
Message-ID: <201210171006.51214.jhb@freebsd.org>
In-Reply-To: <201210150904.27567.jhb@freebsd.org>
References: <5079A9A1.4070403@FreeBSD.org> <507C1960.6050500@FreeBSD.org> <201210150904.27567.jhb@freebsd.org>
On Monday, October 15, 2012 9:04:27 am John Baldwin wrote:
> On Monday, October 15, 2012 10:10:40 am Alexander V. Chernikov wrote:
> > On 13.10.2012 23:24, Jack Vogel wrote:
> > > On Sat, Oct 13, 2012 at 11:22 AM, Luigi Rizzo <rizzo@iet.unipi.it> wrote:
> > >
> > >> one option could be (same as it is done in the timer
> > >> routine in dummynet) to build a list of all the packets
> > >> that need to be sent to if_input(), and then call
> > >> if_input with the entire list outside the lock.
> > >>
> > >> It would be even easier if we modify the various *_input()
> > >> routines to handle a list of mbufs instead of just one.
> >
> > Bulk processing is generally a good idea that we probably should
> > implement, probably starting from the driver queue and ending with
> > marked mbufs (OURS/forward/legacy processing (appletalk and similar))?
> >
> > This can minimize the impact on all locks on the RX side:
> > L2:
> > * rx PFIL hook
> > L3 (both IPv4 and IPv6):
> > * global IF_ADDR_RLOCK (currently commented out)
> > * per-interface ADDR_RLOCK
> > * PFIL hook
> >
> > At first glance, there can be problems with:
> > * increased latency (we should have some kind of rx_process_limit), but still
> > * reader locks being acquired for a much longer amount of time
> >
> > >> cheers
> > >> luigi
> > >
> > > Very interesting idea Luigi, will have to give that some thought.
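[Editor's sketch of the batching pattern Luigi describes above: drain the RX ring into a private list while holding the ring lock, then hand each packet to if_input() only after the lock is dropped. This is a standalone userspace model, not driver code; `struct mbuf`, `struct rxq`, and `if_input` here are simplified stand-ins for the kernel structures, and `m_nextpkt` is the field the real mbuf uses to chain packets.]

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal stand-in for the kernel mbuf: only the packet chain pointer. */
struct mbuf {
	struct mbuf *m_nextpkt;
	int id;
};

/* Stand-in for a driver RX queue with its per-ring lock. */
struct rxq {
	pthread_mutex_t lock;
	struct mbuf *head;	/* packets waiting on the ring */
};

static int delivered;		/* counts packets seen by "if_input" */

/* Stand-in for the (potentially slow) network-stack input path. */
static void
if_input(struct mbuf *m)
{
	delivered++;
	free(m);
}

/*
 * The pattern under discussion: steal the whole chain under the lock,
 * then deliver every packet with the lock released.
 */
static void
rxq_process(struct rxq *q)
{
	struct mbuf *list, *m;

	pthread_mutex_lock(&q->lock);
	list = q->head;		/* detach the entire pending chain */
	q->head = NULL;
	pthread_mutex_unlock(&q->lock);

	/* Lock is no longer held across the stack traversal. */
	while ((m = list) != NULL) {
		list = m->m_nextpkt;
		m->m_nextpkt = NULL;
		if_input(m);
	}
}
```

The point of the detach-then-deliver shape is that the ring lock is held only for two pointer assignments, no matter how expensive the per-packet input processing is.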
> > >
> > > Jack
> >
> > Returning to the original post topic:
> >
> > Given that
> > 1) we are currently binding ixgbe ithreads to CPU cores
> > 2) the RX queue lock is (indirectly) used in only 2 places:
> >    a) the ISR routine (MSI-X or legacy irq)
> >    b) the taskqueue routine, which is scheduled if some packets remain
> >       in the RX queue and rx_process_limit was reached, OR we need
> >       something to TX
> >
> > 3) in practice the taskqueue routine is a nightmare for many people,
> > since there is no way to stop the "kernel {ix0 que}" thread from
> > eating 100% CPU after some traffic burst happens: once it is called,
> > it starts to schedule itself more and more, replacing the original
> > ISR routine. Additionally, increasing rx_process_limit does not help,
> > since the taskqueue is called with the same limit. Finally, netisr
> > taskq threads are currently not bound to any CPU, which makes the
> > process even more uncontrollable.

I think part of the problem here is that the taskqueue in ixgbe(4) is
bogusly rescheduled for TX handling. Instead, ixgbe_msix_que() should
just start transmitting packets directly.

I fixed this in igb(4) here:

http://svnweb.freebsd.org/base?view=revision&revision=233708

You can try this for ixgbe(4). It also comments out a spurious taskqueue
reschedule from the watchdog handler that might also lower the taskqueue
usage. You can try changing that #if 0 to an #if 1 to test just the txeof
changes:

Is anyone able to test this btw to see if it improves things on ixgbe at
all? (I don't have any ixgbe hardware.)

-- 
John Baldwin
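[Editor's note: a toy model of the scheduling policy discussed above: the queue handler processes at most rx_process_limit packets, drains TX completions directly inline, and re-queues the taskqueue only when RX work is actually left over, rather than rescheduling for TX as well. All names (`struct que`, `rxeof`, `txeof`, `que_handler`) are illustrative userspace stand-ins, not the actual igb(4)/ixgbe(4) code paths.]

```c
/* Toy RX queue: just a pending-packet count and a reschedule counter. */
struct que {
	int rx_pending;	/* packets still on the RX ring */
	int resched;	/* how many times the taskqueue was re-queued */
};

/* Consume up to 'limit' packets; return nonzero if the ring is not empty. */
static int
rxeof(struct que *q, int limit)
{
	int n = q->rx_pending < limit ? q->rx_pending : limit;

	q->rx_pending -= n;
	return (q->rx_pending != 0);
}

/* TX completions are handled inline; no taskqueue involvement. */
static void
txeof(struct que *q)
{
	(void)q;
}

/* The interrupt/taskqueue handler body under the fixed policy. */
static void
que_handler(struct que *q, int rx_process_limit)
{
	int more = rxeof(q, rx_process_limit);

	txeof(q);		/* transmit directly instead of re-queueing */
	if (more)
		q->resched++;	/* reschedule only for leftover RX work */
}
```

With a burst of 250 packets and a limit of 100, the handler runs three times and re-queues itself only twice; once the ring is drained it stops, instead of pinning the thread by rescheduling for TX.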