Date: Mon, 15 Oct 2012 18:10:40 +0400
From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
To: Jack Vogel <jfvogel@gmail.com>
Cc: Luigi Rizzo <rizzo@iet.unipi.it>, net@freebsd.org
Subject: Re: ixgbe & if_igb RX ring locking
Message-ID: <507C1960.6050500@FreeBSD.org>
In-Reply-To: <CAFOYbc=N87_OECto7B8jdzmRZA-yoa_JWgvVc8kwpK9umO97rQ@mail.gmail.com>
References: <5079A9A1.4070403@FreeBSD.org> <20121013182223.GA73341@onelab2.iet.unipi.it> <CAFOYbc=N87_OECto7B8jdzmRZA-yoa_JWgvVc8kwpK9umO97rQ@mail.gmail.com>

On 13.10.2012 23:24, Jack Vogel wrote:
> On Sat, Oct 13, 2012 at 11:22 AM, Luigi Rizzo<rizzo@iet.unipi.it> wrote:
>>
>> one option could be (same as it is done in the timer
>> routine in dummynet) to build a list of all the packets
>> that need to be sent to if_input(), and then call
>> if_input with the entire list outside the lock.
>>
>> It would be even easier if we modify the various *_input()
>> routines to handle a list of mbufs instead of just one.

Bulk processing is generally a good idea that we probably should
implement. Probably starting from the driver queue and ending with marked
mbufs (OURS/forward/legacy processing (appletalk and similar))?

This can minimize the impact of all locks on the RX side:

L2
* rx PFIL hook

L3 (both IPv4 and IPv6)
* global IF_ADDR_RLOCK (currently commented out)
* Per-interface ADDR_RLOCK
* PFIL hook

At first glance, there can be problems with:
* Increased latency (we should have some kind of rx_process_limit), but
  still
* reader locks being held for a much longer time

>>
>> cheers
>> luigi
>>
>> Very interesting idea Luigi, will have to get that some thought.
>
> Jack

Returning to the original post topic:

Given that
1) we are currently binding ixgbe ithreads to CPU cores
2) the RX queue lock is used (indirectly) in only 2 places:
   a) the ISR routine (msix or legacy irq)
   b) the taskqueue routine, which is scheduled if some packets remain in
      the RX queue and rx_process_limit is exhausted OR we need
      something to TX
3) in practice the taskqueue routine is a nightmare for many people since
   there is no way to stop the "kernel {ix0 que}" thread eating 100% cpu
   after some traffic burst happens: once it is called it starts to
   schedule itself more and more, replacing the original ISR routine.
   Additionally, increasing rx_process_limit does not help since the
   taskqueue is called with the same limit. Finally, currently netisr
   taskq threads are not bound to any CPU, which makes the process even
   more uncontrollable.

Maybe we can rethink taskqueue usage for RX processing?
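For reference, the list-based delivery Luigi describes in the quoted text (collect packets while holding the RX lock, call the input routine only after dropping it) could look roughly like the userspace sketch below. Everything in it is hypothetical: struct pkt, struct rx_ring, rx_collect(), rx_pass() and record_input() are illustration names, not driver code, and the "lock" is a plain flag that only makes the locking boundary visible in a single thread. The limit argument plays the role of rx_process_limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in ixgbe/igb. */
struct pkt {
    int id;
    struct pkt *next;
};

struct rx_ring {
    int locked;           /* toy flag standing in for the RX mutex */
    struct pkt *pending;  /* packets already filled in by the "hardware" */
};

static void rx_lock(struct rx_ring *r)   { assert(!r->locked); r->locked = 1; }
static void rx_unlock(struct rx_ring *r) { assert(r->locked);  r->locked = 0; }

/* With the lock held: unlink up to `limit` packets, preserving order. */
static struct pkt *rx_collect(struct rx_ring *r, int limit)
{
    struct pkt *head = NULL, **tail = &head;

    assert(r->locked);
    while (limit-- > 0 && r->pending != NULL) {
        struct pkt *p = r->pending;
        r->pending = p->next;
        p->next = NULL;
        *tail = p;
        tail = &p->next;
    }
    return head;
}

/*
 * One RX pass: build the batch under the lock, then hand each packet
 * to `input` after the lock is dropped. Returns the number delivered.
 */
static int rx_pass(struct rx_ring *r, int limit,
                   void (*input)(struct rx_ring *, struct pkt *))
{
    struct pkt *batch, *next;
    int n = 0;

    rx_lock(r);
    batch = rx_collect(r, limit);
    rx_unlock(r);

    for (; batch != NULL; batch = next) {
        next = batch->next;
        batch->next = NULL;
        input(r, batch);
        n++;
    }
    return n;
}

/* Example consumer: records delivery order, checks the lock is free. */
static int delivered[16];
static int ndelivered;

static void record_input(struct rx_ring *r, struct pkt *p)
{
    assert(!r->locked);
    if (ndelivered < 16)
        delivered[ndelivered++] = p->id;
}
```

Calling rx_pass(&ring, limit, record_input) repeatedly drains the ring in bounded batches. The trade-off named above still applies: a larger batch amortizes locking but delays the first packet of the batch.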

I mean, taskq is called if the host fails to process packets in the ring
fast enough, which can happen when:
* a traffic burst happens on some (or all) queues
* the traffic rate is too high.

In the former case we have the ring buffer size, which can be tuned by
the administrator to a fairly big value.

For the latter case: if all system CPUs are used for RX processing,
moving some uncontrolled percentage of the load to a random CPU
definitely does no good (especially given that ixgbe has AIM and the RX
indirection table for that purpose, which can give much more predictable
results). It does even more harm in special setups like
rx_queues=CPU_COUNT-1, where the last CPU is used by all other processes
including control plane ones (routing software, various keepalives).

If the system has more CPUs (24 vs 16 queues, for example) there is a
standard way of distributing load: netisr and deferred processing.
Netisr threads are already CPU-bound and, more importantly, splitting
packets between threads can be done by performing some (say, L3+L4) hash
computation, which will not lead to out-of-order packet processing.

>
>> So my questions are:
>>>
>>> Can any real LORs happen in some complex setup? (I can't imagine any).
>>> If so: maybe we can somehow avoid/workaround such cases? (and consider
>>> removing those locks).
>>>
>>>
>>>
>>> --
>>> WBR, Alexander
>>>
>>> _______________________________________________
>>> freebsd-net@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>
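
To illustrate the L3+L4 hash point made above: if the worker thread is chosen purely from the flow's addresses, ports and protocol, every packet of one flow lands on the same thread and so cannot be reordered against its own flow. The sketch below is only an assumption-laden illustration: struct flow_key, flow_hash() and flow_to_worker() are hypothetical names, and FNV-1a is used simply because it is short. It is not the hash netisr or the NIC actually uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key: the L3 addresses plus the L4 ports and protocol. */
struct flow_key {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
};

/* Mix the low `nbytes` bytes of v into h using 32-bit FNV-1a. */
static uint32_t fnv1a_mix(uint32_t h, uint32_t v, int nbytes)
{
    for (int i = 0; i < nbytes; i++) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= 16777619u;              /* FNV prime */
    }
    return h;
}

/* Hash field by field so struct padding never enters the hash. */
static uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t h = 2166136261u;        /* FNV offset basis */

    h = fnv1a_mix(h, k->src_ip, 4);
    h = fnv1a_mix(h, k->dst_ip, 4);
    h = fnv1a_mix(h, k->src_port, 2);
    h = fnv1a_mix(h, k->dst_port, 2);
    h = fnv1a_mix(h, k->proto, 1);
    return h;
}

/* Same flow key always yields the same worker index in [0, nworkers). */
static unsigned flow_to_worker(const struct flow_key *k, unsigned nworkers)
{
    assert(nworkers > 0);
    return flow_hash(k) % nworkers;
}
```

Note that this hash is not symmetric: the two directions of a connection may map to different workers, although each direction stays in order on its own.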