Date: Thu, 23 Apr 2009 19:04:08 +0000
From: Ed Maste <emaste@freebsd.org>
To: Andrew Brampton <brampton+freebsd-net@gmail.com>
Cc: attilio@freebsd.org, freebsd-net@freebsd.org, Luigi Rizzo <rizzo@iet.unipi.it>
Subject: Re: Interrupts + Polling mode (similar to Linux's NAPI)
Message-ID: <20090423190408.GA65895@jem.dhs.org>
In-Reply-To: <d41814900903270405p19d26d94r7c7351adca05f283@mail.gmail.com>
References: <d41814900903261747v28d3de29t10bb1b8128de635c@mail.gmail.com> <20090327071742.GA87385@onelab2.iet.unipi.it> <d41814900903270405p19d26d94r7c7351adca05f283@mail.gmail.com>

On Fri, Mar 27, 2009 at 11:05:00AM +0000, Andrew Brampton wrote:

> 2009/3/27 Luigi Rizzo <rizzo@iet.unipi.it>:
> > The load of polling is pretty low (within 1% or so) even with
> > polling. The advantage of having interrupts is faster response
> > to incoming traffic, not CPU load.
>
> oh, I was under the impression that polling spun in a tight loop, thus
> using 100% of the processor. After a quick test I see this is not the
> case. I assume it will get to 100% CPU load if I saturate my network.

Yes, polling has a limit on the maximum CPU time it will use, and it
uses less than that limit when there is no traffic. There are a number
of sysctls under kern.polling that control its behaviour:

* kern.polling.user_frac: Desired user fraction of cpu time

  This attempts to reserve at least a specified percentage of available
  CPU time for user processes; polling tries to limit its percentage
  use to 100 less this value.

* kern.polling.burst: Current polling burst size
* kern.polling.burst_max: Max Polling burst size
* kern.polling.each_burst: Max size of each burst

  These three control the number of packets that polling processes per
  call / tick. Packets are processed in batches of each_burst, up to
  burst packets total per tick. The value of burst is capped at
  burst_max.

In order to keep the user_frac CPU percentage available for non-polling
work, a feedback loop is used that controls the value of burst. Each
time a batch of packets is processed, burst is incremented or
decremented by 1, depending on how much CPU time polling actually used.
In addition, if polling extends beyond the next tick, burst is scaled
back to 7/8ths of its current value.

Polling was originally implemented as a livelock-avoidance technique for
the single-core case -- the primary goal is to guarantee the
availability of the CPU time specified in user_frac. The current
algorithm does not behave that well if user_frac is set low.
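
The feedback loop described above can be modeled in a few lines. What
follows is a minimal sketch in Python, not the kernel code: the function
names, the poll_pct and overran_tick inputs, and the example values are
assumptions made for illustration, and applying the 7/8 scale-back after
the +/-1 step is only one reading of the description.

```python
def batches_per_tick(burst, each_burst):
    """Split one tick's budget of `burst` packets into batches of at most
    `each_burst` packets. Returns the list of batch sizes."""
    sizes = []
    remaining = burst
    while remaining > 0:
        n = min(each_burst, remaining)
        sizes.append(n)
        remaining -= n
    return sizes


def adjust_burst(burst, burst_max, user_frac, poll_pct, overran_tick):
    """Return the next value of `burst`.

    poll_pct     -- hypothetical input: percentage of CPU time (0-100)
                    that polling used in the interval just measured
    overran_tick -- hypothetical input: True if polling ran past the
                    start of the next tick
    """
    # Polling tries to stay within (100 - user_frac) percent of the CPU.
    if poll_pct > 100 - user_frac:
        if burst > 1:
            burst -= 1          # used too much: back off by one packet
    elif burst < burst_max:
        burst += 1              # under budget: grow by one, up to burst_max

    if overran_tick:
        # Scale back to roughly 7/8ths of the current value, never below 1.
        burst = max(1, burst - burst // 8)
    return burst
```

With burst=12 and each_burst=5, batches_per_tick returns [5, 5, 2]: three
passes over the interfaces in one tick, whether or not packets are
waiting.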
Setting user_frac low is reasonable if the workload is largely in-kernel
(i.e., bridging or routing), or when running SMP.

Another downside of the current implementation is that interfaces will
be polled multiple times per tick (burst / each_burst times), even if
there are no packets to process.

At work we've developed a replacement polling algorithm that keeps track
of the actual amount of time spent per packet, and uses that as the
feedback to control the number of packets in each batch. This work
requires a change to the polling KPI: the polling handlers have to
return the count of packets actually handled.

My hope is to get the KPI change committed in time for 8.0, even if we
don't switch the algorithm yet. Attilio (on CC:) and I will make the
patch set for the KPI change available shortly for comment.

-Ed
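
As a rough illustration of time-per-packet feedback, here is a small
Python model. It is not the patch set mentioned above, which this message
does not show; every name, the microsecond budget, and the clamping
policy are hypothetical. The only thing it takes from the message is the
idea that a handler reports how many packets it actually handled, so the
measured time can be divided by that count.

```python
def update_estimate(prev_per_packet_us, elapsed_us, handled):
    """Return a new per-packet cost estimate in microseconds.

    handled -- packet count reported by the (hypothetical) poll handler.
    If nothing was handled there is nothing to measure, so the previous
    estimate is kept.
    """
    if handled == 0:
        return prev_per_packet_us
    return elapsed_us / handled


def batch_size(budget_us, per_packet_us, max_batch):
    """Pick how many packets to process in the next batch so that the
    expected cost fits in budget_us, clamped to the range 1..max_batch."""
    if per_packet_us <= 0:
        return max_batch        # no usable estimate yet: assumed fallback
    return max(1, min(max_batch, int(budget_us // per_packet_us)))
```

A caller would time each handler invocation, feed the elapsed time and
the returned count into update_estimate, and size the next batch with
batch_size.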