Date: Fri, 30 Nov 2001 08:44:21 -0800
From: Luigi Rizzo <rizzo@aciri.org>
To: Bruce Evans <bde@zeta.org.au>
Cc: net@FreeBSD.ORG
Subject: Re: Revised polling code for STABLE
Message-ID: <20011130084421.A30672@iguana.aciri.org>
In-Reply-To: <20011130174232.R347-100000@gamplex.bde.org>
References: <20011129063116.B19430@iguana.aciri.org> <20011130174232.R347-100000@gamplex.bde.org>

On Fri, Nov 30, 2001 at 06:37:33PM +1100, Bruce Evans wrote:
> On Thu, 29 Nov 2001, Luigi Rizzo wrote:
>
> > The call to update_poll_threshold() however needs to be done by
> > either hardclock() or statclock() (i am experimenting with that
...
> Why in that particular interrupt?  If you do it in a timeout routine,

I'll try to explain. One of the goals of polling is to be able to
adapt the fraction of CPU dedicated to polling so that it matches
some user-programmable threshold -- and in any case, to make sure
that you never use 100% of the CPU doing polling, because in that
case you would have livelock and a non-responsive system.

The amount of polling is (roughly) controlled by a variable,
"poll_burst", which tells each driver how many packets to grab at
most from the card at each xxx_poll() invocation. The goal of the
control system is to dynamically adjust poll_burst depending on
actual CPU speed, incoming traffic etc.

Out of the infinitely many ways of doing this, I am experimenting
with a couple:

 + (what is implemented now) Make sure that ether_poll() is run as
   the first thing after hardclock. Then in hardclock(), check if
   the CPU is still in the previous instance of ether_poll(). If
   so, reduce poll_burst (currently, halve it), otherwise increase
   it by one every N clock ticks.
   This serves to detect potential livelock (ether_poll() consuming
   an entire tick) and react accordingly.

 + (what I want to test next) Use the profiling clock to sample
   CPU state, and determine (over the long term) what fraction of
   these samples finds the CPU busy in ether_poll(). If the fraction
   is higher than a programmed threshold then reduce poll_burst,
   otherwise increment it.
   This might give more fine-grained control, but it only works on
   a longer time scale, and still I should check

The first scheme is just a stopgap measure: it works, but it is
rather coarse.
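
The first scheme above (halve on overrun, add one every N quiet
ticks) can be sketched roughly as follows. This is only an
illustration, not the code from the patch: update_poll_burst(), the
still_polling argument, ticks_since_grow and the POLL_* constants
are hypothetical names and values; only poll_burst, ether_poll() and
hardclock() come from the message.

```c
/* Hypothetical limits and growth interval; not taken from the patch. */
#define POLL_BURST_MIN   1
#define POLL_BURST_MAX   1000
#define POLL_GROW_TICKS  4      /* the "N" clock ticks between increments */

static int poll_burst = 5;      /* max packets per xxx_poll() invocation */
static int ticks_since_grow;    /* quiet ticks since the last change */

/*
 * Meant to be called once per clock tick, e.g. from hardclock().
 * still_polling is nonzero if the previous ether_poll() had not
 * finished when this tick arrived.
 */
static void
update_poll_burst(int still_polling)
{
	if (still_polling) {
		/* Polling consumed a whole tick: halve the burst. */
		poll_burst /= 2;
		if (poll_burst < POLL_BURST_MIN)
			poll_burst = POLL_BURST_MIN;
		ticks_since_grow = 0;
	} else if (++ticks_since_grow >= POLL_GROW_TICKS) {
		/* N ticks without an overrun: grow by one. */
		if (poll_burst < POLL_BURST_MAX)
			poll_burst++;
		ticks_since_grow = 0;
	}
}
```

The asymmetry (fast decrease, slow increase) is what lets the
control back off quickly when ether_poll() threatens to monopolize
the CPU, while probing upward only gradually.
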
And it needs to run ether_poll() right after hardclock(), otherwise
I would have to deal with the extra delay, and I am not very clear
on how to do that.

For the second scheme I probably do not need to run ether_poll()
from hardclock. But as you notice there are issues on how fair the
sampling would be, which I haven't had the time to deal with yet.

> In -current, hardclock() and softclock() are called in fast interrupt
> context, so adding to them is both nontrivial and BAD (it requires

Well, the code I am adding is really short, look at the patch! It
boils down to reading one variable and updating poll_burst
accordingly. Plus, the polling code is not designed for SMP anyway:
you cannot specify both the ETHER_POLLING and SMP options (I already
tried to explain why I think polling is mostly useless with SMP).

> > I do not know how/if I can replace the schednetisr() with an ordinary timeout:
> > i need the handler to be invoked as soon as possible after the
> > clock ticks, as the task it performs are as urgent as interrupt requests.
>
> As urgent as hardclock interrupts?

I'd say yes: as urgent, but with a lower priority. If there are
packets to fetch from the cards, those are packets for which there
would be an interrupt pending in a traditional system, so they would
be processed after hardclock but before pending soft interrupts.
I would like to keep this prioritization as much as possible.

I have considered adding another interrupt class (as a matter of
fact, I even did that) with higher priority than other SWI, but I
thought it was overkill because splsoftnet(NETISR_POLL) still seems
to be the highest priority soft interrupt and the change is less
intrusive. Am I wrong on this ?

> Ugh.  Use a standard FreeBSD C interface for this (critical_exit() in
> -current; enable_intr() for i386's only in RELENG_4).

OK, I just copied the "sti"/"cli" from vm_page_zero_idle() !

> >	retval = ether_poll(poll_burst);
> >	__asm __volatile("cli" : : : "memory");
> >	splx(s);
> >	vm_page_zero_idle();
> >	return 1;
> >     } else
> >	return vm_page_zero_idle();
> > }
>
> It's ugly for idle_poll() to know about all the other poll routines
> (we've had other there before, e.g., one for apm).

Someone has to know -- this idle_poll() is basically a part of the
idle_loop written in C instead of assembler. The logic can be a bit
convoluted, so I think it is much more readable this way.

Does the above make sense ?

	cheers
	luigi

----------------------------------+-----------------------------------------
 Luigi RIZZO, luigi@iet.unipi.it  . ACIRI/ICSI (on leave from Univ. di Pisa)
 http://www.iet.unipi.it/~luigi/  . 1947 Center St, Berkeley CA 94704
 Phone: (510) 666 2927
----------------------------------+-----------------------------------------

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-net" in the body of the message