Date:      Fri, 30 Nov 2001 08:44:21 -0800
From:      Luigi Rizzo <rizzo@aciri.org>
To:        Bruce Evans <bde@zeta.org.au>
Cc:        net@FreeBSD.ORG
Subject:   Re: Revised polling code for STABLE
Message-ID:  <20011130084421.A30672@iguana.aciri.org>
In-Reply-To: <20011130174232.R347-100000@gamplex.bde.org>
References:  <20011129063116.B19430@iguana.aciri.org> <20011130174232.R347-100000@gamplex.bde.org>

On Fri, Nov 30, 2001 at 06:37:33PM +1100, Bruce Evans wrote:
> On Thu, 29 Nov 2001, Luigi Rizzo wrote:
> 
> > The call to update_poll_threshold() however needs to be done by
> > either hardclock() or statclock() (i am experimenting with that
...
> Why in that particular interrupt?  If you do it in a timeout routine,

i'll try to explain.

One of the goals of polling is to be able to adapt the fraction of
CPU dedicated to polling so that it matches some user-programmable
threshold -- and in any case, make sure that you are never using
100% of the CPU doing polling because in this case you would have
livelock and a non-responsive system.

The amount of polling is (roughly) controlled by a variable, "poll_burst",
which tells each driver how many packets to grab at most from the card
at each xxx_poll() invocation. The goal of the control system is to
dynamically adjust poll_burst depending on actual CPU speed, incoming
traffic etc.

Out of the infinitely many ways of doing this, i am experimenting with
a couple:

 + (what is implemented now) Make sure that ether_poll() is run
   as the first thing after hardclock. Then in hardclock(), check
   if the CPU is still in the previous instance of ether_poll().
   If so, reduce poll_burst (currently, halve it), otherwise
   increase it by one every N clock ticks.
   This serves to detect potential livelock (ether_poll() consuming
   an entire tick) and react accordingly.

 + (what i want to test next). Use the profiling clock to sample
   cpu state, and determine (over the long term) what fraction of
   these samples finds the CPU busy in ether_poll(). If the fraction
   is higher than a programmed threshold then reduce poll_burst,
   otherwise increment it.
   This might give finer-grained control, but only works
   over a longer time scale, and still I should check

The first scheme is just a stopgap measure: it works, but it is
rather coarse. It also needs to run ether_poll() right after
hardclock(); otherwise i would have to deal with the extra delay,
and I am not very clear on how to do that.
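
A minimal userland sketch of the first scheme, not the code from the
patch. It models the check done on each clock tick: if the previous
ether_poll() is still running, poll_burst is halved; otherwise it grows
by one every N ticks. The struct, the in_poll flag, the bounds and the
value of N are assumptions made for illustration; only poll_burst and
update_poll_threshold() are names from this thread, and the signature
used here is hypothetical.

```c
/* Userland model only; everything except the name poll_burst is assumed. */
#define POLL_BURST_MIN  1       /* assumed lower bound */
#define POLL_BURST_MAX  1000    /* assumed upper bound */
#define GROW_INTERVAL   4       /* the "N" clock ticks; value assumed */

struct poll_ctl {
	int poll_burst;     /* max packets per xxx_poll() invocation */
	int in_poll;        /* nonzero while ether_poll() is still running */
	int ticks_ok;       /* clean ticks since the last adjustment */
};

/* Called once per clock tick, the way hardclock() would call it. */
static void
update_poll_threshold(struct poll_ctl *pc)
{
	if (pc->in_poll) {
		/* ether_poll() consumed a whole tick: potential livelock. */
		pc->poll_burst /= 2;
		if (pc->poll_burst < POLL_BURST_MIN)
			pc->poll_burst = POLL_BURST_MIN;
		pc->ticks_ok = 0;
	} else if (++pc->ticks_ok >= GROW_INTERVAL) {
		/* N clean ticks in a row: probe for a slightly larger burst. */
		pc->ticks_ok = 0;
		if (pc->poll_burst < POLL_BURST_MAX)
			pc->poll_burst++;
	}
}
```

The asymmetry (halve on trouble, add one when idle) is what makes the
scheme back off quickly and recover slowly.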

For the second scheme i probably do not need to run ether_poll()
from hardclock. But as you notice there are issues on how fair the
sampling would be, which I haven't had the time to deal with yet.
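
A rough sketch of how the second scheme could look, again as a userland
model rather than kernel code. Each profiling-clock sample records
whether the CPU was found in ether_poll(); at the end of a window the
observed fraction is compared with the programmed threshold. The window
length, the struct, the function name and the choice to decrease by one
are all assumptions; the text above only says "reduce" and "increment".

```c
#define SAMPLE_WINDOW   100     /* samples per decision; value assumed */

struct poll_sampler {
	int poll_burst;         /* max packets per xxx_poll() invocation */
	int threshold_pct;      /* user-programmed CPU share, in percent */
	int samples;            /* samples seen in the current window */
	int in_poll_samples;    /* of those, samples found in ether_poll() */
};

/* Hypothetical hook, called once per profiling-clock sample. */
static void
sample_poll_state(struct poll_sampler *ps, int cpu_in_poll)
{
	ps->samples++;
	if (cpu_in_poll)
		ps->in_poll_samples++;
	if (ps->samples < SAMPLE_WINDOW)
		return;

	/* Compare in_poll/samples with threshold/100 without dividing. */
	if (ps->in_poll_samples * 100 > ps->threshold_pct * ps->samples) {
		if (ps->poll_burst > 1)
			ps->poll_burst--;   /* assumed step size */
	} else {
		ps->poll_burst++;
	}
	ps->samples = 0;
	ps->in_poll_samples = 0;
}
```

Nothing here addresses the fairness of the sampling, which is the open
issue mentioned above.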

> In -current, hardclock() and softclock() are called in fast interrupt
> context, so adding to them is both nontrivial and BAD (it requires

well the code i am adding is really short, look at the patch!
It boils down to reading one variable and updating poll_burst
accordingly.
Plus the polling code is not designed for SMP anyway: you cannot
specify both ETHER_POLLING and SMP options (I already tried to
explain why i think polling is mostly useless with SMP).

> > I do not know how/if I can replace the schednetisr() with an ordinary timeout:
> > i need the handler to be invoked as soon as possible after the
> > clock ticks, as the task it performs are as urgent as interrupt requests.
> 
> As urgent as hardclock interrupts?

I'd say yes: as urgent, but with a lower priority.
If there are packets to fetch from the cards, those are packets
for which there would be an interrupt pending in a traditional
system, so they would be processed after hardclock but before
pending soft interrupts.  I would like to keep this prioritization
as much as possible.

I have considered adding another interrupt class (as a matter of fact,
i even did that) with higher priority than the other SWIs, but i thought
it was overkill because splsoftnet(NETISR_POLL) still seems to be
the highest priority soft interrupt and the change is less intrusive.
Am I wrong on this?

> Ugh.  Use a standard FreeBSD C interface for this (critical_exit() in
> -current; enable_intr() for i386's only in RELENG_4).

ok, i just copied the "sti"/"cli" from vm_page_zero_idle() !

> > 		    retval = ether_poll(poll_burst);
> > 		    __asm __volatile("cli" : : : "memory");
> > 		    splx(s);
> > 		    vm_page_zero_idle();
> > 		    return 1;
> > 		} else
> > 		    return vm_page_zero_idle();
> > 	}
> 
> It's ugly for idle_poll() to know about all the other poll routines
> (we've had other there before, e.g., one for apm).

Someone has to know: this idle_poll() is basically a part of the
idle_loop written in C instead of assembler. The logic can be a bit
convoluted, so i think it is much more readable this way.

Does the above make sense?

	cheers
	luigi
----------------------------------+-----------------------------------------
 Luigi RIZZO, luigi@iet.unipi.it  . ACIRI/ICSI (on leave from Univ. di Pisa)
 http://www.iet.unipi.it/~luigi/  . 1947 Center St, Berkeley CA 94704
 Phone: (510) 666 2927
----------------------------------+-----------------------------------------
