Date: Mon, 30 Aug 2010 00:24:58 -0500
From: Brandon Gooch <jamesbrandongooch@gmail.com>
To: Alexander Motin <mav@freebsd.org>
Cc: freebsd-hackers@freebsd.org, FreeBSD-Current <freebsd-current@freebsd.org>
Subject: Re: One-shot-oriented event timers management
Message-ID: <AANLkTi=uSbOjxGT-1O-Su3YE7%2B_2R7jdjp7vFe6B_XEX@mail.gmail.com>
In-Reply-To: <4C7A5C28.1090904@FreeBSD.org>
References: <4C7A5C28.1090904@FreeBSD.org>
2010/8/29 Alexander Motin <mav@freebsd.org>:
> Hi.
>
> I would like to present my new work on timers management code.
>
> In my previous work I was mostly orienting on reimplementing existing
> functionality in better way. The result seemed not bad, but after
> looking on perspectives of using event timers in one-shot (aperiodic)
> mode I've understood that implemented code complexity made it hardly
> possible. So I had to significantly cut it down and rewrite from the new
> approach, which is instead primarily oriented on using timers in
> one-shot mode. As soon as some systems have only periodic timers I have
> left that functionality, though it was slightly limited.
>
> New management code implements two modes of operation: one-shot and
> periodic. Specific mode to be used depends on hardware capabilities and
> can be controlled.
>
> In one-shot mode hardware timers programmed to generate single interrupt
> precisely at the time of next wanted event. It is done by comparing
> current binuptime with next scheduled times of system events
> (hard-/stat-/profclock). This approach has several benefits: event timer
> precision is now irrelevant for system timekeeping, hard- and statclocks
> are not aliased, while only one timer used for it, and the most
> important -- it allows us to define which events and when exactly we
> really want to handle, without strict dependence on fixed hz, stathz,
> profhz periods. Sure, our callout system is highly depends on hz value,
> but now at least we can skip interrupts when we have no callouts to
> handle at the time. Later we can go further.
>
> Periodic mode now also uses alike principals of scheduling events. But
> timer running in periodic mode just unable to handle arbitrary events
> and as soon as event timers may not be synchronized to system
> timecounter and may drift from it, causing jitter effects. So I've used
> for time source of scheduling the timer events themselves.
> As result, periodic timer runs on fixed frequency multiply to hz rate,
> while statclock and profclock generated by dividing it respectively. (If
> somebody would tell me that hardclock jitter is not really a big
> problem, I would happily rip that artificial timekeeping out of there to
> simplify code.) Unluckily this approach makes impossible to use two
> events timers to completely separate hard- and statclocks any more, but
> as I have said, this mode is required only for limited set of systems
> without one-shot capable timers. Looking on my recent experience with
> different platforms, it is not a big fraction.
>
> Management code is still handles both per-CPU and global timers. Per-CPU
> timers usage is obvious. Global timer is programmed to handle all CPUs
> needs. In periodic mode global timer generates periodic interrupts to
> some one CPU, while management code then redistributes them to CPUs that
> really need it, using IPI. In one-shot mode timer is always programmed
> to handle first scheduled event throughout the system. When that
> interrupt arrives, it is also getting redistributed to wanting CPUs with
> IPI.
>
> To demonstrate features that could be obtained from so high flexibility
> I have incorporated the idea and some parts of dynamic ticks patches of
> Tsuyoshi Ozawa. Now, when some CPU goes down into C2/C3 ACPI sleep
> state, that CPU stops scheduling of hard-/stat-/profclock events until
> the next registered callout event. If CPU wakes up before that time by
> some unrelated interrupt, missed ticks are called artificially (it is
> needed now to keep realistic system stats). After system is up to date,
> interrupt is handled. Now it is implemented only for ACPI systems with
> C2/C3 states support, because ACPI resumes CPU with interrupts disabled,
> that allows to keep up missed time before interrupt handler or some
> other process (in case of unexpected task switch) may need it.
> As I can see, Linux does alike things in the beginning of every
> interrupt handler.
>
> I have actively tested this code for a few days on my amd64 Core2Duo
> laptop and i386 Core-i5 desktop system. With C2/C3 states enabled
> systems experience only about 100-150 interrupts per second, having HZ
> set to 1000. These events mostly caused by several event-greedy
> processes in our tree. I have traced and hacked several most aggressive
> ones in this patch: http://people.freebsd.org/~mav/tm6292_idle.patch .
> It allowed me to reduce down to as low as 50 interrupts per system,
> including IPIs! Here is the output of `systat -vm 1` from my test
> system: http://people.freebsd.org/~mav/systat_w_oneshot.txt . Obviously
> that with additional tuning the results can be improved even more.
>
> My latest patch against 9-CURRENT can be found here:
> http://people.freebsd.org/~mav/timers_oneshot4.patch
>
> Comments, ideas, propositions -- welcome!
>
> Thanks to all who read this. ;)

Totally awesome work mav@!

One thing I see:

Where is *frame pointing to? It isn't initialized in the function, so...

+static int
+handleevents(struct bintime *now, int fake)
 {
+	struct trapframe *frame;
+	struct pcpu_state *state;
+	uintfptr_t pc;
+	int usermode;
+	int done;
-	if (doconfigtimer(0))
-		return (FILTER_HANDLED);
-	return (hardclockhandler(frame));
+	done = 0;
+#ifdef KDTRACE_HOOKS
+	/*
+	 * If the DTrace hooks are configured and a callback function
+	 * has been registered, then call it to process the high speed
+	 * timers.
+	 */
+	if (cyclic_clock_func[curcpu] != NULL)
+		(*cyclic_clock_func[curcpu])(frame);
+#endif

Also, for those of us testing, should we "reset" our timer settings
back to defaults and work from there[1] (meaning, should we be futzing
around with timer event sources, kern.hz, etc...)?

Thanks again for tackling these tough, but important issues. I'm
looking very forward to testing this out!

-Brandon

[1] http://wiki.freebsd.org/TuningPowerConsumption
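
For anyone following the thread, the one-shot scheduling described in the quoted message (compare the current uptime with the next scheduled hard-/stat-/profclock times, then program the timer for the earliest one) might look roughly like the sketch below. This is only an illustration and not code from the patch: `struct clock_sched`, `next_event_ns()`, `oneshot_delay_ns()` and the `min_period_ns` clamp are hypothetical names, and plain nanosecond counters stand in for the `struct bintime` values the real code uses.

```c
#include <stdint.h>

/*
 * Hypothetical per-CPU schedule: absolute uptime, in nanoseconds, at
 * which each clock event is next due. The patch itself works with
 * struct bintime; integers are an assumption made for brevity.
 */
struct clock_sched {
	uint64_t next_hard;	/* next hardclock event */
	uint64_t next_stat;	/* next statclock event */
	uint64_t next_prof;	/* next profclock event */
};

/* Return the earliest of the three scheduled event times. */
static uint64_t
next_event_ns(const struct clock_sched *s)
{
	uint64_t next = s->next_hard;

	if (s->next_stat < next)
		next = s->next_stat;
	if (s->next_prof < next)
		next = s->next_prof;
	return (next);
}

/*
 * Return the delay to program into a one-shot timer. The result is
 * clamped to min_period_ns (an assumed hardware minimum), so an event
 * that is already due, or due very soon, still gets an interrupt.
 */
static uint64_t
oneshot_delay_ns(const struct clock_sched *s, uint64_t now,
    uint64_t min_period_ns)
{
	uint64_t next = next_event_ns(s);

	if (next <= now || next - now < min_period_ns)
		return (min_period_ns);
	return (next - now);
}
```

With events due at 1000, 700 and 900 ns, a call at `now = 200` would ask for a 500 ns delay. Only the earliest event drives the timer, which is why a single one-shot timer can serve all three clocks.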