Date: Thu, 13 Nov 2008 16:06:24 -0500
From: John Baldwin <jhb@freebsd.org>
To: Alexander Motin <mav@freebsd.org>
Cc: Sam Leffler <sam@freebsd.org>, freebsd-mobile@freebsd.org
Subject: Re: RFC: powerd algorithms enhancements
Message-ID: <200811131606.24804.jhb@freebsd.org>
In-Reply-To: <491C9380.7050007@FreeBSD.org>
References: <200811060901400000@466321507> <200811131145.39747.jhb@freebsd.org> <491C9380.7050007@FreeBSD.org>

On Thursday 13 November 2008 03:52:16 pm Alexander Motin wrote:
> John Baldwin wrote:
> >> If your system completely freezes at 400MHz, then it spends about 20% of
> >> CPU time on this at 2GHz. Doesn't it?
> >
> > Nope. It is usually very idle at full speed. You are free to go buy your own
> > HP nc6220 if you want to see it for yourself. You can also grab the KTR
> > trace and modified schedgraph.py at www.freebsd.org/~jhb/gpe/.
>
> It's very strange to me that you have 100% load at 400MHz, but zero at
> full speed. It shouldn't be so!

I think systems are more complex than you give them credit for. Imagine what
CPU frequency changing does to SMI# handlers for example.

> Just an idea. I have noticed a problem, that my mobile Core2Duo does not
> drops TSC timer frequency on EST. It confuses kernel time counting and
> leads to incorrect proportional increasing of DELAY() times. I have
> fixed this problem to myself with "kern.timecounter.invariant_tsc=1".
> Can't it just be applicable to your CPU?

Very, very doubtful. This is a Pentium-M, and I know that the TSC slows down,
because until Nate's fixes to make DELAY() work correctly, the 5-second delay
on shutdown used to take a lot longer than 5 seconds when I was on battery
(after being on A/C).

> >>>> I think the only solutions for this case can be in allowing scheduler to
> >>>> really do it's job. Or by moving _everything_ out of interrupt threads
> >>>> to make them extremely fast and so to avoid the livelock problem, or in
> >>>> some other way allow scheduler to delay interrupt processing to allow
> >>>> other (for example user-level) threads to obtain at least some part of
> >>>> their CPU time slot according to their priorities.
> >
> > This is completely backwards. Userland is not more important than interrupt
> > handling in the kernel. The problem is that CPU frequency handling is more
> > important than relegating the entire task to userland. Instead of completely
> > breaking the entire userland/kernel model to get part of userland executed at
> > a kernel-level priority so CPU frequency handling is partially handled at a
> > kernel-level priority, why not just move the CPU frequency bits that need to
> > be kernel-level into the kernel? We already doing the thermal management for
> > passive cooling in the kernel rather than in userland.
>
> The fact of system livelocks means that interrupt processing works out
> of any priorities! Saying that moving all processing into interrupt
> handlers is a good way, you are saying that having _all_ our system out
> of any priorities is a good idea. That's actually the situation we are
> able to see now with heavy network load with polling disabled. System
> just dies and there is no other way to manage that except enabling polling!
>
> Heavy interrupt handlers is _evil_ from the scheduling point of view! It
> may be faster in some situations, but it makes system unmanageable!
> There are never will be enough power to fulfill all requirements, so we
> must take care about the case when there will be more interrupts then we
> are able to handle.

I'm not advocating moving the entire system into interrupt handlers. Did you
actually read what I wrote? My point is that if you have something in userland
that is as important as what gets done in interrupt handlers, the solution is
to not rip up the entire scheduler to make certain bits of userland have a
higher priority than interrupts. The solution is to move the one bit of
userland code that is needed into the kernel.

In this case I'm not suggesting moving all of powerd into an interrupt
handler. What I am suggesting is that the kernel needs a policy to consider
raising the frequency when it gets an interrupt after being in a deep sleep.
If the power savings from C2/3/whatever are greater than running throttled,
then it is much more ideal when you get an interrupt while idle that you run
at full speed to service the interrupt and then return to C2/C3 ASAP rather
than running the interrupt handler at a throttled speed and spending less time
in C2/C3.

-- 
John Baldwin
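
The closing trade-off (service the interrupt at full speed and get back to
C2/C3 quickly, versus service it throttled and sleep less) can be sketched
with a rough energy estimate. This is only an illustration: the function name,
the linear energy model, and every wattage, frequency, and cycle count below
are hypothetical and are not measurements from the nc6220 or any real CPU.

```python
def window_energy_joules(work_cycles, freq_hz, active_watts, sleep_watts, window_s):
    """Energy used over one window: run the handler, then sleep for the rest.

    Hypothetical model: power is constant while busy and constant while
    asleep, and transition costs between states are ignored.
    """
    busy_s = work_cycles / freq_hz
    if busy_s > window_s:
        raise ValueError("handler does not finish within the window")
    return active_watts * busy_s + sleep_watts * (window_s - busy_s)


# Hypothetical workload: 2 million cycles of interrupt work in a 10 ms window,
# with the deep sleep state drawing 1 W.
WORK_CYCLES = 2e6
WINDOW_S = 0.010
SLEEP_WATTS = 1.0

# Throttled: 400 MHz at an assumed 6 W while busy.
throttled = window_energy_joules(WORK_CYCLES, 400e6, 6.0, SLEEP_WATTS, WINDOW_S)

# Full speed: 2 GHz at an assumed 20 W while busy.
full_speed = window_energy_joules(WORK_CYCLES, 2e9, 20.0, SLEEP_WATTS, WINDOW_S)

# Same comparison if the throttled state were much cheaper (assumed 3 W).
cheap_throttled = window_energy_joules(WORK_CYCLES, 400e6, 3.0, SLEEP_WATTS, WINDOW_S)

print(f"throttled at 6 W:  {throttled:.3f} J")
print(f"full speed:        {full_speed:.3f} J")
print(f"throttled at 3 W:  {cheap_throttled:.3f} J")
```

With these made-up figures, full speed wins against the 6 W throttled state
but loses against the 3 W one, which is why the suggestion above is
conditional on the sleep-state savings being greater than the savings from
running throttled.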