Date: Fri, 9 Feb 2007 05:10:25 GMT
Message-Id: <200702090510.l195AP24044855@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.org
From: Bruce Evans
Subject: Re: kern/108954: 'sleep(1)' sleeps >1 seconds when speedstep (Cx) is in economy mode

The following reply was made to PR kern/108954; it has been noted by GNATS.

From: Bruce Evans
To: Brad Huntting
Cc: FreeBSD-gnats-submit@freebsd.org, freebsd-bugs@freebsd.org
Subject: Re: kern/108954: 'sleep(1)' sleeps >1 seconds when speedstep (Cx) is in economy mode
Date: Fri, 9 Feb 2007 16:08:49 +1100 (EST)

On Thu, 8 Feb 2007, Brad Huntting wrote:

>> Description:
> On some machines (those supporting Intel speedstep),
> nanosleep(2) (and presumably select(2)) are confused by cpu
> frequency changes and wind up over sleeping.

Do they work without the lapic timer?
(Not configuring "device apic" is the only easy way to avoid using the
lapic timer. I forget if acpi can work without apic.) On some systems,
the lapic timer doesn't work at all because the CPU enters a deep sleep
on the hlt instruction in the idle process, and one workaround is to run
other timers at a higher frequency than the lapic timer frequency to
kick the CPU out of its deep sleep and thus keep the lapic timer
interrupting.

>> How-To-Repeat:
>
> /bin/sh -c 't0=`date +%s`; sleep 1; t1=`date +%s`; expr $t1 - $t0'
>
> On a normal machine this should almost always spit out '1'.
>
> On a Centrino or Pentium-M based laptop (such as the Panasonic
> CF-W4), with hw.acpi.cpu.cx_lowest set to something other
> than C1, this produces '4' or '5'.
>
> Note: If you can reproduce this, _please_ post a follow
> up so I know I'm not insane.
>
> The problem seems to be that when 'sysctl hw.acpi.cpu.cx_lowest'
> is set to anything other than 'full speed' (aka 'C1') the
> cpu frequency is generally (and unpredictably) slower than
> C1 speed. tvtohz(9) (located in /sys/kern/kern_clock.c)
> assumes a static frequency and so returns several times the
> correct number of tics.

The frequency used by tvtohz() is required to be fixed. Since it is used
mainly for timeouts, the frequency isn't required to be very accurate,
but it should be accurate to within a few percent and not wrong by a
factor of 5.

> $ sysctl hw.acpi.cpu dev.cpu.0.freq_levels kern.timecounter.choice kern.timecounter.hardware
> hw.acpi.cpu.cx_supported: C1/1 C2/1 C3/85
> hw.acpi.cpu.cx_lowest: C3
> hw.acpi.cpu.cx_usage: 0.00% 13.11% 86.88%
> dev.cpu.0.freq_levels: 1200/-1 1100/-1 1000/-1 900/-1 800/-1 700/-1 600/-1 525/-1 450/-1 375/-1 300/-1 225/-1 150/-1 75/-1
> kern.timecounter.choice: TSC(800) ACPI-fast(1000) i8254(0) dummy(-1000000)
> kern.timecounter.hardware: ACPI-fast

The timecounter is not really involved here. It is only used to check
the time (not quite correctly) after the timeout.
That check would avoid the problem if the timeout is too short, but not
if it is too long.

>> Fix:
>
> The ideal solution would be to use a clock who's frequency
> is not jerked around by speedstep. Perhaps this is just a
> hardware bug, but seem to recall seeing this behavior on
> my previous Intel Centrino based laptop as well.

The i8254 timer (not timecounter) is supposed to have this property.
Maybe the lapic timer doesn't.

> Fixing nanosleep(2) (and select(2)) alone would be relatively
> easy: Since they loop, returning to the user only when the
> correct wakeup time has arrived (microtime(9) is apparently
> not affected by this problem), one could just have tvtohz(9)
> return the number of ticks based on the _lowest_ cpu frequency
> rather than the _highest_. Unfortunately, this makes other
> users of tvtohz(9) wake up early, and they may not be as
> prepared to handle this.

Yes, that should be OK as a workaround. One of the things that
nanosleep() etc. don't do quite right is related: for very long sleeps,
the calculated timeout may be more than 1 tick too long due to clock
drift or just the limited resolution of the scale factor used in
tvtohz(). That should be handled by using the _lowest_ possible scale
factor rather than the nominal one. This could also be used to ensure
that the final timeout is minimal (tvtohz() rounds up and then adds 1 to
ensure that the timeout is long enough, so an average timeout is 1.5
ticks longer than strictly necessary; by not adding 1 but checking
whether the timeout has expired on waking up, it is possible to make an
average timeout only 0.5 ticks longer than necessary). There should be a
new interface for callers that are prepared to handle this (or they can
subtract 1 and rescale). Waking up early also wastes time, so it
shouldn't usually be done.

Bruce
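
The rounding arithmetic discussed in the reply can be illustrated with a
small model. This is only a sketch and not the kernel's tvtohz(): the
names model_tvtohz(), model_tvtohz_nopad() and model_sleep_seconds() are
hypothetical, the interval is passed in microseconds instead of a struct
timeval, and overflow handling is left out. It shows the "round up, then
add 1" rule, the variant without the extra tick for callers that re-check
the time on waking, and how long a tick count lasts if the tick source
really fires at some other rate than the nominal hz.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified model (NOT the kernel source): round the interval up to
 * whole ticks, then add 1 so the timeout can never expire early.
 * 'hz' is the nominal tick rate the conversion assumes to be fixed.
 */
static int64_t
model_tvtohz(int64_t usec, int hz)
{
	int64_t tick_usec;

	assert(usec >= 0 && hz > 0 && hz <= 1000000);
	tick_usec = 1000000 / hz;	/* nominal microseconds per tick */
	return ((usec + tick_usec - 1) / tick_usec + 1);
}

/*
 * Hypothetical variant for callers that check the time again after
 * waking up: round up only, without the extra safety tick.
 */
static int64_t
model_tvtohz_nopad(int64_t usec, int hz)
{
	int64_t tick_usec;

	assert(usec >= 0 && hz > 0 && hz <= 1000000);
	tick_usec = 1000000 / hz;
	return ((usec + tick_usec - 1) / tick_usec);
}

/*
 * Wall-clock duration of a timeout of 'ticks' ticks when the tick
 * source actually fires at 'actual_hz' instead of the nominal rate.
 */
static double
model_sleep_seconds(int64_t ticks, double actual_hz)
{
	assert(actual_hz > 0.0);
	return ((double)ticks / actual_hz);
}
```

Assuming a nominal hz of 1000 (an assumption, the report does not state
it), model_tvtohz(1000000, 1000) gives 1001 ticks for a one-second sleep.
If those ticks were delivered at only 200 per second, one fifth of
nominal, model_sleep_seconds(1001, 200.0) comes to about 5 seconds, which
is the kind of "factor of 5" error mentioned above.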