From owner-freebsd-current Fri Dec 6 04:34:00 1996 Return-Path: Received: (from root@localhost) by freefall.freebsd.org (8.8.4/8.8.4) id EAA01307 for current-outgoing; Fri, 6 Dec 1996 04:34:00 -0800 (PST) Received: from godzilla.zeta.org.au (godzilla.zeta.org.au [203.2.228.19]) by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id EAA01279 for ; Fri, 6 Dec 1996 04:33:52 -0800 (PST) Received: (from bde@localhost) by godzilla.zeta.org.au (8.8.3/8.6.9) id XAA24120; Fri, 6 Dec 1996 23:28:29 +1100 Date: Fri, 6 Dec 1996 23:28:29 +1100 From: Bruce Evans Message-Id: <199612061228.XAA24120@godzilla.zeta.org.au> To: current@freebsd.org, ohashi@mickey.ai.kyutech.ac.jp Subject: Re: DELAY() in clock.c Sender: owner-current@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > I am a new subscriber of currnet, and I reported a bug of clock.c. > >>Category: i386 >>Responsible: freebsd-bugs >>Synopsis: DELAY() won't work for fast CPUs >>Arrival-Date: Mon Nov 4 21:00:01 PST 1996 > > This problem would make hanging up the system, especially that is >a Pentium or Pentum Pro box. For example, my Pentium Pro 200 box hung >up by keybord and NIC. I reported a quick bug fix patch with it. > > We discussed the patch in freebsd-users-jp@jp.freebsd.org. >Many users said that it was a serious problem and the patch fixed it. > > We made a new patch and tested it. We want to commit it as soon as >possible. Please check the following patch for /usr/src/sys/i386/isa/clock.c. I like this version better. There are still 2 problems: 1. If the user boots with the turbo button off (possibly to fix problems caused by DELAY() not being long enough) then the calibration will be wrong. Fix: recalibrate every now and then, perhaps when DELAY() is called with a large arg. This won't fixed problems caused by devices hanging because delays are too short, but can't hurt. 2. The calibration method has some problems. >+ /* some machine has too small max_count */ >+ delay = timer0_max_count / 10 * 1000000 / timer_freq; >+ do { >+ start_tick = getit(); >+ DELAY(delay); >+ end_tick = getit(); >+ } while (start_tick <= end_tick); >+ /* avoid overflow */ >+ delay_offset = (start_tick - end_tick) * 1000000.0 / timer_freq - delay; (a) floating point must not be used in the kernel. It's not initialized at this point, and may not exist in hardware. Use 1000000LL instead of 1000000.0 or use reduce 1000000 and reduce `delay' a little to ensure that (start_tick - end_tick) < 4295 (it can be up to 5500 now if `hz' is nonstandard). (b) you need to subtract the time for one getit() call. Underestimating delay_offset doesn't hurt much, but overestimating it gives the same problems as now, except not as large (unless it is overestimated by 20 :-) (c) the calibration should be done and the results of the first few iterations should be discarded. (I should do this for calibrate_clocks() too :-). The few iterations are to load everything into a cache. More than one iteration is required to load the Pentium Branch Target Buffer in some cases. It wouldn't hurt to do a few more iterations and pick the lowest value. Summary: the calibration method should be something like this: u_int delta_ticks; u_long ef; int i, min_ticks; ... min_ticks = INT_MAX; for (i = 0; i < 100; ++i) { ef = read_eflags(); disable_intr(); start_tick = getit(); DELAY(10); end_tick = getit(); write_eflags(ef); delta_ticks = start_tick - end_tick; if (min_ticks > delta_ticks) min_ticks = delta_ticks; } if (min_ticks == INT_MAX) oops(); Please test this. Notes: (i) Interrupts are disabled so that this can be used to recalibrate after boot time. (ii) The delay is short so that interrupts aren't disabled for long. It just has to be long enough for DELAY() to do the usual things. (iii) Taking the minimum also fixes the problem that DELAY() often won't do the usual things. A few nsec for a cache miss may cause it to do an extra getit() call. (iv) If start_tick < end_tick, then delta_ticks is large and will be discarded. (v) delta_ticks will probably be larger before the cache is warmed up so the first few iterations probably won't do anything except warm up the cache. (vi) Perhaps more iterations are necessary for recalibration. You can look at the interrupt count to see if the loop has been interrupted and discard the results if it was. Bruce