From: Poul-Henning Kamp
To: Mike Smith
cc: current@FreeBSD.ORG, bde@FreeBSD.ORG
Subject: Re: Death by SIGXCPU (problems with our clock code)
In-reply-to: Your message of "Wed, 16 Sep 1998 17:55:15 PDT." <199809170055.RAA01288@dingo.cdrom.com>
Date: Thu, 17 Sep 1998 09:37:22 +0200
Message-ID: <8620.906017842@critter.freebsd.dk>

Mike,

>Since nobody else has taken up my suggestion to instrument the code to
>find out what's going on, I've shouldered the cross.

I have my version here instrumented far more than what you've done, and I have two and a half extra kinds of timecounting hardware running here in my lab, but I have not been able to catch it in flagrante delicto yet, leading me to conclude that some hardware is involved.

The checks in microtime and nanotime are, strictly speaking, not valid. The reason is that both microtime and nanotime are reentrant now, so you might take an interrupt in the middle of them.

The reentrancy could possibly be a problem if some spl*() calls are missing somewhere else, or if the logic in the code is flawed. You can test that hypothesis by putting splalot() around the [get]{micro|nano}[run]time() functions.

I am puzzled about the negative fractions, and I think they are the most important clue.
tco_forward() does not do any sanity checks on the timecounter, so if there is some trouble with the hardware (or our method of accessing it), that would shine straight through.

Can you please add a check to the i8254/tsc get_timecount functions (in isa/clock.c) which reports if the count goes backwards or is bigger than (1/HZ + epsilon) seconds?

What machine is this on? What is your timecounter, TSC or i8254? BIOS settings?

Bruce: You mentioned that some i8254 cloneoids didn't implement the latch correctly. Any references to that?

Poul-Henning

--
Poul-Henning Kamp             FreeBSD coreteam member
phk@FreeBSD.ORG               "Real hackers run -current on their laptop."
"ttyv0" -- What UNIX calls a $20K state-of-the-art, 3D, hi-res color terminal
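
A minimal sketch of the kind of check requested above, written as a user-space illustration and not as the actual isa/clock.c code. Every name in it (struct tc_check, tc_check_init, tc_check_sample) is hypothetical, and it assumes a free-running counter that wraps at a power-of-two width, so that a step of more than half the counter range can be read as "went backwards". The limit for a forward step is taken as the counter frequency divided by HZ, plus an epsilon in ticks.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper; none of these names come from isa/clock.c. */
struct tc_check {
    uint32_t mask;       /* counter width mask, e.g. 0xffff (assumed) */
    uint32_t max_delta;  /* largest plausible step: freq / hz + epsilon */
    uint32_t last;       /* previous reading */
    int      primed;     /* 0 until the first reading has been stored */
    unsigned backwards;  /* readings that appeared to go backwards */
    unsigned too_big;    /* forward steps larger than max_delta */
};

static void
tc_check_init(struct tc_check *c, uint32_t mask, uint32_t freq,
    uint32_t hz, uint32_t epsilon)
{
    assert(hz != 0);
    c->mask = mask;
    c->max_delta = freq / hz + epsilon;
    c->last = 0;
    c->primed = 0;
    c->backwards = 0;
    c->too_big = 0;
}

/*
 * Feed one raw counter reading through the check.
 * Returns 0 if the step is plausible, -1 if the count appears to have
 * gone backwards, and 1 if it moved forward by more than max_delta.
 */
static int
tc_check_sample(struct tc_check *c, uint32_t count)
{
    uint32_t delta;
    int verdict = 0;

    count &= c->mask;
    if (c->primed) {
        /* Modular difference, so an ordinary wraparound is harmless. */
        delta = (count - c->last) & c->mask;
        if (delta > c->mask / 2) {
            /* Assumption: more than half the range means backwards. */
            c->backwards++;
            verdict = -1;
            fprintf(stderr, "timecounter went backwards: %lu -> %lu\n",
                (unsigned long)c->last, (unsigned long)count);
        } else if (delta > c->max_delta) {
            c->too_big++;
            verdict = 1;
            fprintf(stderr, "timecounter step too large: %lu ticks\n",
                (unsigned long)delta);
        }
    }
    c->primed = 1;
    c->last = count;
    return verdict;
}
```

With a 16-bit mask, a frequency of 1193182 Hz and HZ of 100 as example values, the limit comes out at 11931 ticks plus epsilon. A caller would pass each raw reading to tc_check_sample() and look at the return value or the two counters.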