To: Bill Moran
From: "Poul-Henning Kamp"
cc: current@freebsd.org
Subject: Re: DragonflyBSD kernel clock improvements
In-Reply-To: Your message of "Sat, 24 Jan 2004 09:25:47 EST." <4012806B.7090102@potentialtech.com>
Date: Sat, 24 Jan 2004 19:01:10 +0100
Message-ID: <44307.1074967270@critter.freebsd.dk>

In message <4012806B.7090102@potentialtech.com>, Bill Moran writes:
>I saw this recently:
>http://www.dragonflybsd.org/Docs/nanosleep/
>
>I was wondering if anyone on the FreeBSD team has looked at this.  It doesn't
>appear as if any recent changes have been made in the FreeBSD tree regarding
>this.

I just looked at it.

>It's probably not a big deal, but I just thought I'd point it out.

Well, it is as big a deal as you make it :-)

We have never aspired to be an RTOS so far, so nothing has really
required us to support precise sleeps.
If we want to do that, it's certainly possible to do it, and even
possible to do it right as opposed to the mistakes presented on the
URL above, but as I'll explain in a moment, probably not worth
touching in the first place.

The measurements he presents on that page are irrelevant: His sleep
periods obviously must start at the beginning of a tick, otherwise
his data would show +/- 1 tick jitter [1].

In other words, the performance he shows only applies to a process
which just got scheduled because of a timer and then promptly goes
to sleep again, not a big market IMO.

If he did some actual I/O or CPU-churn, his process would start
the sleep at a random point between two Hz interrupts and his sleep
duration would consequently suffer from half of 1/Hz phase noise
(= +/- 1 tick jitter).

This is also why PLL'ing Hz misleads him to think he gets better
results: he removes the jittering factor (the frequency offset
between his Hz and his timecounter) which otherwise would have
averaged out and given him a tell-tale standard deviation alerting
him to his mistake, provided he had done the measurement-101
calculation of statistics on a repeated measurement in the first
place.

In real life PLL'ing Hz will make no measurable difference because
the 1/Hz jitter from the start time of the sleep period will be a
much larger fuzz factor.

The PLL will however make a difference in some extreme round-off
cases which do not happen until 1/Hz < frequency error (i.e. typically
not until HZ >> 10000).

The fact that he notices neither the absence of a +/- 1 tick jitter
in his data nor the source of the sawtooth does not indicate a deep
insight into the mechanisms he is trying to measure.

If you want to do it right, you need a programmable interrupting
timer so that you can program it to interrupt (a calibrated amount
of time before) the next time you need to schedule something.
This is called "deadline interrupting" or "deadline timers" and as
a method it becomes more efficient than a regular heartbeat (like Hz)
when you want to get very high resolution/precision on your timeouts.

Now, before you jump in and start coding, a lot of other factors
need to be looked at as well.

You will never get the precision/resolution of sleeps better than
the sum of:

	worst case interrupt latency +
	lock resolution +		(= "preemption delay")
	context switch +
	VM activation +			(if the target is paged out or
					 on pages which must be activated
					 by page-faults first)
	cache flushes +
	fudge factor for busmaster hogs slowing the CPU down.

If your hardware suffers from "misfeatures" like CPU throttling you
need to figure that in too.

The sum above is a pretty high number of microseconds for a normal
unix kernel and makes the deadline interrupting rather pointless
considering that our heartbeat method does 1 millisecond (HZ=2000,
because of Nyquist) without any real problems and 100 microsecond
(HZ=20000) with only a minor tradeoff in overall performance.

The fact that PC hardware in general lacks usable hardware for the
purpose is of course also a strong factor against.

Compare the above sum for UNIX with that of a dedicated $5
microcontroller and you will understand why UNIX as an RTOS is not
a viable concept: The microcontroller can guarantee that the first
instruction of your code executes within N +/- M clock cycles where
M is typically less than four and a clock cycle can be as little as
25nsec, yielding 100nsec precision.  If you spend more than $5 you
get even better numbers.

Far more productive for FreeBSD would be the implementation of a
calibrated "nanodelay()" for device drivers to use.  DELAY() has
more problems than anyone cares to list anymore.

Poul-Henning

[1] This is a basic fact of nature.  When you look at your bedside
alarm clock and it shows 05:59AM, you have no way of knowing how
many seconds remain before the beep starts at 06:00AM.
It can be any number from 1 to 60.  When you time your daily jog by
subtracting a "before" reading from an "after" reading, your
uncertainty is twice that amount = +/- one minute.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
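
The start-phase argument and footnote [1] above can be illustrated
with a small simulation.  This is only a sketch under a simplified
model, not FreeBSD's actual timeout code: it assumes a sleeper is
woken on the first Hz tick at or after its deadline, and it ignores
every latency term in the sum listed earlier.  The names
tick_sleep_ns() and simulate(), and the HZ=1000 / 2.5 msec request,
are made up for the illustration.

```python
import random
import statistics


def tick_sleep_ns(start_ns, request_ns, tick_ns):
    """Return the actual sleep length under the simplified tick model."""
    deadline_ns = start_ns + request_ns
    # Round the deadline up to the next tick boundary (integer ceiling).
    wake_ns = -(-deadline_ns // tick_ns) * tick_ns
    return wake_ns - start_ns


def simulate(hz=1000, request_ns=2_500_000, samples=10_000, seed=1):
    """Compare tick-aligned sleep starts with randomly phased ones."""
    rng = random.Random(seed)
    tick_ns = 1_000_000_000 // hz
    # Case 1: every sleep starts exactly on a tick (process just woken
    # by the timer and going straight back to sleep).
    aligned = [tick_sleep_ns(k * tick_ns, request_ns, tick_ns)
               for k in range(samples)]
    # Case 2: every sleep starts at a random point between two ticks
    # (process did some I/O or CPU work first).
    phased = [tick_sleep_ns(k * tick_ns + rng.randrange(tick_ns),
                            request_ns, tick_ns)
              for k in range(samples)]
    return aligned, phased


if __name__ == "__main__":
    aligned, phased = simulate()
    for name, data in (("aligned", aligned), ("random phase", phased)):
        print(f"{name:>12}: min={min(data)} max={max(data)} "
              f"stdev={statistics.pstdev(data):.0f} ns")
```

In this model the tick-aligned sleeps all come out identical (zero
standard deviation), while the randomly phased sleeps spread over
almost a full tick, which is the jitter a measurement that always
starts on a tick boundary cannot see.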