From: Bruce Evans
Date: Fri, 9 Aug 2013 09:17:13 +1000 (EST)
To: Mark R V Murray
Cc: Arthur Mesh, Steve Kargl, secteam@FreeBSD.org, freebsd-arch@FreeBSD.org
Subject: Re: random(4) plugin infrastructure for mulitple RNG in a modular fashion

On Thu, 8 Aug 2013, Mark R V Murray wrote:

> I still want to get back something like the original get_cyclecount();
> simple and quick. I don't care what its called, but out doesn't need to
> be the massive thing that the current get_cyclecount() has grown to be
> on x86. rdtsc(), I think it was.

The simple and quick version cannot exist, and never did.  The original
i386 version was:

1.50 (markm 21-Nov-00): /*
1.50 (markm 21-Nov-00):  * Return contents of in-cpu fast counter as a sort of "bogo-time"
1.50 (markm 21-Nov-00):  * for non-critical timing.
1.50 (markm 21-Nov-00):  */
1.50 (markm 21-Nov-00): static __inline u_int64_t
1.50 (markm 21-Nov-00): get_cyclecount(void)
1.50 (markm 21-Nov-00): {
1.50 (markm 21-Nov-00): #if defined(I386_CPU) || defined(I486_CPU)
1.50 (markm 21-Nov-00):         struct timespec tv;
1.50 (markm 21-Nov-00):
1.50 (markm 21-Nov-00):         if ((cpu_feature & CPUID_TSC) == 0) {
1.50 (markm 21-Nov-00):                 nanotime(&tv);
1.50 (markm 21-Nov-00):                 return (tv.tv_sec * (u_int64_t)1000000000 + tv.tv_nsec);
1.50 (markm 21-Nov-00):         }
1.50 (markm 21-Nov-00): #endif
1.50 (markm 21-Nov-00):         return (rdtsc());
1.50 (markm 21-Nov-00): }

This is not so simple, and is unquick if there is no TSC.  If I386_CPU
or I486_CPU is configured, then it is suboptimal even if there is a TSC,
since the cpu_feature test is then done on every call.  Other arches are
even further from always having a TSC.

The simple and quick version would always return 0 or a kernel global
like time.tv_nsec if there is no TSC and no other frequently changing
timer or noise source that can be read almost as fast as memory.  It
wouldn't guarantee any entropy.

The current version is only slightly unsimpler and unquicker:
- on amd64, it is still just inline rdtsc()

On other versions, the nanotime() in it was first improved to binuptime().
This also gave more noise in the extra low bits, and mixing of the bits
made it less abusable as a timer.  The latter has been broken on some
arches.
- on arm, the bits are still mixed by ((sec << 56) | (frac >> 8)) (8 bits
  of sec and 56 bits of frac).  I don't like losing some low bits (it is
  better to xor things), but the result is fairly unusable as a timer and
  perhaps there is nothing useful in the low bits on arm (it takes a very
  high frequency clock like a TSC and/or delicate ntpd adjustments that
  aren't very noisy to put anything there).  The 8-bit seconds count
  isn't too good when KTR abuses get_cyclecount().
- on i386, get_cyclecount() is still inline, but the inline just calls
  the function pointer cpu_ticks().  If there is a TSC, then cpu_ticks
  points to an un-inline rdtsc() and the result is a slightly pessimized
  version of the above if I386_CPU or I486_CPU is configured and a more
  pessimized version of the above if neither is configured.  Otherwise,
  the result is the accumulated tick count of the currently active
  timecounter.  This is much better for noise in get_cyclecount() and
  much worse for its primary purpose of timing than is binuptime() with
  bits mixed to form a timer.  The active timecounter can change, and
  then the frequency and offset of its ticker change.  Its primary use
  is for process times, and there is some recalibration for this, but
  this is incomplete and buggy.  But for get_cyclecount(), the noise is
  a feature.  The noise from this is bad when KTR abuses
  get_cyclecount().  Otherwise, this is better for get_cyclecount() than
  the old binuptime() method.
- on ia64, get_cyclecount is #define'd as another function.  The
  declaration and definition of the other function are even more
  obscure.  They are generated by a macro.  Standard namespace pollution
  in sys/systm.h is depended on to join the definitions.
- mips is like ia64 except the obfuscation chain is shorter.  The mips
  header provides its own namespace pollution, so sys/systm.h and its
  pollution aren't depended on...
- on powerpc, get_cyclecount() reads a counter using inline asm.  It
  spells the 2 32-bit components of the counter as essentially
  time._upper and time._lower, so it isn't clear if they are actually
  times to begin with.
- sparc64 uses inline asm to read some register which is hopefully a
  counter.

So, get_cyclecount() is actually simple and quick (except for macros
hiding the simplicity) on all arches except arm and old i386.  But it is
very MD (machine-dependent), so it takes a lot of code, each piece simple
in a different way, to support it for all arches.  Still better than
#ifdefing it wherever it is used.

Bruce
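
For reference, the arm-style bit mixing discussed above, and the xor variant suggested as preferable, can be sketched in userland C roughly as follows.  This is only an illustration under stated assumptions: struct bt_sketch is a hypothetical stand-in for the kernel's struct bintime (seconds plus a 64-bit binary fraction), and mix_shift(), mix_xor() and mix_selfcheck() are made-up names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct bintime: whole seconds plus a
 * 64-bit binary fraction of a second.
 */
struct bt_sketch {
	int64_t		sec;
	uint64_t	frac;
};

/*
 * Shift-and-or mixing as described for arm: the low 8 bits of sec go
 * in the top byte and the high 56 bits of frac fill the rest.  The low
 * 8 bits of frac are thrown away.
 */
uint64_t
mix_shift(const struct bt_sketch *bt)
{
	return (((uint64_t)bt->sec << 56) | (bt->frac >> 8));
}

/*
 * xor variant (an assumption about what "better to xor things" could
 * look like): every bit of frac still influences the result.
 */
uint64_t
mix_xor(const struct bt_sketch *bt)
{
	return (((uint64_t)bt->sec << 56) ^ bt->frac);
}

/* Small self-check of the properties claimed in the comments above. */
void
mix_selfcheck(void)
{
	struct bt_sketch a = { 5, 0x0123456789abcd00ULL };
	struct bt_sketch b = { 5, 0x0123456789abcdffULL };
	struct bt_sketch c = { 5 + 256, 0x0123456789abcd00ULL };

	/* Differences confined to the low 8 bits of frac are lost... */
	assert(mix_shift(&a) == mix_shift(&b));
	/* ...but survive the xor. */
	assert(mix_xor(&a) != mix_xor(&b));
	/* Only 8 bits of sec are kept, so the value repeats every 256 s. */
	assert(mix_shift(&a) == mix_shift(&c));
}
```

Calling mix_selfcheck() from a small test program exercises both points made about arm: the discarded low bits of frac, and the seconds field wrapping after 256 seconds, which is what hurts when the value is abused as a timestamp.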