From owner-freebsd-i386@FreeBSD.ORG Sun Nov 14 09:50:13 2004 Return-Path: Delivered-To: freebsd-i386@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2949816A4CE; Sun, 14 Nov 2004 09:50:13 +0000 (GMT) Received: from mailout2.pacific.net.au (mailout2.pacific.net.au [61.8.0.85]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4630243D55; Sun, 14 Nov 2004 09:50:12 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailproxy2.pacific.net.au (mailproxy2.pacific.net.au [61.8.0.87])iAE9oAKP029176; Sun, 14 Nov 2004 20:50:10 +1100 Received: from epsplex.bde.org (katana.zip.com.au [61.8.7.246]) iAE9ns7c020580; Sun, 14 Nov 2004 20:50:05 +1100 Date: Sun, 14 Nov 2004 20:49:55 +1100 (EST) From: Bruce Evans X-X-Sender: bde@epsplex.bde.org To: Ted Schundler In-Reply-To: <200411140301.iAE31n1e003404@www.freebsd.org> Message-ID: <20041114173708.F1082@epsplex.bde.org> References: <200411140301.iAE31n1e003404@www.freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: freebsd-gnats-submit@freebsd.org cc: freebsd-i386@freebsd.org Subject: Re: i386/73921: sysctlbyname for machdep.tsc_freq doesn't handle speeds > 2^31 Hz properly X-BeenThere: freebsd-i386@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: I386-specific issues for FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Nov 2004 09:50:13 -0000 On Sun, 14 Nov 2004, Ted Schundler wrote: > >Description: > In an app that needs the CPU speed, I'm calling sysctlbyname("machdep.tsc_freq",.... with an 64-bit long long (so when 5GHz CPUs come around, it should still work). However I'm getting very large numbers are a result. It seems that internally it is getting the value as a 32 bit integer and converting that to 64 bits by extending the sign. It shouldn't be 32 bits to begin with imho, but even if it is, it should at least be treated as unsigned. Please limit line lengths to considerably less than 446 columns. > >How-To-Repeat: > On a sufficiently fast system (in my case 3.1Ghz): > > unsigned long long sysctl_data; > bl=8; > sysctlbyname("machdep.tsc_freq", &sysctl_data,&bl,NULL,0); > printf("cpuspeed: %llu\n",sysctl_data); > > result: > cpuspeed: 13817017134246676192 > > should be: > spuspeed: 3192014560 tsc_freq was converted to 64 bits in the kernel to support frequencies >= 2^32, but the associated changes for the sysctl are mostly wrong. The sysctl only properly handles the lower 32 bits of tsc_freq. In particular, it returns garbage in the top bits. Your output shows the garbage, but this is partly a bug in test program (you should check final value in `bl'; other bugs make it 4 instead of the expected 8, so you shouldn't use the upper 4 bytes. This is how sysctl(1) avoids seeing the bug). I use the following incomplete fixes and tests: %%% Index: tsc.c =================================================================== RCS file: /home/ncvs/src/sys/i386/i386/tsc.c,v retrieving revision 1.204 diff -u -2 -r1.204 tsc.c --- tsc.c 21 Oct 2003 18:28:34 -0000 1.204 +++ tsc.c 14 Nov 2004 06:20:12 -0000 @@ -131,10 +131,12 @@ sysctl_machdep_tsc_freq(SYSCTL_HANDLER_ARGS) { + u_int freq; int error; - uint64_t freq; if (tsc_timecounter.tc_frequency == 0) return (EOPNOTSUPP); freq = tsc_freq; + if (freq != tsc_freq) + return (EOVERFLOW); error = sysctl_handle_int(oidp, &freq, sizeof(freq), req); if (error == 0 && req->newptr != NULL) { @@ -145,5 +147,5 @@ } -SYSCTL_PROC(_machdep, OID_AUTO, tsc_freq, CTLTYPE_QUAD | CTLFLAG_RW, +SYSCTL_PROC(_machdep, OID_AUTO, tsc_freq, CTLTYPE_UINT | CTLFLAG_RW, 0, sizeof(u_int), sysctl_machdep_tsc_freq, "IU", ""); @@ -153,2 +155,38 @@ return (rdtsc()); } + +uint64_t tsc_freq1 = 0x0000000100000002; + +static int +sysctl_machdep_tsc_freq1(SYSCTL_HANDLER_ARGS) +{ + uint64_t freq; + int error; + + freq = tsc_freq1; + error = sysctl_handle_opaque(oidp, &freq, sizeof(freq), req); + if (error == 0 && req->newptr != NULL) + tsc_freq1 = freq; + return (error); +} + +SYSCTL_PROC(_machdep, OID_AUTO, tsc_freq1, CTLTYPE_QUAD | CTLFLAG_RW, + 0, sizeof(uint64_t), sysctl_machdep_tsc_freq1, "QU", ""); + +uint64_t tsc_freq2 = 0x0000000100000002; + +static int +sysctl_machdep_tsc_freq2(SYSCTL_HANDLER_ARGS) +{ + uint64_t freq; + int error; + + freq = tsc_freq2; + error = sysctl_handle_opaque(oidp, &freq, sizeof(freq), req); + if (error == 0 && req->newptr != NULL) + tsc_freq2 = freq; + return (error); +} + +SYSCTL_PROC(_machdep, OID_AUTO, tsc_freq2, CTLTYPE_QUAD | CTLFLAG_RW, + 0, sizeof(uint64_t), sysctl_machdep_tsc_freq2, "IU", ""); %%% Details: - in sysctl_machdep_tsc_freq() and its SYSCTL_PROC(), everything is reduced to u_int's so that other bugs aren't activated. Support for frequencies >= 2^32 remains mostly broken. It works in the kernel, but such frequencies cannot be set or read from userland. - sysctl_machdep_tsc_freq1() is progression test. It shows how to give correct support for 63-bit (sic) tsc_freq's. It depends on unbreaking support for printing quad_t's in sysctl(1) using the "Q" format specifier. - sysctl_machdep_tsc_freq2() shows the best that can be done without fixing sysctl(1). sysctl(1) in -current does useless things for the "Q" format specifier; the best that it can do is print a 64-bit quad_t as 2 32-bit ints. Sample output: %%% machdep.tsc_freq: 2222929820 machdep.tsc_freq1: 4294967298 # requires support for "Q" format specifier machdep.tsc_freq2: 2 1 %%% There are many bugs in error handling that inhibit detection of of the main bug: - `void *'s in the SYSCTL_* macros defeat compile-time type checking - missing runtime checking in sysctl_handle_int() and friends. sysctl_handle_int() silently ignores its length arg. It blindly handles sizeof(int) bytes instead of the requested length. - breakage of documented error handling. From sysctl(3): % [EINVAL] A non-null newp is given and its specified length in % newlen is too large or too small. The size parameter in the SYSCTL_PROC() for machdep.tsc_freq has apparently been hacked to make the number of bytes returned by the sysctl 4 instead of 8. This masks the bug for printing the value in sysctl(1) provided the value fits in 4 bytes. However, it makes input inconsistent with output. For input, sysctl(1) uses the CTLBYTE parameter and ignores the size parameter. It tries to write 8 bytes to the kernel, but sysctl_handle_int() only writes 4 bytes. This error is silently ignored. Note that unlike for write(2), short writes cannot be reported to userland except as an error, since the sysctl(3) interface only returns a count for reads; thus short writes for sysctl(3) must cause an error as documented. I have only fixed the error handling for short writes: %%% Index: kern_sysctl.c =================================================================== RCS file: /home/ncvs/src/sys/kern/kern_sysctl.c,v retrieving revision 1.157 diff -u -2 -r1.157 kern_sysctl.c --- kern_sysctl.c 11 Jun 2004 02:20:37 -0000 1.157 +++ kern_sysctl.c 11 Jun 2004 07:58:23 -0000 @@ -957,6 +949,13 @@ if (!req->newptr) return 0; - if (req->newlen - req->newidx < l) + if (req->newlen - req->newidx != l) { + if (req->newlen - req->newidx > l) { + printf( + "sysctl_new_kernel -- short write: %zu - %zu > %zu\n", + req->newlen, req->newidx, l); + Debugger("sysctl_new_kernel: short write"); + } return (EINVAL); + } bcopy((char *)req->newptr + req->newidx, p, l); req->newidx += l; @@ -1075,6 +1073,13 @@ if (!req->newptr) return 0; - if (req->newlen - req->newidx < l) + if (req->newlen - req->newidx != l) { + if (req->newlen - req->newidx > l) { + printf( + "sysctl_new_user -- short write: %zu - %zu > %zu\n", + req->newlen, req->newidx, l); + Debugger("sysctl_new_user: short write"); + } return (EINVAL); + } error = copyin((char *)req->newptr + req->newidx, p, l); req->newidx += l; %%% Bruce