Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 14 Nov 2004 20:49:55 +1100 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Ted Schundler <tschundler@scu.edu>
Cc:        freebsd-i386@freebsd.org
Subject:   Re: i386/73921: sysctlbyname for machdep.tsc_freq doesn't handle speeds > 2^31 Hz properly
Message-ID:  <20041114173708.F1082@epsplex.bde.org>
In-Reply-To: <200411140301.iAE31n1e003404@www.freebsd.org>
References:  <200411140301.iAE31n1e003404@www.freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, 14 Nov 2004, Ted Schundler wrote:

> >Description:
> In an app that needs the CPU speed, I'm calling sysctlbyname("machdep.tsc_freq",.... with an 64-bit long long (so when 5GHz CPUs come around, it should still work). However I'm getting very large numbers are a result. It seems that internally it is getting the value as a 32 bit integer and converting that to 64 bits by extending the sign. It shouldn't be 32 bits to begin with imho, but even if it is, it should at least be treated as unsigned.

Please limit line lengths to considerably less than 446 columns.

> >How-To-Repeat:
> On a sufficiently fast system (in my case 3.1Ghz):
>
>   unsigned long long sysctl_data;
>   bl=8;
>   sysctlbyname("machdep.tsc_freq", &sysctl_data,&bl,NULL,0);
>   printf("cpuspeed: %llu\n",sysctl_data);
>
> result:
> cpuspeed: 13817017134246676192
>
> should be:
> spuspeed: 3192014560

tsc_freq was converted to 64 bits in the kernel to support frequencies
>= 2^32, but the associated changes for the sysctl are mostly wrong.
The sysctl only properly handles the lower 32 bits of tsc_freq.  In
particular, it returns garbage in the top bits.  Your output shows the
garbage, but this is partly a bug in test program (you should check
final value in `bl'; other bugs make it 4 instead of the expected 8,
so you shouldn't use the upper 4 bytes.  This is how sysctl(1) avoids
seeing the bug).

I use the following incomplete fixes and tests:

%%%
Index: tsc.c
===================================================================
RCS file: /home/ncvs/src/sys/i386/i386/tsc.c,v
retrieving revision 1.204
diff -u -2 -r1.204 tsc.c
--- tsc.c	21 Oct 2003 18:28:34 -0000	1.204
+++ tsc.c	14 Nov 2004 06:20:12 -0000
@@ -131,10 +131,12 @@
 sysctl_machdep_tsc_freq(SYSCTL_HANDLER_ARGS)
 {
+	u_int freq;
 	int error;
-	uint64_t freq;

 	if (tsc_timecounter.tc_frequency == 0)
 		return (EOPNOTSUPP);
 	freq = tsc_freq;
+	if (freq != tsc_freq)
+		return (EOVERFLOW);
 	error = sysctl_handle_int(oidp, &freq, sizeof(freq), req);
 	if (error == 0 && req->newptr != NULL) {
@@ -145,5 +147,5 @@
 }

-SYSCTL_PROC(_machdep, OID_AUTO, tsc_freq, CTLTYPE_QUAD | CTLFLAG_RW,
+SYSCTL_PROC(_machdep, OID_AUTO, tsc_freq, CTLTYPE_UINT | CTLFLAG_RW,
     0, sizeof(u_int), sysctl_machdep_tsc_freq, "IU", "");

@@ -153,2 +155,38 @@
 	return (rdtsc());
 }
+
+uint64_t tsc_freq1 = 0x0000000100000002;
+
+static int
+sysctl_machdep_tsc_freq1(SYSCTL_HANDLER_ARGS)
+{
+	uint64_t freq;
+	int error;
+
+	freq = tsc_freq1;
+	error = sysctl_handle_opaque(oidp, &freq, sizeof(freq), req);
+	if (error == 0 && req->newptr != NULL)
+		tsc_freq1 = freq;
+	return (error);
+}
+
+SYSCTL_PROC(_machdep, OID_AUTO, tsc_freq1, CTLTYPE_QUAD | CTLFLAG_RW,
+    0, sizeof(uint64_t), sysctl_machdep_tsc_freq1, "QU", "");
+
+uint64_t tsc_freq2 = 0x0000000100000002;
+
+static int
+sysctl_machdep_tsc_freq2(SYSCTL_HANDLER_ARGS)
+{
+	uint64_t freq;
+	int error;
+
+	freq = tsc_freq2;
+	error = sysctl_handle_opaque(oidp, &freq, sizeof(freq), req);
+	if (error == 0 && req->newptr != NULL)
+		tsc_freq2 = freq;
+	return (error);
+}
+
+SYSCTL_PROC(_machdep, OID_AUTO, tsc_freq2, CTLTYPE_QUAD | CTLFLAG_RW,
+    0, sizeof(uint64_t), sysctl_machdep_tsc_freq2, "IU", "");
%%%

Details:
- in sysctl_machdep_tsc_freq() and its SYSCTL_PROC(), everything is
  reduced to u_int's so that other bugs aren't activated.  Support
  for frequencies >= 2^32 remains mostly broken.  It works in the kernel,
  but such frequencies cannot be set or read from userland.
- sysctl_machdep_tsc_freq1() is progression test.  It shows how to give
  correct support for 63-bit (sic) tsc_freq's.  It depends on unbreaking
  support for printing quad_t's in sysctl(1) using the "Q" format
  specifier.
- sysctl_machdep_tsc_freq2() shows the best that can be done without
  fixing sysctl(1).  sysctl(1) in -current does useless things for the
  "Q" format specifier; the best that it can do is print a 64-bit quad_t
  as 2 32-bit ints.

Sample output:

%%%
machdep.tsc_freq: 2222929820
machdep.tsc_freq1: 4294967298    # requires support for "Q" format specifier
machdep.tsc_freq2: 2 1
%%%

There are many bugs in error handling that inhibit detection of of the main
bug:
- `void *'s in the SYSCTL_* macros defeat compile-time type checking
- missing runtime checking in sysctl_handle_int() and friends.
  sysctl_handle_int() silently ignores its length arg.  It blindly
  handles sizeof(int) bytes instead of the requested length.
- breakage of documented error handling.  From sysctl(3):

%      [EINVAL]           A non-null newp is given and its specified length in
%                         newlen is too large or too small.

  The size parameter in the SYSCTL_PROC() for machdep.tsc_freq has
  apparently been hacked to make the number of bytes returned by the
  sysctl 4 instead of 8.  This masks the bug for printing the value in
  sysctl(1) provided the value fits in 4 bytes.  However, it makes input
  inconsistent with output.  For input, sysctl(1) uses the CTLBYTE
  parameter and ignores the size parameter.  It tries to write 8
  bytes to the kernel, but sysctl_handle_int() only writes 4 bytes.
  This error is silently ignored.  Note that unlike for write(2),
  short writes cannot be reported to userland except as an error,
  since the sysctl(3) interface only returns a count for reads; thus
  short writes for sysctl(3) must cause an error as documented.

I have only fixed the error handling for short writes:

%%%
Index: kern_sysctl.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_sysctl.c,v
retrieving revision 1.157
diff -u -2 -r1.157 kern_sysctl.c
--- kern_sysctl.c	11 Jun 2004 02:20:37 -0000	1.157
+++ kern_sysctl.c	11 Jun 2004 07:58:23 -0000
@@ -957,6 +949,13 @@
 	if (!req->newptr)
 		return 0;
-	if (req->newlen - req->newidx < l)
+	if (req->newlen - req->newidx != l) {
+		if (req->newlen - req->newidx > l) {
+			printf(
+		"sysctl_new_kernel -- short write: %zu - %zu > %zu\n",
+			    req->newlen, req->newidx, l);
+			Debugger("sysctl_new_kernel: short write");
+		}
 		return (EINVAL);
+	}
 	bcopy((char *)req->newptr + req->newidx, p, l);
 	req->newidx += l;
@@ -1075,6 +1073,13 @@
 	if (!req->newptr)
 		return 0;
-	if (req->newlen - req->newidx < l)
+	if (req->newlen - req->newidx != l) {
+		if (req->newlen - req->newidx > l) {
+			printf(
+		"sysctl_new_user -- short write: %zu - %zu > %zu\n",
+			    req->newlen, req->newidx, l);
+			Debugger("sysctl_new_user: short write");
+		}
 		return (EINVAL);
+	}
 	error = copyin((char *)req->newptr + req->newidx, p, l);
 	req->newidx += l;
%%%

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20041114173708.F1082>