Date: Thu, 22 Mar 2018 15:51:42 +1100 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Gleb Smirnoff <glebius@freebsd.org> Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r331328 - head/sys/kern Message-ID: <20180322144902.M1053@besplex.bde.org> In-Reply-To: <201803212321.w2LNLWa1059145@repo.freebsd.org> References: <201803212321.w2LNLWa1059145@repo.freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 21 Mar 2018, Gleb Smirnoff wrote: > Log: > Fix sysctl types broken in r329612. This is still broken. > Modified: head/sys/kern/vfs_bio.c > ============================================================================== > --- head/sys/kern/vfs_bio.c Wed Mar 21 23:17:26 2018 (r331327) > +++ head/sys/kern/vfs_bio.c Wed Mar 21 23:21:32 2018 (r331328) > @@ -261,17 +261,17 @@ SYSCTL_PROC(_vfs, OID_AUTO, numdirtybuffers, > "Number of buffers that are dirty (has unwritten changes) at the moment"); > static int lodirtybuffers; > SYSCTL_PROC(_vfs, OID_AUTO, lodirtybuffers, > - CTLTYPE_LONG|CTLFLAG_MPSAFE|CTLFLAG_RW, &lodirtybuffers, > + CTLTYPE_INT|CTLFLAG_MPSAFE|CTLFLAG_RW, &lodirtybuffers, > __offsetof(struct bufdomain, bd_lodirtybuffers), sysctl_bufdomain_int, "L", > "How many buffers we want to have free before bufdaemon can sleep"); Now the format letter "L" is even more inconsistent. The letter at least used to even more important. E.g., old versions of i386 machdep.tsc_freq got most things wrong, but reads appeared to work in most cases since the mismached CTLTYPE_QUAD was ignored by sysctl(8) and the format specifier "IU" was used to print the truncated value correctly. Writes appeared to work in most cases although short writes are documented to fail in all cases: from sysctl(3): [EINVAL] A non-null newp is given and its specified length in newlen is too large or too small. because detection of short writes has been broken for more than 20 years. For machdep.tsc_freq, CTLTYPE_QUAD gave 64-bit writes, but the sysctl_handle_int() in the PROC handler attempted to do a 32-bit and the broken short write detection allowed it to succeed. I think the same brokenness prevents detection of the current bug as a short write error -- CTLTYPE_ULONG gives 64-bit writes on 32-bit arches and sysctl_handle_int() in the PROC handler attempts to do a 32-bit short writes. Lower levels of sysctl know the size from the CTLTYPE but don't check it properly. Most PROC handlers don't check it at all, but hard-code sysctl_handle_int/etc(). This is reasonable since they can hard-code the type in this call just as easily as hard-coding it in the sysctl data (which is another bug, but harder to fix -- the size should be decided at runtime). In practice, there are inconsistencies. I use the following fix for short write detection: XX Index: kern_sysctl.c XX =================================================================== XX RCS file: /home/ncvs/src/sys/kern/kern_sysctl.c,v XX retrieving revision 1.157 XX diff -u -2 -r1.157 kern_sysctl.c XX --- kern_sysctl.c 11 Jun 2004 02:20:37 -0000 1.157 XX +++ kern_sysctl.c 11 Jun 2004 07:58:23 -0000 XX @@ -957,6 +949,13 @@ XX if (!req->newptr) XX return 0; XX - if (req->newlen - req->newidx < l) XX + if (req->newlen - req->newidx != l) { XX + if (req->newlen - req->newidx > l) { XX + printf( XX + "sysctl_new_kernel -- short write: %zu - %zu > %zu\n", XX + req->newlen, req->newidx, l); XX + Debugger("sysctl_new_kernel: short write"); XX + } XX return (EINVAL); XX + } XX bcopy((char *)req->newptr + req->newidx, p, l); XX req->newidx += l; XX @@ -1075,6 +1073,13 @@ XX if (!req->newptr) XX return 0; XX - if (req->newlen - req->newidx < l) XX + if (req->newlen - req->newidx != l) { XX + if (req->newlen - req->newidx > l) { XX + printf( XX + "sysctl_new_user -- short write: %zu - %zu > %zu\n", XX + req->newlen, req->newidx, l); XX + Debugger("sysctl_new_user: short write"); XX + } XX return (EINVAL); XX + } XX error = copyin((char *)req->newptr + req->newidx, p, l); XX req->newidx += l; The debugging code is of course only needed to detect inconsistent sysctl handlers and applications depend on the bug. Not many do. After fixing sysctl handlers, most applications work since they get the size using arcane sysctls like sysctl(8), or by reading the variable and assuming tha it can be written back using the same size. Not many applications probe for the size. They should, so that the sizes of at least scalar vairbales aren't part of an ABI and so can be changed easily. (See old mail for patches that add some probing to the kernel -- the kernel attempts to match the sizes. It truncates the sizes iff the value can be represented in the changed size and in some other cases like counters where truncation was the historical misbehaviour. Unfortunately, this interferes with applications probing the size.) The read size is easy to probe for, and it should be possible to probe for a write size that works, but the bug breaks that and corrupts kernel variables instead. The corruption is largest for the little-endian case. For machdep.tsc_freq, the only serious corruption was for changing a variable near the 4G boundary. E.g., if the variable is initially 0xffffffff, then attempting to change it to 0x100000000 actually changed it to 0, by changing only the low 32 bits. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20180322144902.M1053>