Date: Tue, 2 May 2017 15:17:11 +0300 From: Konstantin Belousov <kostikbel@gmail.com> To: Bruce Evans <brde@optusnet.com.au> Cc: Alan Somers <asomers@freebsd.org>, Gleb Smirnoff <glebius@freebsd.org>, "src-committers@freebsd.org" <src-committers@freebsd.org>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, "svn-src-head@freebsd.org" <svn-src-head@freebsd.org> Subject: Re: svn commit: r317061 - in head: libexec/rpc.rstatd sys/amd64/amd64 sys/amd64/include sys/arm/arm sys/arm/include sys/arm64/include sys/cddl/contrib/opensolaris/uts/common/fs/zfs sys/compat/linprocfs... Message-ID: <20170502121711.GE1622@kib.kiev.ua> In-Reply-To: <20170502203703.I1176@besplex.bde.org> References: <201704171734.v3HHYlf5022945@repo.freebsd.org> <CAOtMX2jdNj0du0ZuUKPr16iHK_YeNVzf-nDvwC-MuFM003VVAg@mail.gmail.com> <20170419130428.J956@besplex.bde.org> <20170430201324.GV1622@kib.kiev.ua> <20170501163725.U972@besplex.bde.org> <20170502095527.GB1622@kib.kiev.ua> <20170502203703.I1176@besplex.bde.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, May 02, 2017 at 09:25:40PM +1000, Bruce Evans wrote: > On Tue, 2 May 2017, Konstantin Belousov wrote: > > > On Mon, May 01, 2017 at 06:25:11PM +1000, Bruce Evans wrote: > >> XX Index: subr_counter.c > >> XX =================================================================== > >> XX --- subr_counter.c (revision 317604) > >> XX +++ subr_counter.c (working copy) > >> XX @@ -78,11 +78,15 @@ > >> XX sysctl_handle_counter_u64(SYSCTL_HANDLER_ARGS) > >> XX { > >> XX uint64_t out; > >> XX + uint32_t out32; > >> XX int error; > >> XX > >> XX out = counter_u64_fetch(*(counter_u64_t *)arg1); > >> XX > >> XX - error = SYSCTL_OUT(req, &out, sizeof(uint64_t)); > >> XX + if (req->oldlen == 4 && (out32 = out) == out) > >> XX + error = SYSCTL_OUT(req, &out32, sizeof(out32)); > >> XX + else > >> XX + error = SYSCTL_OUT(req, &out, sizeof(out)); > >> XX > >> XX if (error || !req->newptr) > >> XX return (error); > >> > >> This does the minimum necessary to fix the current problem. > >> > >> This works until the counters become too large for a u_int. There > >> could be an option to get truncation with no error, but that would > >> require an API change, so applications should be required to do the > >> truncation for themself if that is what they want. > > > > Yes, this is approximately what I consider to be enough fix, but with > > some details which I do not like and which did not worked well on my > > test machine with enough memory to generate lot of increments. > > > > Below is what I consider to be good enough. I will proceed with this > > version unless very strong objections appear. > > I prefer my version. It is the same as for sysctl_bufspace(), except > for simplifications and bug fixes that are easier because the types > are unsigned. > > Your version is especially broken for the case of a lot of increments. You consider the 'differentiators' as the main users, but I only want systat and top to not abort, I have no intent of making them work for updated counters if overflown. Instead, I want something human-visible so that person looking at raw counters values using old tools detected the failure. > > Clamping breaks many cases in systat -v which just needs unsigned counters > to calculate small differences. Not many counters overflow faster than > the refresh interval in systat or top. These cases used to appear to > work perfectly for most counters when the kernel did blind truncations. > Clamping breaks the API. Not returing ENOMEM breaks the API. > > In my version, this is handled as specified by the API by truncating and > returning ENOMEM. Then: > - top sees the ENOMEM and mishandles it by aborting > - systat sees the ENOMEM (and also sees a short count, if it has the newer > breakage of hard-coding 64-bit counters instead of u_int ones, and is > run on a kernel with u_int counters) and mishandles it better by > printing an error message but not aborting. Thus it appears to work > just as well as when the kernel did blind truncation, except for the > error message. ENOMEM is, of course, the situation which I want to avoid. > > Clamping would be difficult for the application to handle even if an > error is returned. This would be like the situation for clamping for > strtoul(), but worse since it changes the API. Many applications don't > understand how to handle errno for strtoul(), but they do do a bogus > check of the clamped value. For sysctl(), clamping is not part of > the API now, so applications shouldn't check for the clamped values. > Ones that work with deltas like systat -v would see the counters stop > incrementing when the clamped value is reached, while truncated values > allow them to continue to appear to work perfectly. > > > diff --git a/sys/vm/vm_meter.c b/sys/vm/vm_meter.c > > index 5f4cd46ab1e..93d5161de89 100644 > > --- a/sys/vm/vm_meter.c > > +++ b/sys/vm/vm_meter.c > > @@ -266,8 +266,29 @@ static SYSCTL_NODE(_vm_stats, OID_AUTO, vm, CTLFLAG_RW, 0, > > "VM meter vm stats"); > > SYSCTL_NODE(_vm_stats, OID_AUTO, misc, CTLFLAG_RW, 0, "VM meter misc stats"); > > > > +static int > > +sysctl_handle_vmstat(SYSCTL_HANDLER_ARGS) > > +{ > > + uint64_t out; > > +#ifdef COMPAT_FREEBSD11 > > + uint32_t out1; > > +#endif > > The extra wrapper and the ifdef limit the damage from the API change. > Otherwise, I don't like them. > > I prefer my variable name out32 to the nondescript out1. ('out' should > also be renamed out64, but I wanted to keep the diffs small.) Changed to the out32 name. > > > + > > + out = counter_u64_fetch(*(counter_u64_t *)arg1); > > +#ifdef COMPAT_FREEBSD11 > > + if (req->oldlen == sizeof(uint32_t)) { > > sizeof(uint32_t) is an obfuscated spelling of '4'. The size is already > hard-coded in '32'. This depends on sizeof(char) being 1. It is, and > won't change. POSIX started specifying this in 2001, and the mere > existence of uint8_t probably requires 8-bit chars since u_char must > be the smallest type of an object. This depends on the size of a u_int > being 32 bits/4 bytes in the old ABI. This won't change, and is already > hard-coded as 32, as it must be because sizeof(u_int) may change. I prefer the sizeof over 4. > > > + out1 = 0xffffffff; > > + if (out < out1) > > + out1 = out; > > This does the clamping. Clamping is probably better expressed as: > > out1 = MIN(out, 0xffffffff) Changed as well. > > > + return (SYSCTL_OUT(req, &out1, sizeof(uint32_t))); > > Here sizeof(uint32_t) is an obfuscated spelling of sizeof(out1). I think > the sizeof() is justified in this case, to limit the hard-coding to of > sizes to 1 place. Truncation by assigment to the destination variable also > helps limit this > (sysctl_bufspace() could also be improved by truncation by assignment. > Its integers are signed, so the result is implementation-defined, but > we know that it is benign 2's complement and don't really care what it > is provided it is reasonable. This is better than laboriously checking > the lower limit or neglecting to check it). > Here the limit 0xffffffff also depends on the size. Yes, there is no UINT32_MAX, and never will. > > > + } > > +#endif > > + return (SYSCTL_OUT(req, &out, sizeof(uint64_t))); > > +} > > + > > #define VM_STATS(parent, var, descr) \ > > - SYSCTL_COUNTER_U64(parent, OID_AUTO, var, CTLFLAG_RD, &vm_cnt.var, descr) > > + SYSCTL_OID(parent, OID_AUTO, var, CTLTYPE_U64 | CTLFLAG_MPSAFE | \ > > + CTLFLAG_RD, &vm_cnt.var, 0, sysctl_handle_vmstat, "QU", descr); > > I see a new problem, shared with my version. The sysctl data says that this > sysctl is pure 64 bits, but it can return 32 bits. The sysctl data just > can't describe the variation. I guess this is not much a problem, since it > takes magic to read the sysctl data from the kernel and not many programs > except sysctl know how to do it. There programs will see the pure 64 > bits and only attempt to use 64 bit types. It can return 32bit only on improper use, which we allow for ABI compat. So I do not see this as a problem either. diff --git a/sys/vm/vm_meter.c b/sys/vm/vm_meter.c index 5f4cd46ab1e..67f24609b5a 100644 --- a/sys/vm/vm_meter.c +++ b/sys/vm/vm_meter.c @@ -266,8 +266,27 @@ static SYSCTL_NODE(_vm_stats, OID_AUTO, vm, CTLFLAG_RW, 0, "VM meter vm stats"); SYSCTL_NODE(_vm_stats, OID_AUTO, misc, CTLFLAG_RW, 0, "VM meter misc stats"); +static int +sysctl_handle_vmstat(SYSCTL_HANDLER_ARGS) +{ + uint64_t out; +#ifdef COMPAT_FREEBSD11 + uint32_t out32; +#endif + + out = counter_u64_fetch(*(counter_u64_t *)arg1); +#ifdef COMPAT_FREEBSD11 + if (req->oldlen == sizeof(uint32_t)) { + out32 = MIN(out, 0xffffffff); + return (SYSCTL_OUT(req, &out32, sizeof(uint32_t))); + } +#endif + return (SYSCTL_OUT(req, &out, sizeof(uint64_t))); +} + #define VM_STATS(parent, var, descr) \ - SYSCTL_COUNTER_U64(parent, OID_AUTO, var, CTLFLAG_RD, &vm_cnt.var, descr) + SYSCTL_OID(parent, OID_AUTO, var, CTLTYPE_U64 | CTLFLAG_MPSAFE | \ + CTLFLAG_RD, &vm_cnt.var, 0, sysctl_handle_vmstat, "QU", descr); #define VM_STATS_VM(var, descr) VM_STATS(_vm_stats_vm, var, descr) #define VM_STATS_SYS(var, descr) VM_STATS(_vm_stats_sys, var, descr)
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20170502121711.GE1622>