From owner-svn-src-head@freebsd.org Wed Aug 9 08:44:23 2017 Return-Path: Delivered-To: svn-src-head@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 24966DC6D7D; Wed, 9 Aug 2017 08:44:23 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail106.syd.optusnet.com.au (mail106.syd.optusnet.com.au [211.29.132.42]) by mx1.freebsd.org (Postfix) with ESMTP id C7FB36858C; Wed, 9 Aug 2017 08:44:22 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail106.syd.optusnet.com.au (Postfix) with ESMTPS id 6A3EB3CCF6C; Wed, 9 Aug 2017 18:44:19 +1000 (AEST) Date: Wed, 9 Aug 2017 18:44:19 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Emmanuel Vadot cc: Bruce Evans , Emmanuel Vadot , src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r322252 - head/usr.bin/vmstat In-Reply-To: <20170808161507.2d185d394dc921eac6729d89@bidouilliste.com> Message-ID: <20170809171526.B1567@besplex.bde.org> References: <201708081218.v78CIBvL068413@repo.freebsd.org> <20170808225104.I3528@besplex.bde.org> <20170808161507.2d185d394dc921eac6729d89@bidouilliste.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=LI0WeNe9 c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=Q9sN3FMeAAAA:8 a=3W0Di7-B9xzAe2TjPFYA:9 a=4JXGKAPf0pKDMlew:21 a=AfNcCDu4C7j3jd7Y:21 a=CjuIK1q_8ugA:10 a=mnDzqiP7QBHErLppGYsC:22 X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Aug 2017 08:44:23 -0000 On Tue, 8 Aug 2017, Emmanuel Vadot wrote: > On Tue, 8 Aug 2017 23:55:52 +1000 (EST) > Bruce Evans wrote: > >> On Tue, 8 Aug 2017, Emmanuel Vadot wrote: >> >>> Log: >>> vmstat: Always emit a space after the free-memory column >>> >>> When displaying in non-human form, if the free-memory number >>> is large (more than 7 digits), there is no space between it and >>> the page fault column. >>> >>> PR: 221290 >>> Submitted by: Josuah Demangeon (Original version) >>> >>> Modified: >>> head/usr.bin/vmstat/vmstat.c >>> >>> Modified: head/usr.bin/vmstat/vmstat.c >>> ============================================================================== >>> --- head/usr.bin/vmstat/vmstat.c Tue Aug 8 11:49:36 2017 (r322251) >>> +++ head/usr.bin/vmstat/vmstat.c Tue Aug 8 12:18:11 2017 (r322252) >>> @@ -832,6 +832,7 @@ dovmstat(unsigned int interval, int reps) >>> xo_emit(" "); >>> xo_emit("{:free-memory/%7d}", >>> vmstat_pgtok(total.t_free)); >>> + xo_emit(" "); >>> } >>> xo_emit("{:total-page-faults/%5lu} ", >>> (unsigned long)rate(sum.v_vm_faults - >> >> This seems to break the formatting. There was a negative amount of space >> available for expansion, and since the header was not expanded to match >> its alignment with the fields is more random than before. With -h, the >> width was 80 columns, giving ugly line wrap on 80-column terminals with >> auto-wrap. Now it is 81 columns, giving uglier line wrap on all 80- >> column terminals. > > This break nothing, This was the case before too (with or without > -h), just tested in tmux with force-width 80. Testing shows that it breaks the -H case as expected. (I had misread the change as affecting the -h case too.) Before: X procs memory page disks faults cpu X r b w avm fre flt re pi po fr sr ad0 ad1 in sy cs us sy id X 0 0 0 12719376 977460 2167 3 28 0 2545 451 0 0 268 27162 7175 9 1 91 Misformatting in the data line starts with the avm field taking 8 of the 7 columns reserved for it. This and all subsequent fields are misaligned with the header. The header is pre-misformatted to width 85, with lots of inconsistencies in the first line. Old versions were carefully formatted to width 79, with no inconsistencies in the first line. The extra width for avm makes the extra width for the header not even work to line up. X procs memory page disks faults cpu X r b w avm fre flt re pi po fr sr ad0 ad1 in sy cs us sy id X 0 0 0 12719380 977460 2167 3 28 0 2545 451 0 0 268 27162 7175 9 1 91 The new bugs are that the extra space before the "flt" field increases the off-by-1 error to off-by-2 for this field and all subsequent fields. All fields starting with "flt" now have an error of at least off-by-1 even if none are too wide, since the header was not expanded to match. Expansion would increase its breakage from 6 to wide to 7 too wide. >> The bugs were mostly in the first line of the header: >> - the second line of the header was correct for vmstat -h >> - for vmstat without -h, the second line of the header was apparently broken >> by a change like the one here, that added a space after the "r b w" fields >> without adding one in the "r b w" header >> - most of the fields in the first line of the header are misaligned with the >> second lone. Many have drifted 3 to the left of where the were in a sort >> of center-justified place. Some of these might have actually been >> intended to be left justified, but had an off by +1 error. Now these >> have an error of off by -2 relative to left justifications. >> >> Only the "memory" header in the first line is better than in old versions. >> It is now left justified. Left justifying all headers in the first line >> is probably best. I couldn't find a good way to delimit the right hand >> side of the extents of the headers in the first line. The second line of >> the headers already uses right justification consistently and this works >> well. > > I think that all this might be true but you might talk about the whole > libxo conversion that was done, not my commit right ? I try not to look at libxo, since its misformattings are enormous starting in the source code. My columnize.awk script works almost perfectly to fix the above misformattings: X BEGIN { X columns = 79 # XXX X } X X { X # Determine fields and field widths for current line. X # awk's field splitting feature turned out to be inadequate, X # and this would be even easier in C. X n = split($0, a, "") X f = 1 X j = 1 X while (j <= n) { X cpw[f] = 0 X cfw[f] = 0 X while (j <= n && a[j] == " ") { X cpw[f]++ # current (left) padding width X j++ X } X if (j > n) X break X while (j <= n && a[j] != " ") { X cfw[f]++ # current field width without padding X j++ X } X fld[f] = substr($0, j - cfw[f], cfw[f]) X if (f != 1) { X cpw[f]-- X cfw[f]++ # current field width with min padding X } X f++ X } X nf = f - 1 # no need to use or trust NF X X if (NR == 1 || NR < 3 && nf != anf) { X # Make current field widths (with full padding) the maximum X # ones. When NR > 1, this discards the previous maximums. X # Good enough for vmstat, where the widths from NR == 1 are X # garbage. X # X # Make current field widths (with minimal padding) the minimum X # ones. X anf = nf X for (f = 1; f <= anf; f++) { X Mfw[f] = cpw[f] + cfw[f] X mfw[f] = cfw[f] X } X } else if (nf < anf) { X # Some non-tabular line after warming up. Probably an ornate X # line in the next header. Too hard to handle properly. X printf("%.*s oops\n", columns - 5, $0) X fflush(stdout) X next X } else if (nf > anf) { X # Some non-tabular line after warming up. Hopefully just X # a single trailing field with inadequate field separators X # like the COMMAND field in ps and top. Append the extras X # to the last field with minimal separators. X for (f = anf + 1; f <= nf; f++) X fld[anf] = fld[anf] " " fld[f] X nf = anf X cfw[nf] = 1 + length(fld[nf]) X } X X # Update the maximum and minimum field widths if this line needs X # wider fields. X len = 0 X for (f = 1; f <= nf; f++) { X if (Mfw[f] < cfw[f]) X Mfw[f] = cfw[f] X len += Mfw[f] X if (mfw[f] < cfw[f]) X mfw[f] = cfw[f] X } X X # If the line would be too wide, reduce the maximum field widths X # minimally towards minimum ones. X while (len > columns) { X for (f = 1; f <= nf; f++) { X if (Mfw[f] > mfw[f]) { X Mfw[f]-- X len-- X break X } X } X if (f > nf) X break X } X X s = "" X for (f = 1; f <= nf - 1; f++) X s = s sprintf("%*s", Mfw[f], fld[f]) X s = s sprintf(" %-*s", Mfw[nf] - 1, fld[nf]) X printf("%.*s\n", columns, s) X fflush(stdout) X } When applied to the above saved output (old then new, quoted by X's), it produces: X procs memory page disks faults cpu X r b w avm fre flt re pi po fr sr ad0 ad1 in sy cs us sy id X 0 0 0 12719376 977460 2167 3 28 0 2545 451 0 0 268 27162 7175 9 1 91 X procs memory page disks faults oops X r b w avm fre flt re pi po fr sr ad0 ad1 in sy cs us sy id X 0 0 0 12719380 977460 2167 3 28 0 2545 451 0 0 268 27162 7175 9 1 91 It removed enough spaces to fit in 79 columns even with the X's. It can't handle the first line of the header very well since this line has an ornate format whose columns don't match the rest of the output in any simple way (programs should produce more parseable formats, like ps does). All that that it did for the first line was move "cpu" to a less worse place in the old output and change it to "oops" in the new output. Even programs like ps and df that attempt to do the necessary dynamic formatting often produce worse results than the above script. df on freefall now produces a mess lots of spaces giving long lines even with -h, I think because the mountpoint names are very long and df has hard- coded formats for everything else. columnize.awk only manages to almost clean this up for the -h case. It has a problem that it is stream-based and doesn't backtrack to adjust early lines. This is necessary for use in filters. vmstat 1 | awk -f columize.awk is a good example of using columize.awk in a filter. This use also requires it to understand headers not in the first few lines of output. This is not easy, and in practice it gets confused by the first line of the vmstat header unless it is the first line of the output. This is why it printed "oops" in the abov concatenated output. The first header line just has an unusual number of fields so cannot be handled right. When it is the first line in the output, it is printed verbatim except possibly for removing spaces, since it gives the only available information on the expected columnar output. Bruce