From owner-svn-src-all@freebsd.org  Wed Aug  9 09:40:52 2017
Return-Path: <owner-svn-src-all@freebsd.org>
Delivered-To: svn-src-all@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4D426DC81CB;
 Wed,  9 Aug 2017 09:40:52 +0000 (UTC)
 (envelope-from manu@bidouilliste.com)
Received: from mail.blih.net (mail.blih.net [212.83.177.182])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "mail.blih.net", Issuer "mail.blih.net" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 62D796AC6D;
 Wed,  9 Aug 2017 09:40:50 +0000 (UTC)
 (envelope-from manu@bidouilliste.com)
Received: from mail.blih.net (mail.blih.net [212.83.177.182])
 by mail.blih.net (OpenSMTPD) with ESMTP id 622c0cbf;
 Wed, 9 Aug 2017 11:40:46 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=bidouilliste.com; h=date
 :from:to:cc:subject:message-id:in-reply-to:references
 :mime-version:content-type:content-transfer-encoding; s=mail;
 bh=YHlYNx1ueOQ1mJD02d0Dw2TKYKQ=; b=kD0Uz7hXb5dY/sojLfBufpJwbjW8
 bWszu/vcA9QwgqEqj7DDL+kEFnjv8IvLC9DPLoX5DQkCeACKeyR7XxJvcYpRs6qj
 pR96z7PxUpW1t/wp2rR3zlaZSgebVINSsim8U3KU+QjC/JUQI9OlU/ZnQOUiwxKj
 FR72iA/4KeRPHc8=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=bidouilliste.com; h=date
 :from:to:cc:subject:message-id:in-reply-to:references
 :mime-version:content-type:content-transfer-encoding; q=dns; s=
 mail; b=MKR1o5mjiBJ13SopBz044PfeZTniB86PKJPU0zND2ZFg96j1lQg93vqk
 7FthG1dNpH05XoWeWzslAI4REgSo5ezChoQPtSL6cyjQVgWWoyKnjBZi8MvNtUzJ
 S1EyPSAeLySguhm1J/ptEX8ArIki8o7tH7msqQ0XmGe/9oBlHWA=
Received: from arcadia (evadot.gandi.net [217.70.181.36])
 by mail.blih.net (OpenSMTPD) with ESMTPSA id d970e6e3
 TLS version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO;
 Wed, 9 Aug 2017 11:40:46 +0200 (CEST)
Date: Wed, 9 Aug 2017 11:40:46 +0200
From: Emmanuel Vadot <manu@bidouilliste.com>
To: Bruce Evans <brde@optusnet.com.au>
Cc: Emmanuel Vadot <manu@freebsd.org>, src-committers@freebsd.org,
 svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r322252 - head/usr.bin/vmstat
Message-Id: <20170809114046.6e55bb881d98d19f8b88ddc2@bidouilliste.com>
In-Reply-To: <20170809171526.B1567@besplex.bde.org>
References: <201708081218.v78CIBvL068413@repo.freebsd.org>
 <20170808225104.I3528@besplex.bde.org>
 <20170808161507.2d185d394dc921eac6729d89@bidouilliste.com>
 <20170809171526.B1567@besplex.bde.org>
X-Mailer: Sylpheed 3.5.1 (GTK+ 2.24.31; amd64-portbld-freebsd12.0)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-BeenThere: svn-src-all@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "SVN commit messages for the entire src tree \(except for &quot;
 user&quot; and &quot; projects&quot; \)" <svn-src-all.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/svn-src-all>,
 <mailto:svn-src-all-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/svn-src-all/>
List-Post: <mailto:svn-src-all@freebsd.org>
List-Help: <mailto:svn-src-all-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/svn-src-all>,
 <mailto:svn-src-all-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 09 Aug 2017 09:40:52 -0000

On Wed, 9 Aug 2017 18:44:19 +1000 (EST)
Bruce Evans <brde@optusnet.com.au> wrote:

> On Tue, 8 Aug 2017, Emmanuel Vadot wrote:
> 
> > On Tue, 8 Aug 2017 23:55:52 +1000 (EST)
> > Bruce Evans <brde@optusnet.com.au> wrote:
> >
> >> On Tue, 8 Aug 2017, Emmanuel Vadot wrote:
> >>
> >>> Log:
> >>>  vmstat: Always emit a space after the free-memory column
> >>>
> >>>  When displaying in non-human form, if the free-memory number
> >>>  is large (more than 7 digits), there is no space between it and
> >>>  the page fault column.
> >>>
> >>>  PR:		221290
> >>>  Submitted by:	Josuah Demangeon <mail@josuah.net> (Original version)
> >>>
> >>> Modified:
> >>>  head/usr.bin/vmstat/vmstat.c
> >>>
> >>> Modified: head/usr.bin/vmstat/vmstat.c
> >>> ==============================================================================
> >>> --- head/usr.bin/vmstat/vmstat.c	Tue Aug  8 11:49:36 2017	(r322251)
> >>> +++ head/usr.bin/vmstat/vmstat.c	Tue Aug  8 12:18:11 2017	(r322252)
> >>> @@ -832,6 +832,7 @@ dovmstat(unsigned int interval, int reps)
> >>> 			xo_emit(" ");
> >>> 			xo_emit("{:free-memory/%7d}",
> >>> 			        vmstat_pgtok(total.t_free));
> >>> +			xo_emit(" ");
> >>> 		}
> >>> 		xo_emit("{:total-page-faults/%5lu} ",
> >>> 		        (unsigned long)rate(sum.v_vm_faults -
> >>
> >> This seems to break the formatting.  There was a negative amount of space
> >> available for expansion, and since the header was not expanded to match
> >> its alignment with the fields is more random than before.  With -h, the
> >> width was 80 columns, giving ugly line wrap on 80-column terminals with
> >> auto-wrap.  Now it is 81 columns, giving uglier line wrap on all 80-
> >> column terminals.
> >
> > This break nothing, This was the case before too (with or without
> > -h), just tested in tmux with force-width 80.
> 
> Testing shows that it breaks the -H case as expected.  (I had misread the
> change as affecting the -h case too.)
> 
> Before:
> 
> X procs     memory        page                    disks     faults         cpu
> X r b w     avm     fre  flt  re  pi  po    fr   sr ad0 ad1   in    sy    cs us sy id
> X 0 0 0 12719376  977460 2167   3  28   0  2545  451   0   0  268 27162  7175  9  1 91
> 
> Misformatting in the data line starts with the avm field taking 8 of the 7
> columns reserved for it.  This and all subsequent fields are misaligned with
> the header.
> 
> The header is pre-misformatted to width 85, with lots of inconsistencies in
> the first line.  Old versions were carefully formatted to width 79, with no
> inconsistencies in the first line.  The extra width for avm makes the extra
> width for the header not even work to line up.
> 
> X procs     memory        page                    disks     faults         cpu
> X r b w     avm     fre  flt  re  pi  po    fr   sr ad0 ad1   in    sy    cs us sy id
> X 0 0 0 12719380  977460  2167   3  28   0  2545  451   0   0  268 27162  7175  9  1 91
> 
> The new bugs are that the extra space before the "flt" field increases the
> off-by-1 error to off-by-2 for this field and all subsequent fields.  All
> fields starting with "flt" now have an error of at least off-by-1 even
> if none are too wide, since the header was not expanded to match.  Expansion
> would increase its breakage from 6 to wide to 7 too wide.

 Thanks, I know see the problem.
 I'll see if this could be corrected easily with libxo ...

> >> The bugs were mostly in the first line of the header:
> >> - the second line of the header was correct for vmstat -h
> >> - for vmstat without -h, the second line of the header was apparently broken
> >>    by a change like the one here, that added a space after the "r b w" fields
> >>    without adding one in the "r b w" header
> >> - most of the fields in the first line of the header are misaligned with the
> >>    second lone.  Many have drifted 3 to the left of where the were in a sort
> >>    of center-justified place.  Some of these might have actually been
> >>    intended to be left justified, but had an off by +1 error.  Now these
> >>    have an error of off by -2 relative to left justifications.
> >>
> >> Only the "memory" header in the first line is better than in old versions.
> >> It is now left justified.  Left justifying all headers in the first line
> >> is probably best.  I couldn't find a good way to delimit the right hand
> >> side of the extents of the headers in the first line.  The second line of
> >> the headers already uses right justification consistently and this works
> >> well.
> >
> > I think that all this might be true but you might talk about the whole
> > libxo conversion that was done, not my commit right ?
> 
> I try not to look at libxo, since its misformattings are enormous starting
> in the source code.
> 
> My columnize.awk script works almost perfectly to fix the above
> misformattings:
> 
> X BEGIN {
> X 	columns = 79			# XXX
> X }
> X 
> X {
> X 	# Determine fields and field widths for current line.
> X 	# awk's field splitting feature turned out to be inadequate,
> X 	# and this would be even easier in C.
> X 	n = split($0, a, "")
> X 	f = 1
> X 	j = 1
> X 	while (j <= n) {
> X 		cpw[f] = 0
> X 		cfw[f] = 0
> X 		while (j <= n && a[j] == " ") {
> X 			cpw[f]++	# current (left) padding width
> X 			j++
> X 		}
> X 		if (j > n)
> X 			break
> X 		while (j <= n && a[j] != " ") {
> X 			cfw[f]++	# current field width without padding
> X 			j++
> X 		}
> X 		fld[f] = substr($0, j - cfw[f], cfw[f])
> X 		if (f != 1) {
> X 			cpw[f]--
> X 			cfw[f]++	# current field width with min padding
> X 		}
> X 		f++
> X 	}
> X 	nf = f - 1			# no need to use or trust NF
> X 
> X 	if (NR == 1 || NR < 3 && nf != anf) {
> X 		# Make current field widths (with full padding) the maximum
> X 		# ones.  When NR > 1, this discards the previous maximums.
> X 		# Good enough for vmstat, where the widths from NR == 1 are
> X 		# garbage.
> X 		#
> X 		# Make current field widths (with minimal padding) the minimum
> X 		# ones.
> X 		anf = nf
> X 		for (f = 1; f <= anf; f++) {
> X 			Mfw[f] = cpw[f] + cfw[f]
> X 			mfw[f] = cfw[f]
> X 		}
> X 	} else if (nf < anf) {
> X 		# Some non-tabular line after warming up.  Probably an ornate
> X 		# line in the next header.  Too hard to handle properly.
> X 		printf("%.*s oops\n", columns - 5, $0)
> X 		fflush(stdout)
> X 		next
> X 	} else if (nf > anf) {
> X 		# Some non-tabular line after warming up.  Hopefully just
> X 		# a single trailing field with inadequate field separators
> X 		# like the COMMAND field in ps and top.  Append the extras
> X 		# to the last field with minimal separators.
> X 		for (f = anf + 1; f <= nf; f++)
> X 			fld[anf] = fld[anf] " " fld[f]
> X 		nf = anf
> X 		cfw[nf] = 1 + length(fld[nf])
> X 	}
> X 
> X 	# Update the maximum and minimum field widths if this line needs
> X 	# wider fields.
> X 	len = 0
> X 	for (f = 1; f <= nf; f++) {
> X 		if (Mfw[f] < cfw[f])
> X 			Mfw[f] = cfw[f]
> X 		len += Mfw[f]
> X 		if (mfw[f] < cfw[f])
> X 			mfw[f] = cfw[f]
> X 	}
> X 
> X 	# If the line would be too wide, reduce the maximum field widths
> X 	# minimally towards minimum ones.
> X 	while (len > columns) {
> X 		for (f = 1; f <= nf; f++) {
> X 			if (Mfw[f] > mfw[f]) {
> X 				Mfw[f]--
> X 				len--
> X 				break
> X 			}
> X 		}
> X 		if (f > nf)
> X 			break
> X 	}
> X 
> X 	s = ""
> X 	for (f = 1; f <= nf - 1; f++)
> X 		s = s sprintf("%*s", Mfw[f], fld[f])
> X 	s = s sprintf(" %-*s", Mfw[nf] - 1, fld[nf])
> X 	printf("%.*s\n", columns, s)
> X 	fflush(stdout)
> X }
> 
> When applied to the above saved output (old then new, quoted by X's), it
> produces:
> 
> X procs     memory        page                    disks     faults cpu 
> X r b w avm   fre  flt  re  pi  po    fr   sr ad0 ad1   in    sy    cs us sy id
> X 0 0 0 12719376 977460 2167  3 28  0 2545 451   0   0 268 27162  7175  9  1 91
> X procs     memory        page                    disks     faults         oops
> X r b w      avm    fre  flt re pi po   fr  sr ad0 ad1  in    sy    cs us sy id
> X 0 0 0 12719380 977460 2167  3 28  0 2545 451   0   0 268 27162  7175  9  1 91
> 
> It removed enough spaces to fit in 79 columns even with the X's.  It can't
> handle the first line of the header very well since this line has an ornate
> format whose columns don't match the rest of the output in any simple way
> (programs should produce more parseable formats, like ps does).  All that
> that it did for the first line was move "cpu" to a less worse place in the
> old output and change it to "oops" in the new output.
> 
> Even programs like ps and df that attempt to do the necessary dynamic
> formatting often produce worse results than the above script.  df on
> freefall now produces a mess lots of spaces giving long lines even with
> -h, I think because the mountpoint names are very long and df has hard-
> coded formats for everything else.  columnize.awk only manages to almost
> clean this up for the -h case.  It has a problem that it is stream-based
> and doesn't backtrack to adjust early lines.  This is necessary for use
> in filters.
> 
> vmstat 1 | awk -f columize.awk is a good example of using columize.awk
> in a filter.  This use also requires it to understand headers not in
> the first few lines of output.  This is not easy, and in practice it
> gets confused by the first line of the vmstat header unless it is the
> first line of the output.  This is why it printed "oops" in the
> abov concatenated output.  The first header line just has an unusual
> number of fields so cannot be handled right.  When it is the first
> line in the output, it is printed verbatim except possibly for removing
> spaces, since it gives the only available information on the expected
> columnar output.
> 
> Bruce


-- 
Emmanuel Vadot <manu@bidouilliste.com> <manu@freebsd.org>