Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 10 Jul 2020 16:44:00 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Yuri Pankov <yuripv@yuripv.dev>
Cc:        Steve Wills <swills@FreeBSD.org>, "daichi@freebsd.org" <daichi@FreeBSD.org>, FreeBSD Current <freebsd-current@freebsd.org>, Hiroki Sato <hrs@FreeBSD.org>
Subject:   Re: svn commit: r352558 - head/usr.bin/top
Message-ID:  <A6DA92D1-C98B-416F-B1FA-315AF943426D@yahoo.com>
In-Reply-To: <BE3A2B48-D593-4733-8EAC-4C70F3F0B9B4@yahoo.com>
References:  <1BDFB387-930D-4F4D-8729-A5850F1C15B9.ref@yahoo.com> <1BDFB387-930D-4F4D-8729-A5850F1C15B9@yahoo.com> <61107ecc-6f9b-a4db-7b1e-ec75f73939ee@FreeBSD.org> <f8c8e434-39d7-4c7b-d33d-ef8a6b196eb9@yuripv.dev> <BE3A2B48-D593-4733-8EAC-4C70F3F0B9B4@yahoo.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On 2020-Jul-10, at 15:25, Mark Millard <marklmi at yahoo.com> wrote:

> On 2020-Jul-10, at 11:05, Yuri Pankov <yuripv at yuripv.dev> wrote:
>=20
>> Steve Wills wrote:
>>> On 11/28/19 4:08 PM, Mark Millard via svn-src-head wrote:
>>>>> Author: daichi
>>>>> Date: Fri Sep 20 17:37:23 2019
>>>>> New Revision: 352558
>>>>> URL:
>>>>> https://svnweb.freebsd.org/changeset/base/352558
>>>>>=20
>>>>>=20
>>>>> Log:
>>>>>   top(1): support multibyte characters in command names (ARGV =
array)
>>>>>   depending on locale.
>>>>>    - add setlocale()
>>>>>    - remove printable() function
>>>>>    - add VIS_OCTAL and VIS_SAFE to the flag of strvisx() to =
display
>>>>>      non-printable characters that do not use C-style backslash =
sequences
>>>>>      in three digit octal sequence, or remove it
>>>>>   This change allows multibyte characters to be displayed =
according to
>>>>>   locale. If it is recognized as a non-display character according =
to the
>>>>>   locale, it is displayed in three digit octal sequence.
>>>>>=20
>>>>=20
>>>> Initially picking on tab characters as an example of what is
>>>> probably a somewhat broader issue . . .
>>>>=20
>>>> Ever since this change, characters like tabs that do not fit
>>>> in the next character cell when output, but for which they
>>>> are !isprintable(...), now mess up the top display. Again
>>>> using tab as an example: line wrapping from the text having
>>>> been shifted over by more than one character cell. top does
>>>> not track the line wrapping result in how it decides what
>>>> to output for the following display updates.
>>>>=20
>>> Apologies for the way late reply here, but I just now bothered =
tracking this down. This commit seems to be the cause of some corruption =
I'm seeing in long running top(1) as well. As Mark mentions, if I use =
"hh" it clears up. Should I open a bugzilla bug? I can share screenshots =
of the corruption, such as:
>>> https://i.imgur.com/Xqlwf9h.png
>>> https://i.imgur.com/Jv0d5NU.png
>>=20
>> Does removing VIS_SAFE fixes the issue for you?
>>=20
>> As for original Mark's report (which I missed), removing =
isprintable() doesn't look wrong as vis(3) should take of its =
functionality (and in multibyte-aware way).
>=20
> vis (as used) and the old isprintable logic are not
> equivalent when multi-byte is not needed/involved.
> Otherwise I'd not have had anything to ever report.
> If vis can do what is needed, more work needed to
> be done when the change was made in order to avoid
> msesed up displays in single-byte contexts.
>=20
>> Also, is there an easy way to reproduce this?
>=20
> The following sort of command (the empty space inside quoted
> text are tab characters):
>=20
> # tr '0\n      1\n     2\n     3\n     4\n     5\n     6\n     7\n     =
8\n' '\t0       \t1     \t2     \t3     \t4     \t5     \t6     \t7     =
\t8' < /dev/zero > /dev/null
>=20
> causes my 200 character wide window running top to show:
>=20
> 32920 root        100    0  12764Ki    2420Ki CPU3     3   2:22  =
99.87% tr 0\\n	1\\n	2\\n	3\\n	4\\n	5\\n	6\\n	7\\n	=
8\\n \\t0	\\t1	\\t2	\\t3	\\t4	\\t5	\\t6	\\t733   =
\\t8       20        7172      5448Ki CPU23   23   0:00 0.04% top =
-HiSCazopid
>=20
> But that does not show where the lines wrap at the edges of the =
window,
> so breaking it up explicitly after the first "\" in \\7:
>=20
> 32920 root        100    0  12764Ki    2420Ki CPU3     3   2:22  =
99.87% tr 0\\n	1\\n	2\\n	3\\n	4\\n	5\\n	6\\n	7\\n	=
8\\n \\t0	\\t1	\\t2	\\t3	\\t4	\\t5	\\t6	\
> \t733   \\t8       20        7172      5448Ki CPU23   23   0:00   =
0.04% top -HiSCazopid
>=20
> Note how \n turned into \\n , taking an extra character for
> each \n . Similarly for \t vs. \\t . (Other examples do
> similarly.)
>=20
> The tab characters really do use more than one character cell
> on the display (sometimes).
>=20
> The text from the tr command ends up spread across 2 lines
> as things look like in the window where top is running.
>=20
> I ran top in another ssh session first and then the tr command.
> Before running the tr command, top showed as:
>=20
> 33019 root         20    0  17172Ki    5448Ki CPU24   24   0:00   =
0.05% top -HiSCazopid
>=20
> If you do not end up with top listed just after tr in top's output,
> then it will not be top's line that ends up partially overwritten.
>=20
> If you have wider windows, you may need more text in the tr quoted
> strings.
>=20
> In another experiment I inserted a large number of backspace =
characters
> (control-H's) at the front of the first quoted string in the tr =
command.
> The top output displayed:
>=20
> 0\\n5 ro1\\n    2\\93   3\\n12764\\n   25\\ni CP6\\n   97\\n:12 =
100.00\t0r \nHiS\\t1pid \\t2	\\t3	\\t4	\\t5	\\t6	\\t
> 33094 root         20    0  17172Ki    5488Ki CPU21   21   0:00   =
0.06% top -HiSCazopid
>=20
> In other words, backspace moved the cursor position back over prior
> fields on the line and then the later line content overwrote those
> fields instead of being after "tr" someplace (or truncated off).
>=20
> Note that part of "-HiSCazopid" shows up on both lines. The extra
> is from when top was running but tr had not started yet. top is
> not managing text replacement correctly for output characters that
> end up not being just "in" the next character-cell on the terminal.
>=20
> The same sort of result happens when instead adding just one
> carriage return (control-M) in front of that first quuoted
> string instead:
>=20
> 0\\n8 ro1\\n    2\\92   3\\n12764\\n   25\\ni CP6\\n  117\\n:11 =
100.00\t0r \nHiS\\t1pid \\t2	\\t3	\\t4	\\t5	\\t6	\\t
> 33094 root         20    0  17172Ki    5488Ki CPU23   23   0:00   =
0.04% top -HiSCazopid
>=20
> I do not intend to try to find all examples of characters that
> cause problems but used to not cause problems.
>=20
> =46rom what I've seen, cursor positioning escape character sequences
> seem to be sent through and cause overwrites at arbitrary places
> on screen, based on the escape sequence content. There are command
> lines around that contain such sequences. So I sometimes see the
> first few lines of top's output have garbage text from commands
> that were listed below at some point overwriting the top text.
>=20
> Part of what is going on is top avoiding rewriting characters
> that its tracking indicates have not been updated. When the
> actual display and that supposed-tracking mismatch, the
> display ends up wrong when updated (bad text continues to
> display).
>=20
> The text in commands should not make "top -a" output mess up
> the display of other lines in top's output, nor of other
> top output fields on the same line. In my view, if some usage
> contexts need otherwise, it should take an extra command line
> option to put top in a mode that might do such things. The
> default behavior should strictly avoid having such things
> happen.

I accidentally had a ^M instead of a ^H (backspace) at
one position in my backspace example. Removing that
carriage return produced:

33345 root         98    0  12764Ki  0\\n20Ki CP1\\n   82\\n:20  99.99% =
4\\n-HiS5\\npid 6\\n	7\\n	8\\n \\t0	\\t1	\\t2	\\t3	=
\\t4	\\t5	\\t6	\\t7	\
33094 root         20    0  17172Ki    5620Ki CPU24   24   0:02   0.04% =
top -HiSCazopid

And I tried:

# tr =
'\a\b\c\d\e\f\g\h\i\j\k\l\m\n\o\p\q\r\s\t\u\v\w\x\y\z\0\1\2\3\4\5\6\7\8\9'=
 =
'\9\8\7\6\5\4\3\2\1\0\q\w\e\r\t\y\u\i\o\pa\s\d\f\g\h\j\jk\l\z\x\c\v\b\n\m'=
 < /dev/zero > /dev/null

This seems to have worked as desired: truncating the tar -a output for
the tr process the end of the tar line, with: \\7\\6\\5

No other lines were messed up.

I also tried just control-K (Vertical Tab) sequences and it resulted in:

33417 root        103    0  12764Ki    2420Ki CPU10   10   1:03  99.96% =
tr \v\v\v\v\v\v\v\v\v \v\v\v\v\v\v\v\v\v

Similarly control-L (Form Feed) characters resulted in:

33429 root        102    0  12764Ki    2420Ki CPU12   12   0:31  99.99% =
tr \f\f\f\f\f\f\f\f \f\f\f\f\f\f\f

Similarly DEL characters resulted in:

33456 root         93    0  12764Ki    2420Ki CPU15   15   0:12  99.99% =
tr =
\177\177\177\177\177\177\177\177\177\177\177\177\177\177\177\177\177\177\1=
77\177\177\177\177\177\177\177 \177\177\177\177\17

(truncating at the end of top display line).

QUOTE
    VIS_SAFE	 Only encode "unsafe" characters.  Unsafe means	control	=
char-
		 acters	which may cause	common terminals to perform =
unexpected
		 functions.  Currently this form allows	space, tab, =
newline,
		 backspace, bell, and return --	in addition to all =
graphic
		 characters -- unencoded.
END QUOTE

If VIS_SAFE is avoided, it might be that space would end up encoded =
instead
of unencoded. that would not be good.

But it does seem that encoding or replacing tab, newline, backspace, =
bell,
and return would be more appropriate than what now happens.


=3D=3D=3D
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?A6DA92D1-C98B-416F-B1FA-315AF943426D>