Date: Fri, 10 Jul 2020 15:25:06 -0700 From: Mark Millard <marklmi@yahoo.com> To: Yuri Pankov <yuripv@yuripv.dev> Cc: Steve Wills <swills@FreeBSD.org>, "daichi@freebsd.org" <daichi@FreeBSD.org>, FreeBSD Current <freebsd-current@freebsd.org>, Hiroki Sato <hrs@FreeBSD.org> Subject: Re: svn commit: r352558 - head/usr.bin/top Message-ID: <BE3A2B48-D593-4733-8EAC-4C70F3F0B9B4@yahoo.com> In-Reply-To: <f8c8e434-39d7-4c7b-d33d-ef8a6b196eb9@yuripv.dev> References: <1BDFB387-930D-4F4D-8729-A5850F1C15B9.ref@yahoo.com> <1BDFB387-930D-4F4D-8729-A5850F1C15B9@yahoo.com> <61107ecc-6f9b-a4db-7b1e-ec75f73939ee@FreeBSD.org> <f8c8e434-39d7-4c7b-d33d-ef8a6b196eb9@yuripv.dev>
next in thread | previous in thread | raw e-mail | index | archive | help
On 2020-Jul-10, at 11:05, Yuri Pankov <yuripv at yuripv.dev> wrote: > Steve Wills wrote: >> On 11/28/19 4:08 PM, Mark Millard via svn-src-head wrote: >>>> Author: daichi >>>> Date: Fri Sep 20 17:37:23 2019 >>>> New Revision: 352558 >>>> URL: >>>> https://svnweb.freebsd.org/changeset/base/352558 >>>>=20 >>>>=20 >>>> Log: >>>> top(1): support multibyte characters in command names (ARGV = array) >>>> depending on locale. >>>> - add setlocale() >>>> - remove printable() function >>>> - add VIS_OCTAL and VIS_SAFE to the flag of strvisx() to = display >>>> non-printable characters that do not use C-style backslash = sequences >>>> in three digit octal sequence, or remove it >>>> This change allows multibyte characters to be displayed = according to >>>> locale. If it is recognized as a non-display character according = to the >>>> locale, it is displayed in three digit octal sequence. >>>>=20 >>>=20 >>> Initially picking on tab characters as an example of what is >>> probably a somewhat broader issue . . . >>>=20 >>> Ever since this change, characters like tabs that do not fit >>> in the next character cell when output, but for which they >>> are !isprintable(...), now mess up the top display. Again >>> using tab as an example: line wrapping from the text having >>> been shifted over by more than one character cell. top does >>> not track the line wrapping result in how it decides what >>> to output for the following display updates. >>>=20 >> Apologies for the way late reply here, but I just now bothered = tracking this down. This commit seems to be the cause of some corruption = I'm seeing in long running top(1) as well. As Mark mentions, if I use = "hh" it clears up. Should I open a bugzilla bug? I can share screenshots = of the corruption, such as: >> https://i.imgur.com/Xqlwf9h.png >> https://i.imgur.com/Jv0d5NU.png >=20 > Does removing VIS_SAFE fixes the issue for you? >=20 > As for original Mark's report (which I missed), removing isprintable() = doesn't look wrong as vis(3) should take of its functionality (and in = multibyte-aware way). vis (as used) and the old isprintable logic are not equivalent when multi-byte is not needed/involved. Otherwise I'd not have had anything to ever report. If vis can do what is needed, more work needed to be done when the change was made in order to avoid msesed up displays in single-byte contexts. > Also, is there an easy way to reproduce this? The following sort of command (the empty space inside quoted text are tab characters): # tr '0\n 1\n 2\n 3\n 4\n 5\n 6\n 7\n = 8\n' '\t0 \t1 \t2 \t3 \t4 \t5 \t6 \t7 = \t8' < /dev/zero > /dev/null causes my 200 character wide window running top to show: 32920 root 100 0 12764Ki 2420Ki CPU3 3 2:22 99.87% = tr 0\\n 1\\n 2\\n 3\\n 4\\n 5\\n 6\\n 7\\n 8\\n = \\t0 \\t1 \\t2 \\t3 \\t4 \\t5 \\t6 \\t733 \\t8 = 20 7172 5448Ki CPU23 23 0:00 0.04% top -HiSCazopid But that does not show where the lines wrap at the edges of the window, so breaking it up explicitly after the first "\" in \\7: 32920 root 100 0 12764Ki 2420Ki CPU3 3 2:22 99.87% = tr 0\\n 1\\n 2\\n 3\\n 4\\n 5\\n 6\\n 7\\n 8\\n = \\t0 \\t1 \\t2 \\t3 \\t4 \\t5 \\t6 \ \t733 \\t8 20 7172 5448Ki CPU23 23 0:00 0.04% = top -HiSCazopid Note how \n turned into \\n , taking an extra character for each \n . Similarly for \t vs. \\t . (Other examples do similarly.) The tab characters really do use more than one character cell on the display (sometimes). The text from the tr command ends up spread across 2 lines as things look like in the window where top is running. I ran top in another ssh session first and then the tr command. Before running the tr command, top showed as: 33019 root 20 0 17172Ki 5448Ki CPU24 24 0:00 0.05% = top -HiSCazopid If you do not end up with top listed just after tr in top's output, then it will not be top's line that ends up partially overwritten. If you have wider windows, you may need more text in the tr quoted strings. In another experiment I inserted a large number of backspace characters (control-H's) at the front of the first quoted string in the tr command. The top output displayed: 0\\n5 ro1\\n 2\\93 3\\n12764\\n 25\\ni CP6\\n 97\\n:12 = 100.00\t0r \nHiS\\t1pid \\t2 \\t3 \\t4 \\t5 \\t6 \\t 33094 root 20 0 17172Ki 5488Ki CPU21 21 0:00 0.06% = top -HiSCazopid In other words, backspace moved the cursor position back over prior fields on the line and then the later line content overwrote those fields instead of being after "tr" someplace (or truncated off). Note that part of "-HiSCazopid" shows up on both lines. The extra is from when top was running but tr had not started yet. top is not managing text replacement correctly for output characters that end up not being just "in" the next character-cell on the terminal. The same sort of result happens when instead adding just one carriage return (control-M) in front of that first quuoted string instead: 0\\n8 ro1\\n 2\\92 3\\n12764\\n 25\\ni CP6\\n 117\\n:11 = 100.00\t0r \nHiS\\t1pid \\t2 \\t3 \\t4 \\t5 \\t6 \\t 33094 root 20 0 17172Ki 5488Ki CPU23 23 0:00 0.04% = top -HiSCazopid I do not intend to try to find all examples of characters that cause problems but used to not cause problems. =46rom what I've seen, cursor positioning escape character sequences seem to be sent through and cause overwrites at arbitrary places on screen, based on the escape sequence content. There are command lines around that contain such sequences. So I sometimes see the first few lines of top's output have garbage text from commands that were listed below at some point overwriting the top text. Part of what is going on is top avoiding rewriting characters that its tracking indicates have not been updated. When the actual display and that supposed-tracking mismatch, the display ends up wrong when updated (bad text continues to display). The text in commands should not make "top -a" output mess up the display of other lines in top's output, nor of other top output fields on the same line. In my view, if some usage contexts need otherwise, it should take an extra command line option to put top in a mode that might do such things. The default behavior should strictly avoid having such things happen. =3D=3D=3D Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar)
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?BE3A2B48-D593-4733-8EAC-4C70F3F0B9B4>