Date: Sat, 11 Jul 2020 03:48:10 +0300 From: Yuri Pankov <yuripv@yuripv.dev> To: Mark Millard <marklmi@yahoo.com> Cc: Steve Wills <swills@FreeBSD.org>, "daichi@freebsd.org" <daichi@FreeBSD.org>, FreeBSD Current <freebsd-current@freebsd.org>, Hiroki Sato <hrs@FreeBSD.org> Subject: Re: svn commit: r352558 - head/usr.bin/top Message-ID: <a0b9b38b-d9d1-62d8-e708-3d29ba524dc5@yuripv.dev> In-Reply-To: <CEE74E37-1C7F-4CA5-B12C-5BFB5E77027D@yahoo.com> References: <1BDFB387-930D-4F4D-8729-A5850F1C15B9.ref@yahoo.com> <1BDFB387-930D-4F4D-8729-A5850F1C15B9@yahoo.com> <61107ecc-6f9b-a4db-7b1e-ec75f73939ee@FreeBSD.org> <f8c8e434-39d7-4c7b-d33d-ef8a6b196eb9@yuripv.dev> <BE3A2B48-D593-4733-8EAC-4C70F3F0B9B4@yahoo.com> <d6b7193c-0cfe-a0c2-be94-e261b59b2dd1@yuripv.dev> <CEE74E37-1C7F-4CA5-B12C-5BFB5E77027D@yahoo.com>
next in thread | previous in thread | raw e-mail | index | archive | help
Mark Millard wrote: > > > On 2020-Jul-10, at 16:12, Yuri Pankov <yuripv at yuripv.dev> wrote: > >> Mark Millard wrote: >>> On 2020-Jul-10, at 11:05, Yuri Pankov <yuripv at yuripv.dev> wrote: >>>> Steve Wills wrote: >>>>> On 11/28/19 4:08 PM, Mark Millard via svn-src-head wrote: >>>>>>> Author: daichi >>>>>>> Date: Fri Sep 20 17:37:23 2019 >>>>>>> New Revision: 352558 >>>>>>> URL: >>>>>>> https://svnweb.freebsd.org/changeset/base/352558 >>>>>>> >>>>>>> >>>>>>> Log: >>>>>>> top(1): support multibyte characters in command names (ARGV array) >>>>>>> depending on locale. >>>>>>> - add setlocale() >>>>>>> - remove printable() function >>>>>>> - add VIS_OCTAL and VIS_SAFE to the flag of strvisx() to display >>>>>>> non-printable characters that do not use C-style backslash sequences >>>>>>> in three digit octal sequence, or remove it >>>>>>> This change allows multibyte characters to be displayed according to >>>>>>> locale. If it is recognized as a non-display character according to the >>>>>>> locale, it is displayed in three digit octal sequence. >>>>>>> >>>>>> >>>>>> Initially picking on tab characters as an example of what is >>>>>> probably a somewhat broader issue . . . >>>>>> >>>>>> Ever since this change, characters like tabs that do not fit >>>>>> in the next character cell when output, but for which they >>>>>> are !isprintable(...), now mess up the top display. Again >>>>>> using tab as an example: line wrapping from the text having >>>>>> been shifted over by more than one character cell. top does >>>>>> not track the line wrapping result in how it decides what >>>>>> to output for the following display updates. >>>>>> >>>>> Apologies for the way late reply here, but I just now bothered tracking this down. This commit seems to be the cause of some corruption I'm seeing in long running top(1) as well. As Mark mentions, if I use "hh" it clears up. Should I open a bugzilla bug? I can share screenshots of the corruption, such as: >>>>> https://i.imgur.com/Xqlwf9h.png >>>>> https://i.imgur.com/Jv0d5NU.png >>>> >>>> Does removing VIS_SAFE fixes the issue for you? >>>> >>>> As for original Mark's report (which I missed), removing isprintable() doesn't look wrong as vis(3) should take of its functionality (and in multibyte-aware way). >>> vis (as used) and the old isprintable logic are not >>> equivalent when multi-byte is not needed/involved. >>> Otherwise I'd not have had anything to ever report. >>> If vis can do what is needed, more work needed to >>> be done when the change was made in order to avoid >>> msesed up displays in single-byte contexts. >>>> Also, is there an easy way to reproduce this? >>> The following sort of command (the empty space inside quoted >>> text are tab characters): >>> # tr '0\n 1\n 2\n 3\n 4\n 5\n 6\n 7\n 8\n' '\t0 \t1 \t2 \t3 \t4 \t5 \t6 \t7 \t8' < /dev/zero > /dev/null >>> causes my 200 character wide window running top to show: >>> 32920 root 100 0 12764Ki 2420Ki CPU3 3 2:22 99.87% tr 0\\n 1\\n 2\\n 3\\n 4\\n 5\\n 6\\n 7\\n 8\\n \\t0 \\t1 \\t2 \\t3 \\t4 \\t5 \\t6 \\t733 \\t8 20 7172 5448Ki CPU23 23 0:00 0.04% top -HiSCazopid >>> But that does not show where the lines wrap at the edges of the window, >>> so breaking it up explicitly after the first "\" in \\7: >>> 32920 root 100 0 12764Ki 2420Ki CPU3 3 2:22 99.87% tr 0\\n 1\\n 2\\n 3\\n 4\\n 5\\n 6\\n 7\\n 8\\n \\t0 \\t1 \\t2 \\t3 \\t4 \\t5 \\t6 \ >>> \t733 \\t8 20 7172 5448Ki CPU23 23 0:00 0.04% top -HiSCazopid >>> Note how \n turned into \\n , taking an extra character for >>> each \n . Similarly for \t vs. \\t . (Other examples do >>> similarly.) >>> The tab characters really do use more than one character cell >>> on the display (sometimes). >>> The text from the tr command ends up spread across 2 lines >>> as things look like in the window where top is running. >>> I ran top in another ssh session first and then the tr command. >>> Before running the tr command, top showed as: >>> 33019 root 20 0 17172Ki 5448Ki CPU24 24 0:00 0.05% top -HiSCazopid >>> If you do not end up with top listed just after tr in top's output, >>> then it will not be top's line that ends up partially overwritten. >>> If you have wider windows, you may need more text in the tr quoted >>> strings. >>> In another experiment I inserted a large number of backspace characters >>> (control-H's) at the front of the first quoted string in the tr command. >>> The top output displayed: >>> 0\\n5 ro1\\n 2\\93 3\\n12764\\n 25\\ni CP6\\n 97\\n:12 100.00\t0r \nHiS\\t1pid \\t2 \\t3 \\t4 \\t5 \\t6 \\t >>> 33094 root 20 0 17172Ki 5488Ki CPU21 21 0:00 0.06% top -HiSCazopid >>> In other words, backspace moved the cursor position back over prior >>> fields on the line and then the later line content overwrote those >>> fields instead of being after "tr" someplace (or truncated off). >>> Note that part of "-HiSCazopid" shows up on both lines. The extra >>> is from when top was running but tr had not started yet. top is >>> not managing text replacement correctly for output characters that >>> end up not being just "in" the next character-cell on the terminal. >>> The same sort of result happens when instead adding just one >>> carriage return (control-M) in front of that first quuoted >>> string instead: >>> 0\\n8 ro1\\n 2\\92 3\\n12764\\n 25\\ni CP6\\n 117\\n:11 100.00\t0r \nHiS\\t1pid \\t2 \\t3 \\t4 \\t5 \\t6 \\t >>> 33094 root 20 0 17172Ki 5488Ki CPU23 23 0:00 0.04% top -HiSCazopid >>> I do not intend to try to find all examples of characters that >>> cause problems but used to not cause problems. >>> From what I've seen, cursor positioning escape character sequences >>> seem to be sent through and cause overwrites at arbitrary places >>> on screen, based on the escape sequence content. There are command >>> lines around that contain such sequences. So I sometimes see the >>> first few lines of top's output have garbage text from commands >>> that were listed below at some point overwriting the top text. >>> Part of what is going on is top avoiding rewriting characters >>> that its tracking indicates have not been updated. When the >>> actual display and that supposed-tracking mismatch, the >>> display ends up wrong when updated (bad text continues to >>> display). >>> The text in commands should not make "top -a" output mess up >>> the display of other lines in top's output, nor of other >>> top output fields on the same line. In my view, if some usage >>> contexts need otherwise, it should take an extra command line >>> option to put top in a mode that might do such things. The >>> default behavior should strictly avoid having such things >>> happen. >> >> Thanks. >> >> The attached diff seems to take care of the issue for me, adding VIS_TAB and removing VIS_SAFE, which can be blamed for passing through the following: >> >> VIS_SAFE Currently this form allows space, tab, newline, backspace, >> bell, and return — in addition to all graphic characters — >> unencoded. >> <top.txt> > > A quick test suggests agreement. We will see how it > looks for on-going use. > > But I'll note that top's man page should document the > translations that are being used: it is not the same > text that top produced before -r352558 and one should > be able to read the man page to find out how to > interpret what top reports for the likes of top -a . I think this was taken care of in r352568, and what it says now is correct: Non-printable characters in the command line are encoded in C-style backslash sequences or a three digit octal sequences. > (It does not appear that escape sequences or vertical > tab would have gone through unencoded. So I'm still > unclear how I ever had the top few lines of top's > output messed up by command text. So it is also > unclear that this change would make a difference > for such. We will see over time if that text is > ever messed up.)
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?a0b9b38b-d9d1-62d8-e708-3d29ba524dc5>