Date:      Fri, 3 Nov 2023 07:23:02 -0600
From:      Warner Losh <imp@bsdimp.com>
To:        garyj@gmx.de
Cc:        Christos Margiolis <christos@freebsd.org>,  "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>, bojan.novkovic@fer.hr,  Warner Losh <imp@freebsd.org>
Subject:   Re: HEADS UP: IUTF8 to be enabled by default
Message-ID:  <CANCZdfpxxyeqDWQwc133HcATRasFJRp-c7N=OotMs4DB3QdkFQ@mail.gmail.com>
In-Reply-To: <20231103081529.016be29d@ernst.home>
References:  <lrxccjlnihx5pke4hrufgebxmgrrbmlbd246o55phhzyhqlhfp@yxvipsuagrdc> <20231103081529.016be29d@ernst.home>


On Fri, Nov 3, 2023, 1:15 AM Gary Jennejohn <garyj@gmx.de> wrote:

> On Thu, 2 Nov 2023 21:43:32 +0200
> Christos Margiolis <christos@freebsd.org> wrote:
>
> > Hello again and sorry for the poorly worded previous email,
> >
> > To give a bit more context, during EuroBSDCon 2023, Bojan Novković and I
> > started working on a patch to fix backspacing of UTF-8
> > characters in the tty driver. If you typed a multi-byte UTF-8 character
> > and then backspaced it, the driver would delete only 1 byte of the
> > character instead of all its bytes, leaving garbage in the buffer
> > because the character wasn't fully deleted. To test this, run cat(1),
> > type a UTF-8 character (e.g. é, è, à, non-Latin characters, etc.),
> > press backspace only once, and look at the
> > output:
> >
> > $ cat
> > ??<backspace>
> > ??
> >
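
For readers who want to see the byte-level idea, here is a minimal sketch of erasing one whole UTF-8 character from the end of a buffer. It is not the code from the tty(4) patch; utf8_erase_last() is a hypothetical name, and the 4-byte cap for malformed input is an assumption made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the new length of buf after erasing its last UTF-8 character.
 * Hypothetical helper for illustration, not the function in the tty driver.
 */
size_t
utf8_erase_last(const unsigned char *buf, size_t len)
{
	size_t n = len;
	size_t removed = 0;

	assert(buf != NULL || len == 0);

	/* A UTF-8 character is at most 4 bytes, so never remove more. */
	while (n > 0 && removed < 4) {
		unsigned char c = buf[--n];

		removed++;
		/*
		 * Continuation bytes look like 10xxxxxx. Anything else is
		 * an ASCII byte or a lead byte, i.e. the character start.
		 */
		if ((c & 0xC0) != 0x80)
			break;
	}
	return (n);
}
```

A byte-wise erase would stop after the first iteration, which is what leaves the lead byte behind as garbage.
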
> > Bojan then implemented a new IUTF8 flag for stty [1], which enables
> > proper handling for UTF-8 backspacing in the tty driver [2].
> >
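
As a rough userland illustration of what toggling the flag amounts to, the sketch below sets IUTF8 in a terminal's input flags through termios. The function names are hypothetical, and the #ifdef assumes that some platforms' <termios.h> may not define IUTF8.

```c
#include <assert.h>
#include <termios.h>
#include <unistd.h>

/*
 * Set IUTF8 in a termios structure.
 * Returns 1 on success, 0 if this platform does not define the flag.
 */
int
termios_set_iutf8(struct termios *t)
{
	assert(t != NULL);
#ifdef IUTF8
	t->c_iflag |= IUTF8;
	return (1);
#else
	return (0);
#endif
}

/*
 * Enable IUTF8 on the terminal open on fd.
 * Returns 0 on success, -1 if fd is not a terminal or the flag is missing.
 */
int
fd_enable_iutf8(int fd)
{
	struct termios t;

	if (tcgetattr(fd, &t) == -1)
		return (-1);
	if (!termios_set_iutf8(&t))
		return (-1);
	return (tcsetattr(fd, TCSANOW, &t));
}
```

Calling fd_enable_iutf8(STDIN_FILENO) before reading input is the per-terminal opt-in; the question in this thread is whether the driver should start with the flag already set.
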
> > In the Phabricator review of the tty(4) patch [3], I proposed the idea
> > of having the IUTF8 flag enabled by default. imp@ mentioned that since
> > the default locale is UTF-8, having the flag set by default shouldn't be
> > a problem.
> >
> > Two possible solutions I have thought of:
> >
> > 1. Add IUTF8 to TTYDEF_IFLAG in sys/sys/ttydefaults.h.
> > 2. Add a check in tty_init_termios() whether the current locale is
> >    UTF-8 (how?), and enable it there.
> >
>
> Use getenv("LANG") and check whether UTF-8 is part of the string?
>
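
For reference, the check suggested above would look roughly like this in a userland program. It is only a sketch of the idea: lang_is_utf8() and env_is_utf8() are hypothetical names, and, as the reply that follows points out, the tty driver cannot do this because it has no access to a process's environment.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return 1 if the given LANG value names a UTF-8 locale, 0 otherwise.
 * This is the literal substring match from the suggestion, so other
 * spellings such as "utf8" are not recognized.
 */
int
lang_is_utf8(const char *lang)
{
	return (lang != NULL && strstr(lang, "UTF-8") != NULL);
}

/* Apply the check to the calling process's own environment. */
int
env_is_utf8(void)
{
	return (lang_is_utf8(getenv("LANG")));
}
```
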

That string is set too late to determine the default. Also, drivers don't
have access to process data such as environment variables.

Warner

> My LANG is set to C.UTF-8, for example.
>
> > What do you think? Could this change cause any side-effects we haven't
> > thought about?
> >
> > Christos
> >
> > [1] https://cgit.freebsd.org/src/commit/?id=128f63cedc14ae21b35f74e11e2fe1a5659c58e8
> > [2] https://cgit.freebsd.org/src/commit/?id=9e589b0938579f3f4d89fa5c051f845bf754184d
> > [3] https://reviews.freebsd.org/D42067
> >
>
>
> --
> Gary Jennejohn
>



