Date: Fri, 3 Nov 2023 07:23:02 -0600
From: Warner Losh <imp@bsdimp.com>
To: garyj@gmx.de
Cc: Christos Margiolis <christos@freebsd.org>, "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>, bojan.novkovic@fer.hr, Warner Losh <imp@freebsd.org>
Subject: Re: HEADS UP: IUTF8 to be enabled by default
Message-ID: <CANCZdfpxxyeqDWQwc133HcATRasFJRp-c7N=OotMs4DB3QdkFQ@mail.gmail.com>
In-Reply-To: <20231103081529.016be29d@ernst.home>
References: <lrxccjlnihx5pke4hrufgebxmgrrbmlbd246o55phhzyhqlhfp@yxvipsuagrdc> <20231103081529.016be29d@ernst.home>
On Fri, Nov 3, 2023, 1:15 AM Gary Jennejohn <garyj@gmx.de> wrote:

> On Thu, 2 Nov 2023 21:43:32 +0200
> Christos Margiolis <christos@freebsd.org> wrote:
>
> > Hello again and sorry for the poorly worded previous email,
> >
> > To give a bit more context, during EuroBSDCon 2023, me and Bojan
> > Novković started working on a patch to fix backspacing of UTF-8
> > characters in the tty driver. What was happening is if you typed a >1
> > byte UTF-8 character and then backspaced it, the driver would actually
> > delete only 1 byte from the character, instead of all its bytes, which
> > ended up leaving garbage in the buffer since the character wasn't fully
> > deleted. To test this, run cat(1), type a UTF-8 character (e.g é, è, à,
> > non-latin characters, etc), press backspace only once, and look at the
> > output:
> >
> > $ cat
> > ??<backspace>
> > ??
> >
> > Bojan then implemented a new IUTF8 flag for stty [1], which enables
> > proper handling for UTF-8 backspacing in the tty driver [2].
> >
> > In the Phabricator review of the tty(4) patch [3], I proposed the idea
> > of having the IUTF8 flag enabled by default. imp@ mentioned that since
> > the default locale is UTF-8, having the flag set by default shouldn't be
> > a problem.
> >
> > Two possible solutions I have thought of:
> >
> > 1. Add IUTF8 to TTYDEF_IFLAG in sys/sys/ttydefaults.h.
> > 2. Add a check in tty_init_termios() whether the current locale is
> >    UTF-8 (how?), and enable it there.
> >
>
> Use getenv("LANG") and check whether UTF-8 is part of the string?
>

This string is set too late for the default. Also, drivers don't have
access to process data.

Warner

> My LANG is set to C.UTF-8, for example.
>
> > What do you think? Could this change cause any side-effects we haven't
> > thought about?
> >
> > Christos
> >
> > [1] https://cgit.freebsd.org/src/commit/?id=128f63cedc14ae21b35f74e11e2fe1a5659c58e8
> > [2] https://cgit.freebsd.org/src/commit/?id=9e589b0938579f3f4d89fa5c051f845bf754184d
> > [3] https://reviews.freebsd.org/D42067
> >
>
> --
> Gary Jennejohn
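
To illustrate the backspacing problem described in the quoted message, here
is a minimal sketch in C of how an erase routine could decide how many bytes
one backspace should remove from a UTF-8 input buffer. It is not the code
from the commits referenced above; the function name utf8_erase_len and the
"fall back to one byte on malformed input" policy are assumptions made for
this example only.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the FreeBSD tty driver: return the
 * number of bytes a single backspace should remove from the end of
 * buf[0..len) when the buffer is expected to hold UTF-8 text.
 */
static size_t
utf8_erase_len(const unsigned char *buf, size_t len)
{
	size_t cont = 0;
	unsigned char lead;

	if (len == 0)
		return (0);

	/* Count trailing continuation bytes (10xxxxxx), at most three. */
	while (cont < 3 && cont + 1 < len &&
	    (buf[len - 1 - cont] & 0xC0) == 0x80)
		cont++;

	lead = buf[len - 1 - cont];

	/* Accept only a lead byte that announces exactly cont + 1 bytes. */
	if ((cont == 0 && lead < 0x80) ||
	    (cont == 1 && (lead & 0xE0) == 0xC0) ||
	    (cont == 2 && (lead & 0xF0) == 0xE0) ||
	    (cont == 3 && (lead & 0xF8) == 0xF0))
		return (cont + 1);

	/* Assumed policy: malformed or truncated sequence, erase one byte. */
	return (1);
}
```

With this sketch, a buffer ending in the two bytes of "é" (0xC3 0xA9) yields
2, so both bytes are removed together, while a buffer ending in plain ASCII
yields 1. A byte-at-a-time erase would remove only 0xA9 and leave the stray
lead byte 0xC3 behind, which matches the garbage described above.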