Date:      Sun, 29 Jan 2023 10:52:08 +0300
From:      Mehmet Erol Sanliturk <m.e.sanliturk@gmail.com>
To:        Hans Petter Selasky <hps@selasky.org>
Cc:        Yuri <yuri@aetern.org>, current@freebsd.org
Subject:   Re: vt and keyboard accents
Message-ID:  <CAOgwaMv1GdXxsc_8n6NbtnnXe39w0V6JqZR4ro_pNtLttJKY2w@mail.gmail.com>
In-Reply-To: <c316280c-8dd8-b969-e623-9fcadab04dd1@selasky.org>
References:  <70f53d17-46eb-c299-1a93-bf28858c1685@aetern.org> <c316280c-8dd8-b969-e623-9fcadab04dd1@selasky.org>


On Sun, Jan 29, 2023 at 10:16 AM Hans Petter Selasky <hps@selasky.org>
wrote:

> On 1/29/23 01:54, Yuri wrote:
> > Looking into an issue with accent input for vt and the cz keyboard
> > (/usr/share/vt/keymaps/cz.kbd), where some of the accents work and
> > others produce weird, unrelated characters.
> >
> > Checking kbdcontrol -d output, there is an obvious difference with
> > keymap contents -- all mappings are trimmed down to 1 byte after reading:
> >
> > kbdcontrol:
> >    dacu  180  ( 180 180 ) ( 'S' 'Z' ) ( 'Z' 'y' ) ( 's' '[' )
> >               ( 'z' 'z' ) ( 'R' 'T' ) ( 'A' 193 ) ( 'L' '9' )
> >               ( 'C' 006 ) ( 'E' 201 ) ( 'I' 205 ) ( 'N' 'C' )
> >               ( 'O' 211 ) ( 'U' 218 ) ( 'Y' 221 ) ( 'r' 'U' )
> >               ( 'a' 225 ) ( 'l' ':' ) ( 'c' 007 ) ( 'e' 233 )
> >               ( 'i' 237 ) ( 'n' 'D' ) ( 'o' 243 ) ( 'u' 250 )
> >               ( 'y' 253 )
> >
> > keymap:
> >    dacu 0xb4    ( 0xb4   0xb4    ) ( 'S'    0x015a  ) ( 'Z'    0x0179  ) ( 's'    0x015b  )
> >                 ( 'z'    0x017a  ) ( 'R'    0x0154  ) ( 'A'    0xc1    ) ( 'L'    0x0139  )
> >                 ( 'C'    0x0106  ) ( 'E'    0xc9    ) ( 'I'    0xcd    ) ( 'N'    0x0143  )
> >                 ( 'O'    0xd3    ) ( 'U'    0xda    ) ( 'Y'    0xdd    ) ( 'r'    0x0155  )
> >                 ( 'a'    0xe1    ) ( 'l'    0x013a  ) ( 'c'    0x0107  ) ( 'e'    0xe9    )
> >                 ( 'i'    0xed    ) ( 'n'    0x0144  ) ( 'o'    0xf3    ) ( 'u'    0xfa    )
> >                 ( 'y'    0xfd    )
> >
> > Source of the problem is the following definition in sys/sys/kbio.h:
> >
> > struct acc_t {
> >          u_char          accchar;
> >          u_char          map[NUM_ACCENTCHARS][2];
> > };
> >
> > While the keymaps were converted to have the unicode characters for vt
> > in the commit below, the array to store them (map) was missed, or was
> > there a reason for this?
> >
> > ---
> > commit 7ba08f814546ece02e0193edc12cf6eb4d5cb8d4
> > Author: Stefan Eßer <se@FreeBSD.org>
> > Date:   Sun Aug 17 19:54:21 2014 +0000
> >
> >      Attempt at converting the SYSCONS keymaps to Unicode for use with NEWCONS.
> >      I have spent many hours comparing source and destination formats, and hope
> >      to have caught the most severe conversion errors.
> > ---
> >
> > I have tried the following patch and it allows me to enter all accents
> > documented in the keymap, though I must admit I'm not sure it does not
> > have hidden issues:
> >
> > diff --git a/sys/sys/kbio.h b/sys/sys/kbio.h
> > index 7f17bda76c5..fffeb63e226 100644
> > --- a/sys/sys/kbio.h
> > +++ b/sys/sys/kbio.h
> > @@ -200,7 +200,7 @@ typedef struct okeymap okeymap_t;
> >
> >   struct acc_t {
> >          u_char          accchar;
> > -       u_char          map[NUM_ACCENTCHARS][2];
> > +       int             map[NUM_ACCENTCHARS][2];
> >   };
> >
>
> Hi,
>
> Using "int" for unicode characters is probably good for now. Your patch
> looks good, but please also consider the "umlaut" case while at it
> (multiple characters that become one)!
>
> --HPS
>
>


I am not an expert on Unicode.

When character sets are considered, a single consistent definition for all of
the FreeBSD system would be more useful.
There are three main Unicode encodings: UTF-8, UTF-16, and UTF-32, where the
numbers are the bit sizes of the code units (a character may need more than
one code unit in UTF-8 and UTF-16).


Some pages about Unicode, each with many linked sub-pages:

https://en.wikipedia.org/wiki/Category:Unicode
Category:Unicode

https://en.wikipedia.org/wiki/Unicode
Unicode

https://en.wikipedia.org/wiki/Comparison_of_Unicode_encodings
Comparison of Unicode encodings

https://en.wikipedia.org/wiki/List_of_binary_codes
List of binary codes

https://en.wikipedia.org/wiki/List_of_information_system_character_sets
List of information system character sets

and other related pages ...


With my best wishes to all.


Mehmet Erol Sanliturk




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAOgwaMv1GdXxsc_8n6NbtnnXe39w0V6JqZR4ro_pNtLttJKY2w>