Date:      Thu, 24 Mar 2022 09:31:33 -0600
From:      Warner Losh <imp@bsdimp.com>
To:        "Rodney W. Grimes" <freebsd-rwg@gndrsh.dnsmgr.net>
Cc:        Phil Shafer <phil@freebsd.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>,  "Simon J. Gerraty" <sjg@freebsd.org>
Subject:   Re: What's the locale for system files (e.g. /etc/fstab)?
Message-ID:  <CANCZdfp1oJdC2HfU63U_3y4y%2BQE0TswdVSg%2Big4uS3RJC3yK3w@mail.gmail.com>
In-Reply-To: <202203241519.22OFJ3Mk098649@gndrsh.dnsmgr.net>
References:  <70B211BB-15BA-47A4-8F9C-C833AA8C1EAA@freebsd.org> <202203241519.22OFJ3Mk098649@gndrsh.dnsmgr.net>


On Thu, Mar 24, 2022, 9:20 AM Rodney W. Grimes <freebsd-rwg@gndrsh.dnsmgr.net> wrote:

> > On 23 Mar 2022, at 11:51, Piotr Pawel Stefaniak wrote:
> > > mount: make libxo support more locale-aware
> > >
> > >    "special", "node", and "mounter" are not guaranteed to be encoded with
> > >    UTF-8. Use the appropriate modifier.
> > >
> > > -       xo_emit("{:special}{L: on }{:node}{L: (}{:fstype}", sfp->f_mntfromname,
> > > +       xo_emit("{:special/%hs}{L: on }{:node/%hs}{L: (}{:fstype}", sfp->f_mntfromname,
> > >             sfp->f_mntonname, sfp->f_fstypename);
> >
> > This recent "mount" patch highlights a libxo-related problem for which I
> > don't have a solution:
> >
> > There are several files for which the encoding is not known.  Since
> > locale is user specific, we don't know how to interpret the contents of
> > /etc/fstab.  It's assumably been encoded with the format of the user who
> > wrote it, but that information is lost.
>
> Since you say "locale is user specific" it makes me want to say that
> this should come from the environment set by "default:" in /etc/login.conf,
> no need for a new file or anything special.
>

Config files, like fstab, have no locale and parsing them with a locale
leads to errors, even when the user or the system has a nondefault locale.

>
> > Put more generally, there's not a system-wide place which declares the
> > encoding for system files, which leads to this problem where we
> > interpret files from one user's locale using another user's locale.
>
> Well /etc/login.conf *IS* a system wide declaration of this type of
> stuff, both lang= and charset= are declared there.
>

Since system wide files like these are always parsed without a locale, this
information is correct, but I'm not sure how it applies.

It is always C.UTF-8. Anything else may or may not work, depending on
accidents of coincident encoding. Not everything can change locales, and
the fstab and other parsing routines in libc assume C.UTF-8 or even just
the 7/8-bit ASCII subset.
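
To make "works by accident of coincident encoding" concrete, here is a minimal
sketch, not anything taken from libc or mount(8). It classifies a byte string
as plain ASCII, well-formed UTF-8, or something else, without consulting the
process locale at all. The names classify_bytes and enc_class are hypothetical
and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type for this sketch. */
enum enc_class { ENC_ASCII, ENC_UTF8, ENC_OTHER };

/*
 * Classify a NUL-terminated byte string by inspecting raw bytes only.
 * No setlocale() or multibyte routines are involved, so the answer is
 * the same whatever LANG or LC_ALL happen to be.
 */
enum enc_class
classify_bytes(const char *s)
{
	const unsigned char *p;
	enum enc_class result = ENC_ASCII;

	assert(s != NULL);
	p = (const unsigned char *)s;

	while (*p != '\0') {
		unsigned long cp, min;
		size_t i, need;

		if (*p < 0x80) {		/* 7-bit ASCII byte */
			p++;
			continue;
		}
		if ((*p & 0xE0) == 0xC0) {	/* 2-byte sequence */
			need = 1; cp = *p & 0x1F; min = 0x80;
		} else if ((*p & 0xF0) == 0xE0) { /* 3-byte sequence */
			need = 2; cp = *p & 0x0F; min = 0x800;
		} else if ((*p & 0xF8) == 0xF0) { /* 4-byte sequence */
			need = 3; cp = *p & 0x07; min = 0x10000;
		} else {
			return (ENC_OTHER);	/* stray continuation or bad lead */
		}
		p++;
		for (i = 0; i < need; i++) {
			/* A NUL here fails the check, so we never overrun. */
			if ((*p & 0xC0) != 0x80)
				return (ENC_OTHER);
			cp = (cp << 6) | (*p & 0x3F);
			p++;
		}
		/* Reject overlong forms, surrogates, and out-of-range values. */
		if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
			return (ENC_OTHER);
		result = ENC_UTF8;
	}
	return (result);
}
```

A mount point written as ISO 8859-1 bytes comes back as ENC_OTHER, while a pure
ASCII path is ENC_ASCII under every encoding discussed in this thread, which is
why such files usually work whatever the writer's locale was.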

>
> > One solution would be a symlink in /etc that "points to" the name of the
> > current system-wide locale name.
> >
> > % ls -Fl /etc/locale
> > lrwxr-xr-x  1 root  wheel  7 Mar 23 15:42 /etc/locale@ -> C.UTF-8
>
> grep lang /etc/login.conf:
>         :lang=C.UTF-8:
>         :lang=ru_RU.UTF-8:\
>
> Probably what you want?
>

You can get this with the locale routines, no? No need for grep.
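
A rough sketch of what I mean, using only the standard setlocale(3) and
nl_langinfo(3) calls. The helper name codeset_for_locale is hypothetical. It
assumes the login class's lang setting has already reached the process as
LANG/LC_* in the environment, and it only reports the process locale's codeset.
It says nothing about how any particular file was encoded.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/*
 * Select the LC_CTYPE locale called "name" and return its codeset name.
 * Pass "" to use whatever the environment (LANG, LC_ALL, LC_CTYPE) says.
 * Returns NULL if the locale cannot be selected.
 *
 * Note: this changes the process-wide LC_CTYPE setting, and the returned
 * string may be overwritten by later setlocale()/nl_langinfo() calls.
 */
const char *
codeset_for_locale(const char *name)
{
	assert(name != NULL);

	if (setlocale(LC_CTYPE, name) == NULL)
		return (NULL);
	return (nl_langinfo(CODESET));
}
```

Calling codeset_for_locale("") gives the codeset the current session is
using. The exact string returned for the "C" locale differs between
systems, so compare against it with care.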

Warner

>
> > (Or "/etc/system.locale" ?)
> >
> > If the symlink doesn't exist, would "C.UTF-8" be a suitable default
> > moving forwards?  It certainly would not be backwards compatible, since
> > an existing fstab could have non-UTF-8 strings in it, encoded with the
> > locale of the user who touched the file.  But there's really no
> > backwards compatible solution, given that there's no guarantee that (for
> > any specific FreeBSD system) all system files were written with the same
> > locale.  Fun, eh? ;^)
> >
> > Opinions, thoughts, please?
> >
> > Thanks,
> >   Phil
> >
> >
>
> --
> Rod Grimes
> rgrimes@freebsd.org
>
>
