Date: Thu, 24 Mar 2022 09:31:33 -0600
From: Warner Losh <imp@bsdimp.com>
To: "Rodney W. Grimes" <freebsd-rwg@gndrsh.dnsmgr.net>
Cc: Phil Shafer <phil@freebsd.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>,
    "Simon J. Gerraty" <sjg@freebsd.org>
Subject: Re: What's the locale for system files (e.g. /etc/fstab)?
Message-ID: <CANCZdfp1oJdC2HfU63U_3y4y%2BQE0TswdVSg%2Big4uS3RJC3yK3w@mail.gmail.com>
In-Reply-To: <202203241519.22OFJ3Mk098649@gndrsh.dnsmgr.net>
References: <70B211BB-15BA-47A4-8F9C-C833AA8C1EAA@freebsd.org>
    <202203241519.22OFJ3Mk098649@gndrsh.dnsmgr.net>

On Thu, Mar 24, 2022, 9:20 AM Rodney W. Grimes <freebsd-rwg@gndrsh.dnsmgr.net> wrote:

> > On 23 Mar 2022, at 11:51, Piotr Pawel Stefaniak wrote:
> > > mount: make libxo support more locale-aware
> > >
> > >    "special", "node", and "mounter" are not guaranteed to be encoded
> > >    with UTF-8. Use the appropriate modifier.
> > >
> > > -       xo_emit("{:special}{L: on }{:node}{L: (}{:fstype}",
> > >             sfp->f_mntfromname,
> > > +       xo_emit("{:special/%hs}{L: on }{:node/%hs}{L: (}{:fstype}",
> > >             sfp->f_mntfromname,
> > >             sfp->f_mntonname, sfp->f_fstypename);
> >
> > This recent "mount" patch highlights a libxo-related problem for which I
> > don't have a solution:
> >
> > There are several files for which the encoding is not known.  Since
> > locale is user specific, we don't know how to interpret the contents of
> > /etc/fstab.  It's assumably been encoded with the format of the user who
> > wrote it, but that information is lost.
>
> Since you say "locale is user specific" it makes me want to say that
> this should come from the environment set by "default:" in /etc/login.conf,
> no need for a new file or anything special.

Config files, like fstab, have no locale, and parsing them with a locale
leads to errors, even when the user or the system has a nondefault locale.

> > Put more generally, there's not a system-wide place which declares the
> > encoding for system files, which leads to this problem where we
> > interpret files from one user's locale using another user's locale.
>
> Well /etc/login.conf *IS* a system wide declaration of this type of
> stuff, both lang= and charset= are declared there.

Since system-wide files like these are always parsed without a locale, this
information is correct, but I'm not sure how it applies.

It is always C.UTF-8. Anything else may, or may not, work based on accidents
of coincident encoding. Not everything can change locales, and the fstab and
other parsing routines in libc assume C.UTF-8 or even just the ascii-7/8
subset.

> > One solution would be a symlink in /etc that "points to" the name of the
> > current system-wide locale name.
> >
> > % ls -Fl /etc/locale
> > lrwxr-xr-x  1 root  wheel  7 Mar 23 15:42 /etc/locale@ -> C.UTF-8
>
> grep lang /etc/login.conf:
>         :lang=C.UTF-8:
>         :lang=ru_RU.UTF-8:\
>
> Probably what you want?

You can get this with the locale routines, no? No need for grep.

Warner

> > (Or "/etc/system.locale" ?)
> >
> > If the symlink doesn't exist, would "C.UTF-8" be a suitable default
> > moving forwards?  It certainly would not be backwards compatible, since
> > an existing fstab could have non-UTF-8 strings in it, encoded with the
> > locale of the user who touched the file.  But there's really no
> > backwards compatible solution, given that there's no guarantee that (for
> > any specific FreeBSD system) all system files were written with the same
> > locale.  Fun, eh? ;^)
> >
> > Opinions, thoughts, please?
> >
> > Thanks,
> >  Phil
>
> --
> Rod Grimes                                                 rgrimes@freebsd.org
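
To make the "accidents of coincident encoding" point concrete, here is a minimal sketch of a check for whether a string taken from a file such as /etc/fstab is well-formed UTF-8. The function name is_valid_utf8() is hypothetical; it is not part of libc, libxo, or mount(8), and nothing in this thread says the tree does it this way. Pure 7-bit ASCII always passes, which is why ASCII-only files behave the same under any of the encodings discussed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: return true when the
 * NUL-terminated byte string s is well-formed UTF-8.  Overlong forms,
 * UTF-16 surrogates, and values above U+10FFFF are rejected.
 */
bool
is_valid_utf8(const char *s)
{
	const unsigned char *p = (const unsigned char *)s;

	assert(s != NULL);
	while (*p != '\0') {
		unsigned int cp;	/* decoded code point */
		unsigned int min;	/* smallest value allowed for this length */
		int extra;		/* continuation bytes still expected */

		if (*p < 0x80) {		/* plain ASCII byte */
			p++;
			continue;
		} else if ((*p & 0xE0) == 0xC0) {
			cp = *p & 0x1F; extra = 1; min = 0x80;
		} else if ((*p & 0xF0) == 0xE0) {
			cp = *p & 0x0F; extra = 2; min = 0x800;
		} else if ((*p & 0xF8) == 0xF0) {
			cp = *p & 0x07; extra = 3; min = 0x10000;
		} else {
			return (false);	/* stray continuation or bad lead byte */
		}
		p++;
		while (extra-- > 0) {
			/* Also catches a NUL that ends the sequence early. */
			if ((*p & 0xC0) != 0x80)
				return (false);
			cp = (cp << 6) | (*p & 0x3F);
			p++;
		}
		if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
			return (false);
	}
	return (true);
}
```

A caller could run this over sfp->f_mntfromname and sfp->f_mntonname before deciding how to emit them. A false result only says the bytes are not UTF-8; it does not reveal which encoding the author of the file actually used, which is the information that is lost.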