Date: Thu, 24 Mar 2022 13:12:10 -0600
From: Warner Losh <imp@bsdimp.com>
To: "Simon J. Gerraty" <sjg@juniper.net>
Cc: "Rodney W. Grimes" <freebsd-rwg@gndrsh.dnsmgr.net>, Phil Shafer <phil@freebsd.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: Re: What's the locale for system files (e.g. /etc/fstab)?
Message-ID: <CANCZdfrZjeU_%2BLRew9BOCdktDi3aTUoeEaBkrov9FccvwfaN0g@mail.gmail.com>
In-Reply-To: <71356.1648139436@kaos.jnpr.net>
References: <70B211BB-15BA-47A4-8F9C-C833AA8C1EAA@freebsd.org> <202203241519.22OFJ3Mk098649@gndrsh.dnsmgr.net> <CANCZdfp1oJdC2HfU63U_3y4y%2BQE0TswdVSg%2Big4uS3RJC3yK3w@mail.gmail.com> <71356.1648139436@kaos.jnpr.net>
On Thu, Mar 24, 2022, 10:30 AM Simon J. Gerraty <sjg@juniper.net> wrote:

> Warner Losh <imp@bsdimp.com> wrote:
> > Config files, like fstab, have no locale and parsing them with a
> > locale leads to errors, even when the user or the system has a
> > nondefault locale.
> >
> > > Put more generally, there's not a system-wide place which declares the
> > > encoding for system files, which leads to this problem where we
> > > interpret files from one user's locale using another user's locale.
> >
> > Well /etc/login.conf *IS* a system wide declaration of this type of
> > stuff, both lang= and charset= are declared there.
> >
> > Since system wide files like these are always parsed without a
> > locale, this information is correct, but I'm not sure how it applies.
> >
> > It is always C.UTF-8. Anything else may, or may not, work based on
> > accidents of coincident encoding. Not everything can change locales,
> > and the fstab and other parsing routines in libc assume C.UTF-8 or
> > even just the ascii-7/8 subset.
> >
> > > One solution would be a symlink in /etc that "points to" the name of the
> > > current system-wide locale name.
> > >
> > > % ls -Fl /etc/locale
> > > lrwxr-xr-x  1 root  wheel  7 Mar 23 15:42 /etc/locale@ -> C.UTF-8
> >
> > grep lang /etc/login.conf:
> >         :lang=C.UTF-8:
> >         :lang=ru_RU.UTF-8:\
> >
> > Probably what you want?
>
> I doubt it, one is from the entry for Russian users ;-)
>
> > You can get this with the locale routines, no? No need for grep.
>
> I suspect not.
>
> AFAIK virtually everything about locale support tells you about the
> locale for the current process - which does not necessarily inform you
> of the locale that was in effect when a system file was last edited.
>
> I don't even know if it is guaranteed that everything that reads system
> files groks random locales - or what happens when you have 3 admins each
> preferring a different locale, do different entries in fstab for example
> get impacted and the result thus not readable by anyone?
>
> There's probably something to be said for enforcing something like
> C.UTF-8 for system files.

That is the primary reason for system files always being C.UTF-8: there
is no way to tag a file as anything else. Some of these files are often
parsed from a context that can't set the locale, like the boot loader or
the kernel. Also, these files have a format that was defined back in the
7-bit ASCII time frame, and they don't make use of the text in a way that
isn't literal.

Having said that, I'm unsure how you'd mount /<kanji-for-neko> from
fstab, or if that is well defined. The kernel just presents a string of
bytes not containing /...

Warner

> --sjg
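
The "string of bytes" point above can be illustrated with a minimal sketch, written in Python for brevity. It is not how libc's fstab routines are implemented; `split_fstab_line` is a hypothetical helper, and the sample line (device name, UTF-8 mount point) is invented for illustration. The assumption is that fields are separated only by ASCII space or tab, and that every other byte, including anything at or above 0x80, is passed through untouched, so the result cannot depend on the locale of the calling process.

```python
# Hypothetical helper, not a FreeBSD or libc API: split one fstab-style line
# into fields while treating the content as raw bytes.


def split_fstab_line(line: bytes) -> list:
    """Return the fields of one fstab-style line as a list of bytes objects.

    Assumption for this sketch: only ASCII space (0x20) and tab (0x09)
    separate fields. No byte is decoded, so multi-byte UTF-8 sequences and
    bytes that some 8-bit locale would classify as whitespace (for example
    0xA0) stay inside their field unchanged.
    """
    body = line.rstrip(b"\r\n")
    # Normalise tabs to spaces, then drop the empty pieces that runs of
    # separators produce.
    return [field for field in body.replace(b"\t", b" ").split(b" ") if field]


if __name__ == "__main__":
    # Invented example entry; the mount point is "/" followed by the UTF-8
    # encoding of a single kanji (bytes E7 8C AB).
    sample = b"/dev/ada0p2\t/\xe7\x8c\xab\tufs\trw\t2\t2\n"
    for field in split_fstab_line(sample):
        print(field)
```

Because nothing is decoded, the mount-point field comes back as the same bytes that were written to the file, whatever locale the reader or the last editor was using. Whether those bytes display as the intended characters is a separate question that this sketch does not address.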
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANCZdfrZjeU_%2BLRew9BOCdktDi3aTUoeEaBkrov9FccvwfaN0g>