Date: Mon, 16 Oct 1995 18:20:35 -0700 (MST)
From: Terry Lambert <terry@lambert.org>
To: ache@astral.msk.su (=?KOI8-R?Q?=E1=CE=C4=D2=C5=CA_=FE=C5=D2=CE=CF=D7?=)
Cc: terry@lambert.org, hackers@freefall.freebsd.org, kaleb@x.org
Subject: Re: A couple problems in FreeBSD 2.1.0-950922-SNAP
Message-ID: <199510170120.SAA26017@phaeton.artisoft.com>
In-Reply-To: <yZsrkWmKU1@ache.dialup.demos.ru> from "=?KOI8-R?Q?=E1=CE=C4=D2=C5=CA_=FE=C5=D2=CE=CF=D7?=" at Oct 17, 95 02:40:38 am

> >This is valid for all 8859-x display/input systems, since the reuse of
> >the code points are not transformed by this (8859-x does not encode
> >characters in those locations).
>
> You consider one very simple case (isprint/iscontrol only) and think
> that it is a proof. What you can say about ispunct() f.e.?
> It is clearly differ into 8859-1 and 8859-5 f.e., islower/isupper differs
> too. tolower/toupper differs too. Even isalpha differs.

What did I say before about lobbying international standards bodies to
replace 8859-5?

I don't know if I buy the [is,to][upper,lower] distinctions. I think
they are mainly for undefined code points, and getting the wrong result
in an undefined area is not a problem.

> >The only potentially incorrect behaviour is on blanks not being interpreted
> >as blanks. If you want a blank, you shouldn't be using some wild code
> >point other than 0x20 anyway. You get what you deserve.
>
> Well, isspace differs too.

Space isn't 0x20 in 8859-5? Tab, LF, CR aren't the same?

> >The problems you will encounter in this circumstance are all *very*
> >specific to cases where a single file system is being used by multiple
> >nationalities of clients.
>
> No it is different problem. By setting LANG for something != 8859-1
> (for programs that understands it) I assume that programs which
> not understands it still works right.
> If they are strict ASCII, I automatically protected from any
> unwanted effects. If they are 8859-1 I need to classify
> various unwanted effects for each != 8859-1 charset as
> 'default undefined behaviour'.

I agree. And this is precisely the problem with the crt0.o/setlocale()
hack. You are implicitly removing the protection from unwanted effects.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
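
[Editorial sketch, not part of the original message.] The disagreement above
turns on which byte values are classified differently under 8859-1 and
8859-5. A rough way to list them is sketched below. It assumes Python's
iso8859_1 and iso8859_5 codecs and Unicode character properties as a
stand-in for the C library's per-locale ctype tables, so the results only
approximate what isalpha(), isupper(), islower() and isspace() would report
on any particular system. The helper names classify() and differing_bytes()
are made up for this illustration.

```python
# Illustrative only: Python codecs plus Unicode properties stand in for the
# C library's locale-specific ctype tables (an assumption, not FreeBSD's data).

CHARSETS = ("iso8859_1", "iso8859_5")
NAMES = ("isalpha", "isupper", "islower", "isspace")


def classify(byte_value, charset):
    """Return (isalpha, isupper, islower, isspace) for one byte in a charset."""
    ch = bytes([byte_value]).decode(charset)
    return (ch.isalpha(), ch.isupper(), ch.islower(), ch.isspace())


def differing_bytes(first, second, index):
    """List byte values whose classification NAMES[index] differs."""
    return [b for b in range(256)
            if classify(b, first)[index] != classify(b, second)[index]]


if __name__ == "__main__":
    for index, name in enumerate(NAMES):
        diff = differing_bytes(CHARSETS[0], CHARSETS[1], index)
        # Show how many bytes disagree and the first few of them in hex.
        print(f"{name}: {len(diff)} differing bytes",
              [hex(b) for b in diff[:8]])
```

For example, byte 0xD7 is the multiplication sign in 8859-1 but a lowercase
Cyrillic letter in 8859-5, so it shows up in the isalpha list. Under these
Unicode-based assumptions the isspace list comes out empty, since both
charsets share the ASCII whitespace characters and put NBSP at 0xA0; a
C library's own locale tables may classify NBSP differently.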