Date: Wed, 13 Nov 2013 18:48:51 +0100
From: "Julian H. Stacey" <jhs@berklix.com>
To: hackers@freebsd.org
Cc: Jordan Hubbard <jkh@turbofuzz.com>, FreeBSD-gnats-submit@freebsd.org, "Bernhard Riedel \(Work\)" <bernhard@sdg.de>, Astrid Jekat <astrid@jekat.com>, Christian Weisgerber <naddy@mips.inka.de>
Subject: Re: patch for /usr/src/usr.bin/fmt/ (not 8 bit clean) for German & French
Message-ID: <201311131748.rADHmpxG084992@fire.js.berklix.net>
In-Reply-To: Your message "Tue, 12 Nov 2013 21:17:37 +0100." <20131112201737.GA52200@lorvorc.mips.inka.de>

Christian Weisgerber wrote:
> Julian H. Stacey:
>
> > I don't know about ISO 8859-1 and UTF-8, (I dislike & avoid
> > national char set stuff as much as possible), but I want
>
> That is your problem right there.

My perspective & experience or `problem' as you mislabel it, is I
was supporting Unix Internationalisation back in 1985, & long since
tired of aggravating German umlauts issues (Umlauts even back then
had AE OE UE [& SS] replacements but few used them).

Your problem is being German you had an incentive to attain umlauts,
& probably being younger, wasted less time achieving umlauts going
straight to the since available UTF; but myopic that others may be
averse to waste more time for superfluous national oddities that
cleaner Roman derivatives like Italian & English etc find superfluous.

It seemed best to make fmt.c 8 bit clean[er], to help process
arbitrary text, harm no one, & not disturb users of eg UTF.

Your problem is you would obstruct a cleaner fmt, so fmt continues
to fail until users are forced to waste their time too like you did,
reading & configuring internationalisation variables some don't need. **

> > to be able to edit files that simultaneously contain eg all
> > of English German & French etc, so setting some var to eg
> > just German would be inappropriate. 8 bit clean would be ideal,
> > next best would be my patches I suppose.
>
> You MUST define a character set for this. "8-bit clean" is meaningless
> for a tool that deals with runs of characters. Without a defined
> character set, you have no idea what those bytes mean. Is 0x90 a

Not true. See below. **

> printable character? Is it a control character? Is it part of a
> multibyte character?
>
> And setting, for example, LC_CTYPE=de_DE.ISO8859-1 does in no way
> limit you to German. For LC_CTYPE purposes, the language/country
> part of the locale specification isn't used.
>
> This is definitely a PEBKAC.

Avoid junk acronyms.
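
As an illustration of what "not 8 bit clean" means in this thread, the sketch below shows what a 7-bit mask, such as the '& 0x7f' named in the summary of this message, does to bytes above 0x7F. It is not the code from fmt.c or from the patch. The helper names strip_high_bit and copy_clean are hypothetical, and the sample byte 0xFC assumes ISO 8859-1 input (u-umlaut).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n bytes from src to dst, clearing the top
 * bit of each byte, as a 7-bit-only text filter would. */
static void strip_high_bit(const unsigned char *src, unsigned char *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (unsigned char)(src[i] & 0x7f);
}

/* Hypothetical helper: 8-bit-clean copy that leaves every byte as it is. */
static void copy_clean(const unsigned char *src, unsigned char *dst, size_t n)
{
    memcpy(dst, src, n);
}
```

With the input bytes 'f', 0xFC, 'r', strip_high_bit yields 'f', 0x7C ('|'), 'r', while copy_clean keeps 0xFC. Every byte of a non-ASCII UTF-8 sequence is also 0x80 or above, so the same mask would alter UTF-8 text too.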

Re-Read original post
	http://lists.freebsd.org/pipermail/freebsd-hackers/2010-May/031901.html
Particularly:
	Example: Pasting notes into an xterm, clauses from
	http://seafrance.com in English then French original & German,
	to get the feel of what an unclear English translation **:

Sometimes I mouse paste from Firefox in English, French, German &
other languages, making notes in a single file with vi in an xterm,
all with standard env, no locale, & it edits OK in vi, & displays
with cat in xterm, till !}fmt in vi wraps long lines, when fmt
breaks it. So I fixed fmt.

It would Not be appropriate to set a German locale, nor a French etc.
Other utils might misbehave now or later. See eg man sort re LC_ALL.
No way I'd keep exiting vi & resetting LC_CTYPE between mouse pastes
from different language pages. The default American works fine.

I'm not bothered if vi+xterm might mis-display some odd accent, as
I can see something is there, so long as fmt does not strip the
accent, but FreeBSD fmt.c Does strip the French accents & German
umlauts, that's why I fixed fmt.c

Summary:
	Making fmt.c 8 bit cleaner would not break UTF & Unicode,
	I believe, so there is no reason to object to removal of
	fmt.c '& 0x7f' cruft etc.

Cheers,
Julian
--
Julian Stacey, BSD Unix Linux C Sys Eng Consultant, Munich http://berklix.com
 Interleave replies below like a play script. Indent old text with "> ".
 Send plain text, not quoted-printable, HTML, base64, or multipart/alternative.
 Extradite NSA spy chief Alexander. http://berklix.eu/jhs/blog/2013_10_30