Date: Tue, 19 Jun 2018 19:46:18 -0700
From: Conrad Meyer <cem@freebsd.org>
To: Farhan Khan <khanzf@gmail.com>
Cc: freebsd-hackers@freebsd.org
Subject: Re: Printing UTF-8 characters
Message-ID: <CAG6CVpU4cMjxydUUMkgGG6km4P9Mn5S07Zp-%2BBrfUwEx9bg8yQ@mail.gmail.com>
In-Reply-To: <CAFd4kYAY-80U2hJHYw_-OBKwcjQnNgN19%2BjP1tu%2BF939aL6bZw@mail.gmail.com>
References: <CAFd4kYD_Q9Y84LvCGELVodt%2B30KM_KzNzoLOzudZm9kaLqGPaQ@mail.gmail.com>
    <20180201072831.GA2239@c720-r314251>
    <CAFd4kYB_eU00Z5nBzp-iNGuELN4cy_ADGABb-boq4Fvn-a0XMg@mail.gmail.com>
    <20180202035130.C51F8156E80B@mail.bitblocks.com>
    <CAFd4kYAY-80U2hJHYw_-OBKwcjQnNgN19%2BjP1tu%2BF939aL6bZw@mail.gmail.com>

You want LC_CTYPE.

On Tue, Jun 19, 2018 at 6:38 PM Farhan Khan <khanzf@gmail.com> wrote:
> On Thu, Feb 1, 2018 at 10:51 PM, Bakul Shah <bakul@bitblocks.com> wrote:
> > On Thu, 01 Feb 2018 10:42:36 -0500 Farhan Khan <khanzf@gmail.com> wrote:
> >> Sorry, that was a poorly phrased question on my part. Let me try again.
> >> I am trying to make text align in columns in a terminal. My
> >> understanding is that characters above 0x7E are 3 bytes in length. A
> >> modern terminal will render that as either a single question mark or
> >> the character itself, making terminal column alignment easy. But how
> >> would an older terminal display a 3-byte character? I am worried that
> >> it would render as 3 question marks and throw off column alignment. If
> >> so, is there a proper way to perform alignment for both newer and
> >> older terminals?
> >
> > UTF-8 can use up to 4 bytes to encode a Unicode code point,
> > depending on the script.
> >
> > For what you want, you can use OpenOffice-like programs that
> > understand Unicode and can do complex text layout. Normal
> > terminal programs typically use monospace (fixed-width) fonts
> > and are simply not capable of what you want. The assumption that
> > one char means one rectangular cell on the screen is too
> > deeply woven into them. Particularly for Indic languages this
> > just doesn't work. You may have N Unicode code points, each of
> > which requires 3 bytes, that all together map to a single glyph.
>
> Hi all,
>
> To follow up on my earlier poorly asked question from a few months
> back, how do I determine if the terminal is capable of printing UTF-8
> encoded strings and/or Unicode in general?
> The obvious answer is to check the LANG variable via getenv(3), but
> what if you are using "en_US.UTF-8" vs "en_GB.UTF-8"? Should I just
> check for the string "UTF-8" in the LANG variable?
>
> My concern is printing characters above 0x7F on terminals/encodings
> that are not capable of displaying them, resulting in unusual
> behavior.
>
> Thanks,
>
> --
> Farhan Khan
> PGP Fingerprint: B28D 2726 E2BC A97E 3854 5ABE 9A9F 00BC D525 16EE
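
To make the LC_CTYPE suggestion concrete, here is a minimal sketch in C.
It assumes a POSIX system that provides setlocale(3) and nl_langinfo(3).
The function names codeset_is_utf8() and locale_is_utf8() are made up
for illustration and are not part of any library. Calling
setlocale(LC_CTYPE, "") resolves LC_ALL, LC_CTYPE, and LANG in the
standard order, so there is no need to parse LANG by hand, and
"en_US.UTF-8" and "en_GB.UTF-8" both report the same codeset.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/*
 * Hypothetical helper: return true if a codeset name denotes UTF-8.
 * Accepts "UTF-8" and "UTF8" in any letter case; anything else
 * (including NULL) is treated as not UTF-8.
 */
bool
codeset_is_utf8(const char *codeset)
{
	return (codeset != NULL &&
	    (strcasecmp(codeset, "UTF-8") == 0 ||
	    strcasecmp(codeset, "UTF8") == 0));
}

/*
 * Hypothetical helper: adopt the user's LC_CTYPE setting from the
 * environment and report whether its codeset is UTF-8.
 */
bool
locale_is_utf8(void)
{
	/* "" means: take the value from LC_ALL, LC_CTYPE, or LANG. */
	if (setlocale(LC_CTYPE, "") == NULL)
		return (false);	/* unknown locale; the "C" locale stays active */

	/* CODESET names the character encoding of the active LC_CTYPE. */
	return (codeset_is_utf8(nl_langinfo(CODESET)));
}
```

A program would call locale_is_utf8() once at startup and fall back to
ASCII-only output when it returns false. Note that this only reports
what the environment claims; it cannot verify what the terminal
actually renders.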