From: Conrad Meyer
Reply-To: cem@freebsd.org
Date: Thu, 1 Feb 2018 12:18:10 -0800
Subject: Re: Printing UTF-8 characters
To: Farhan Khan
Cc: Matthias Apitz, "freebsd-hackers@freebsd.org"
List-Id: Technical Discussions relating to FreeBSD

You've said a number of things about UTF-8 that appear to be mistaken.
Start here:
https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/

On Thu, Feb 1, 2018 at 7:42 AM, Farhan Khan wrote:
> On Thu, Feb 1, 2018 at 2:28 AM, Matthias Apitz wrote:
>>
>> On Thursday, February 01, 2018 at 01:15:34 a.m. -0500, Farhan Khan wrote:
>>
>> > Hi everyone,
>> >
>> > Is there a standard way to render historically non-printable UTF-8
>> > characters that will work across all terminals? I am trying to modify a
>> > standard FreeBSD utility that may occasionally work with characters in
>> > other languages.
>> > On some terminals, specifically FreeBSD running in
>> > VirtualBox, I see question-marks rather than the expected character. I
>> > wonder if this is the proper way to display such non-printable characters
>> > or no?
>>
>> Not sure what you mean with 'historically non-printable UTF-8'. UTF-8 is
>> an encoding form (one of several) to present Unicode Codepoints in bytes. If
>> you want to "print" them to paper or PDF there are ways to write them
>> with Postscript and with the correct font-support to bring them into
>> human readable form. If you want to "display" these UTF-8 bytes you need
>> a terminal-software with UTF-8 support, for example from the ports x11/rxvt-unicode
>> and the fonts for the Codepoint areas you want to display.
>>
>> Btw: Can you display my signature line correctly? There is a UTF-8 encoded
>> Codepoint for a mobile telephone :-)
>>
>> matthias
>> --
>> Matthias Apitz, ✉ guru@unixarea.de, ⌂ http://www.unixarea.de/ 📱 +49-176-38902045
>> Public GnuPG key: http://www.unixarea.de/key.pub
>>
>
> Sorry, that was a poorly phrased question on my part. Let me try again.
> I am trying to make text align in columns in a terminal. My
> understanding is that characters above 0x7E are 3 bytes in length. A
> modern terminal will render that as either a single question-mark or
> the character itself, making terminal column alignment easy. But how
> would an older terminal display a 3-byte character? I am worried that
> would render as 3 question marks and throw off column alignment. If
> so, is there a proper way to perform alignment for both newer and
> older terminals?
>
> I am reading this email in Gmail, so those characters render properly
> for me :)
>
> Thanks,
>
> --
> Farhan Khan
> PGP Fingerprint: B28D 2726 E2BC A97E 3854 5ABE 9A9F 00BC D525 16EE
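
To make the byte-length assumption in the quoted question concrete, here is
a minimal sketch in Python. It shows that a UTF-8 code point encodes to
between 1 and 4 bytes (not a fixed 3), and that column alignment should be
computed from display width rather than byte count. The helper names
(utf8_len, display_width, pad_to) are hypothetical, and the width rule is a
deliberate simplification: wide and fullwidth East Asian characters count
as 2 columns, combining marks as 0, everything else as 1. A C utility on
FreeBSD would more likely call wcwidth(3) after setlocale(3); that is an
assumption about the utility, not something stated in this thread.

```python
import unicodedata


def utf8_len(ch: str) -> int:
    """Return the number of bytes one code point occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def display_width(text: str) -> int:
    """Approximate the number of terminal columns `text` occupies.

    Simplified rule (an assumption, not a full wcwidth implementation):
    combining marks take 0 columns, East Asian wide/fullwidth take 2,
    everything else takes 1.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks share the column of their base
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


def pad_to(text: str, columns: int) -> str:
    """Right-pad `text` with spaces so it fills `columns` display columns."""
    return text + " " * max(columns - display_width(text), 0)


if __name__ == "__main__":
    # Encoded length varies by code point: 1, 2, 3 and 4 bytes here.
    for ch in ("A", "\u00ed", "\u2709", "\U0001F4F1"):
        print(f"U+{ord(ch):04X}: {utf8_len(ch)} byte(s)")

    # Padding by display width keeps the closing bar aligned.
    for cell in ("dia", "d\u00eda", "\u65e5\u672c"):
        print(pad_to(cell, 8) + "|")
```

The first loop uses characters that appear in this thread (the accented i
from the attribution line, the envelope and the mobile phone from the
signature). Padding by len() of the encoded bytes instead of display_width
would misalign every row that contains a non-ASCII character.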