Date: Wed, 05 Nov 2008 16:39:11 -0700 (MST)
Message-Id: <20081105.163911.420518480.imp@bsdimp.com>
To: ivoras@gmail.com
From: "M. Warner Losh"
In-Reply-To: <9bbcef730811051526p3a978848uf904a149cb81fbce@mail.gmail.com>
References: <200811051508.mA5F89XD030040@svn.freebsd.org>
	<20081105.150108.1649771743.imp@bsdimp.com>
	<9bbcef730811051526p3a978848uf904a149cb81fbce@mail.gmail.com>
Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org,
	src-committers@FreeBSD.org, des@FreeBSD.org
Subject: Re: svn commit: r184691 - head/sys/compat/linprocfs

In message: <9bbcef730811051526p3a978848uf904a149cb81fbce@mail.gmail.com>
            "Ivan Voras" writes:
: 2008/11/5 M. Warner Losh:
: > In message: <200811051508.mA5F89XD030040@svn.freebsd.org>
: >             Dag-Erling Smorgrav writes:
: > : utf-8
: >
: > Is there some reason to prefer utf-8 over the 8-bit iso character set
: > we were using?
:
: Reason? You mean you actually *like* 8-bit code pages in the first place? :)

Liked?  Not necessarily.  Understood: yes.  Just didn't have a clue why
the change.

: As a person from a country that has during its history decided it
: really needs 3-4 dots and dashes in its alphabet that make it (the
: alphabet) not representable in ASCII, and who has had Many Fun Days
: converting between various 8-bit code pages, ISO standard or not, and
: especially with deducing which code page is actually being used as all
: bytes are created equal (and Microsoft just *had* to tweak two letters
: from iso8859-2 into Latin2), I welcome UTF-8 with a warm room, a beer,
: peanuts and a backrub.

Hmmmm.  peanuts....

: UTF-8 (as opposed to old 8-bit code pages which need to die as soon as
: possible and UTF-16 which got itself messed up with endianness) is
: unambiguous. A sequence of proper UTF-8 bytes (and UTF-8 has a
: structure so not every random collection of bytes with the 8th bit set
: is proper UTF-8) can always be linked to the same letter.
:
: This is why there's such a big push to get systems to properly support
: UTF-8. FreeBSD had a SoC project this year that was supposed to
: properly implement Unicode collations (and thus collation of UTF-8
: strings) but it looks dead or in a dormant state right now (though I
: didn't follow it attentively).

That makes sense.

Warner
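
[Illustration, not part of the original message.]  The point quoted
above, that 8-bit code pages are ambiguous while UTF-8 has a structure
that can be checked, can be sketched in a few lines of Python.  The
helper name is_valid_utf8 and the choice of the letter "š" are
assumptions made for this example only; the check simply relies on
Python's strict built-in UTF-8 decoder.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")  # strict by default: malformed sequences raise
    except UnicodeDecodeError:
        return False
    return True


letter = "š"  # example letter; any non-ASCII Latin-2 letter would do

# The same letter encoded three ways.
latin2 = letter.encode("iso8859-2")
cp1250 = letter.encode("cp1250")
utf8 = letter.encode("utf-8")

# The two 8-bit code pages disagree on the byte used for this letter.
code_pages_differ = latin2 != cp1250

# Reading the ISO-8859-2 byte as cp1250 silently yields some other character.
misread = latin2.decode("cp1250", errors="replace")

print("code pages differ:", code_pages_differ)
print("round trip through the wrong code page preserved the letter:",
      misread == letter)

# A lone high-bit byte is not well-formed UTF-8, so the mistake is detectable.
print("utf-8 bytes valid:", is_valid_utf8(utf8))
print("iso8859-2 bytes valid as utf-8:", is_valid_utf8(latin2))
print("cp1250 bytes valid as utf-8:", is_valid_utf8(cp1250))
```

Nothing in an 8-bit byte says which code page produced it, so the
wrong guess goes unnoticed.  A UTF-8 decoder, by contrast, can reject
input that is not UTF-8 at all.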