From: Andrey Chernov
To: Wolfgang Zenker
Cc: freebsd-i18n@freebsd.org
Date: Mon, 28 May 2007 15:52:50 +0400
Message-ID: <20070528115250.GA24812@nagual.pp.ru>
In-Reply-To: <20070528084659.GA77240@lyxys.ka.sub.org>
References: <200705272241.l4RMfg07051300@juno.lyxys.ka.sub.org> <20070528072847.GA18850@nagual.pp.ru> <20070528084659.GA77240@lyxys.ka.sub.org>
Subject: Re: Why no non-latin TODIGIT mappings in UTF-8.src ?

On Mon, May 28, 2007 at 10:46:59AM +0200, Wolfgang Zenker wrote:
> Looking at our UTF-8.src, I see
>
> $ grep DIGIT UTF-8.src
> DIGIT '0' - '9'
> XDIGIT '0' - '9' 'A' - 'F' 'a' - 'f'
> TODIGIT < '0' - '9' : 0x0000 >
> TODIGIT < 'A' - 'F' : 10 >
> < 'a' - 'f' : 10 >
>
> It appears to me that isdigit() behaviour is controlled by the DIGIT
> keyword, not TODIGIT. However, I do admit that I don't understand completely
> how locale files are supposed to work. So where does e.g. iswdigit() get
> its character class information from, should that not be in the locale
> information as well somewhere?

There is no POSIX function to extract the TODIGIT info, so it is useless
for now. todigit() is an SCO extension, and its manpage says:

The macro todigit returns the digit character corresponding to its
integer argument. The argument must be in the range 0-9, otherwise the
behavior is undefined.

iswdigit() has the same 0-9 restriction as isdigit(); it just accepts
wint_t.

-- 
http://ache.pp.ru/
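
To make the two points above concrete, here is a minimal C sketch. The names sketch_todigit() and is_ascii_wdigit() are hypothetical helpers for illustration only; neither is a FreeBSD, POSIX, or SCO API. The first mimics the quoted manpage text, and the second wraps iswdigit() to show the 0-9 restriction. It assumes the default "C" locale (no setlocale() call).

```c
#include <assert.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical stand-in for the SCO todigit() macro as the quoted manpage
 * describes it: map an integer 0-9 to the character '0'-'9'.
 */
static int sketch_todigit(int n)
{
    /* The manpage leaves other values undefined; this sketch just asserts. */
    assert(n >= 0 && n <= 9);
    return '0' + n;  /* C guarantees '0'..'9' are contiguous */
}

/*
 * Thin wrapper around iswdigit(): it takes a wint_t, but is described
 * above as accepting only the same ten digits that isdigit() does.
 */
static int is_ascii_wdigit(wint_t wc)
{
    return iswdigit(wc) != 0;
}
```

For example, sketch_todigit(7) yields '7', and is_ascii_wdigit(L'5') is true, while a non-Latin digit such as U+0663 (ARABIC-INDIC DIGIT THREE) is expected to be rejected.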