From: Oliver Fromme
To: freebsd-stable@FreeBSD.ORG, matrix@itlegion.ru
Date: Thu, 24 May 2007 13:36:10 +0200 (CEST)
Subject: Re: How to test locale in C? All my tests fail.
Message-Id: <200705241136.l4OBaA9P092420@lurza.secnetix.de>
In-Reply-To: <023401c79ded$ba3b7fe0$05000100@Artem>

Artem Kuchin wrote:
 > I have a very stupid problem. I cannot convert to upper
 > or lower case using manually set locale (setlocale(..)).
 >
 > A very simple program:
 > [...]
 > printf("IS UPPER ?: %d\n",isupper('?'));
 > printf("IS UPPER ?: %d\n",isupper('?'));
 > printf("IS LOWER ?: %d\n",islower('?'));
 > printf("IS LOWER ?: %d\n",islower('?'));
 > printf("LOCALE %s\n",setlocale(LC_CTYPE,NULL));
 > printf("1: TO UPPER %c TO LOWER %c\n",toupper('?'),tolower('?'));
 > printf("1-0: TO UPPER %c TO LOWER %c\n",toupper('?'),tolower('?'));
 > printf("2: TO UPPER %c TO LOWER %c\n",toupper('r'),tolower('R'));
 > [...]
 > IS UPPER ?: 0
 > IS UPPER ?: 0
 > IS LOWER ?: 0
 > IS LOWER ?: 0
 > LOCALE ru_RU.CP1251
 > 1: TO UPPER ? TO LOWER ?
 > 1-0: TO UPPER ? TO LOWER ?
 > 2: TO UPPER R TO LOWER r

That's a common pitfall. Chars are signed by default on
FreeBSD, and isupper() and its relatives take an argument
of type int. When a signed char is promoted to int, every
character >= 128 ends up as a negative number, so it fails
all isupper() and islower() checks, and toupper()/tolower()
don't touch it at all.

The solution is to typecast the constants to unsigned char
explicitly, like this:

   isupper((unsigned char) '?')

and likewise for the other calls. Your program will work
fine then.

Best regards
   Oliver

PS: You should also #include

PPS: This is not a FreeBSD-specific pitfall. The ISO C
standard does not specify the signedness of plain char, and
most implementations (but not all) seem to prefer to have
chars signed by default. So, in order to write portable
programs, you always need to typecast if the difference
between signed and unsigned matters in your application.

PPPS: I think follow-ups should go to the -standards
mailing list.

--
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Commercial register: Registergericht Muenchen, HRA 74606, Management:
secnetix Verwaltungsgesellsch. mbH, Commercial register: Registergericht
München, HRB 125758, Managing directors: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD services, products and more: http://www.secnetix.de/bsd

"With sufficient thrust, pigs fly just fine. However, this is
not necessarily a good idea.
It is hard to be sure where they are going to land, and it could be dangerous sitting under them as they fly overhead." -- RFC 1925
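
The cast recommended above can be wrapped in small helpers so that it is not forgotten at individual call sites. What follows is only a sketch: the names safe_isupper, safe_toupper and upcase_in_place are made up for illustration and come neither from the original program nor from any library. It assumes a single-byte locale (such as ru_RU.CP1251) has already been selected with setlocale().

```c
#include <ctype.h>

/* Hypothetical helper: returns 1 if c is an upper-case letter in the
 * current locale, 0 otherwise. The cast to unsigned char makes sure a
 * byte >= 128 reaches isupper() as a value in 0..255, not as a
 * negative int. */
int safe_isupper(char c)
{
    return isupper((unsigned char) c) != 0;
}

/* Hypothetical helper: upper-cases one character in the current
 * locale, applying the same cast before calling toupper(). */
char safe_toupper(char c)
{
    return (char) toupper((unsigned char) c);
}

/* Hypothetical helper: upper-cases a NUL-terminated string in place,
 * one byte at a time (so it is only meaningful for single-byte
 * encodings, not for UTF-8). */
void upcase_in_place(char *s)
{
    for (; *s != '\0'; s++)
        *s = safe_toupper(*s);
}
```

A caller would typically run setlocale(LC_CTYPE, "ru_RU.CP1251") once at startup and then pass plain char values or strings to these helpers without any further casts.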