Date:      Thu, 24 May 2007 13:36:10 +0200 (CEST)
From:      Oliver Fromme <olli@lurza.secnetix.de>
To:        freebsd-stable@FreeBSD.ORG, matrix@itlegion.ru
Subject:   Re: How to test locale in C? All my tests fail.
Message-ID:  <200705241136.l4OBaA9P092420@lurza.secnetix.de>
In-Reply-To: <023401c79ded$ba3b7fe0$05000100@Artem>

Artem Kuchin wrote:
 > I have a very stupid problem. I cannot convert to upper
 > or lower case using manually set locale (setlocale(..)).
 > 
 > A very simple program:
 > [...]
 >         printf("IS UPPER ?: %d\n",isupper('?'));
 >         printf("IS UPPER ?: %d\n",isupper('?'));
 >         printf("IS LOWER ?: %d\n",islower('?'));
 >         printf("IS LOWER ?: %d\n",islower('?'));
 >         printf("LOCALE %s\n",setlocale(LC_CTYPE,NULL));
 >         printf("1: TO UPPER %c TO LOWER %c\n",toupper('?'),tolower('?'));
 >         printf("1-0: TO UPPER %c TO LOWER %c\n",toupper('?'),tolower('?'));
 >         printf("2: TO UPPER %c TO LOWER %c\n",toupper('r'),tolower('R'));
 > [...]
 > IS UPPER ?: 0
 > IS UPPER ?: 0
 > IS LOWER ?: 0
 > IS LOWER ?: 0
 > LOCALE ru_RU.CP1251
 > 1: TO UPPER ? TO LOWER ?
 > 1-0: TO UPPER ? TO LOWER ?
 > 2: TO UPPER R TO LOWER r

That's a common pitfall.  Chars are signed by default on
FreeBSD, and isupper() and the other <ctype.h> functions
take an argument of type int.  That means that characters
>= 128 end up as negative numbers, so they fail all isupper()
and islower() checks, and toupper()/tolower() don't touch
them at all.  (Strictly speaking, the C standard leaves the
behavior undefined for negative arguments other than EOF.)

The solution is to typecast the constants to unsigned char
explicitly, like this:  isupper((unsigned char) '?') etc.
Your program will work fine then.
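
If the cast is needed in many places, it can be wrapped once.
The following is only a minimal sketch, assuming a single-byte
encoding such as CP1251; the helper names upper_byte() and
is_upper_byte() are made up for illustration and are not part
of any library:

```c
#include <ctype.h>

/* Hypothetical helper: uppercase one byte of a single-byte encoding
 * according to the current LC_CTYPE locale. */
char upper_byte(char c)
{
    /* Cast to unsigned char first, so that bytes >= 128 reach
     * toupper() as values in 0..UCHAR_MAX instead of negative ints. */
    return (char)toupper((unsigned char)c);
}

/* Hypothetical helper: returns 1 if the byte is an uppercase letter
 * in the current locale, 0 otherwise. */
int is_upper_byte(char c)
{
    return isupper((unsigned char)c) != 0;
}
```

Call these after a successful setlocale(LC_CTYPE, ...); the
classification of bytes >= 128 then depends on the locale that
was actually set.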

Best regards
   Oliver

PS:  You should also #include <stdio.h>

PPS:  This is not a FreeBSD-specific pitfall.  The ISO C
standard does not specify the signedness of plain char,
and most implementations (but not all) seem to prefer to
have chars signed by default.  So, in order to write
portable programs, you always need to typecast if the
difference between signed and unsigned matters in your
application.
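
To see which choice a given implementation made, CHAR_MIN from
<limits.h> can be checked.  A small sketch (the function name
plain_char_is_signed() is only an illustration):

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if plain char is a signed type on
 * this implementation, 0 if it is unsigned. */
int plain_char_is_signed(void)
{
    /* CHAR_MIN is 0 when plain char is unsigned, negative otherwise. */
    return CHAR_MIN < 0;
}
```

Portable code should not branch on this, though; casting to
unsigned char works the same way in both cases.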

PPPS:  I think follow-ups should go to the -standards
mailing list.

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

"With sufficient thrust, pigs fly just fine.  However, this
is not necessarily a good idea.  It is hard to be sure where
they are going to land, and it could be dangerous sitting
under them as they fly overhead." -- RFC 1925


