From: Jean-Baptiste Quenot
Date: Tue, 2 Dec 2003 01:31:07 +0100
To: questions@FreeBSD.org
Subject: Is non-breaking space a space?

Hello,

I'm wondering why the FreeBSD C library treats the non-breaking space as a space, whereas the GNU libc does not. Sorry for comparing the two, but as a result, Linux and FreeBSD are incompatible in the way they handle isspace(160). This *only* occurs when LC_CTYPE is set to a «single C char» locale (one byte per character) such as en_US.ISO8859-1.
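
For anyone who wants to reproduce the difference, here is a minimal sketch of a check. The helper name nbsp_is_space is hypothetical (it is not part of any libc), and the sketch assumes the locale you pass in is installed on the system; if it is not, the helper reports that instead of guessing.

```c
#include <ctype.h>
#include <locale.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of any libc.
 * Reports how isspace() classifies byte 0xA0 (160, NBSP in ISO 8859-1)
 * under the named LC_CTYPE locale.
 * Returns 1 if it is classified as a space, 0 if it is not,
 * and -1 if the locale could not be selected (e.g. not installed).
 */
int nbsp_is_space(const char *locale_name)
{
    char saved[256];
    const char *current = setlocale(LC_CTYPE, NULL);
    int result;

    /* Copy the current locale name, since a later setlocale() call
       may overwrite the string returned above. */
    snprintf(saved, sizeof saved, "%s", current ? current : "C");

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;

    /* 0xA0 is representable as unsigned char, so this call is well defined. */
    result = isspace(0xA0) ? 1 : 0;

    /* Restore the caller's LC_CTYPE setting. */
    setlocale(LC_CTYPE, saved);
    return result;
}
```

Calling nbsp_is_space("en_US.ISO8859-1") from a small main() on each system and printing the result should show the behaviour described above, provided that locale is installed. Under the plain "C" locale the helper should return 0, since only the standard white-space characters belong to the space class there.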

In /usr/src/share/mklocale, the file la_LN.ISO8859-1.src, for example, contains a SPACE definition that includes the non-breaking space. It seems to have been that way since the beginning of FreeBSD, but is there a reference or standard that states whether NBSP is considered a space or not?

BTW, the «official» [1] sources for the glibc ctype functions have an interesting comment:

    static bool
    is_space (unsigned int ch)
    {
      /* Don't make U+00A0 a space. Non-breaking space means that all programs
         should treat it like a punctuation character, not like a space. */

Best regards,
-- 
Jean-Baptiste Quenot
http://caraldi.com/jbq/

[1] http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/localedata/gen-unicode-ctype.c?rev=1.4&content-type=text/x-cvsweb-markup&cvsroot=glibc