From owner-freebsd-ports-bugs@FreeBSD.ORG Sun Mar 28 06:20:24 2004
Return-Path:
Delivered-To: freebsd-ports-bugs@hub.freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 83A7E16A4D8
	for ; Sun, 28 Mar 2004 06:20:24 -0800 (PST)
Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 791A343D31
	for ; Sun, 28 Mar 2004 06:20:24 -0800 (PST)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	i2SEKLbv034629
	for ; Sun, 28 Mar 2004 06:20:21 -0800 (PST)
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.12.10/8.12.10/Submit) id i2SEKLTc034627;
	Sun, 28 Mar 2004 06:20:21 -0800 (PST)
	(envelope-from gnats)
Resent-Date: Sun, 28 Mar 2004 06:20:21 -0800 (PST)
Resent-Message-Id: <200403281420.i2SEKLTc034627@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-ports-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Jean-Baptiste Quenot
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 4BE0516A4CE; Sun, 28 Mar 2004 06:19:49 -0800 (PST)
Received: from postfix4-2.free.fr (postfix4-2.free.fr [213.228.0.176])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id E2E4B43D1D; Sun, 28 Mar 2004 06:19:48 -0800 (PST)
	(envelope-from jbq@caraldi.com)
Received: from caraldi.com (toulouse-2-62-147-67-94.dial.proxad.net [62.147.67.94])
	by postfix4-2.free.fr (Postfix) with ESMTP
	id EE4BC8CFB6; Sun, 28 Mar 2004 16:19:43 +0200 (CEST)
Received: by caraldi.com (Postfix, from userid 1001)
	id C538ABD; Sun, 28 Mar 2004 16:19:41 +0200 (CEST)
Message-Id: <20040328141941.C538ABD@caraldi.com>
Date: Sun, 28 Mar 2004 16:19:41 +0200 (CEST)
From: Jean-Baptiste Quenot
To: FreeBSD-gnats-submit@FreeBSD.org
X-Send-Pr-Version: 3.113
cc: ports@FreeBSD.org
cc: mark@grondar.za
Subject: ports/64845: Par must exclude non-breaking space from the class of space chars
X-BeenThere: freebsd-ports-bugs@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Ports bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 28 Mar 2004 14:20:24 -0000

>Number:         64845
>Category:       ports
>Synopsis:       Par must exclude non-breaking space from the class of space chars
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-ports-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          update
>Submitter-Id:   current-users
>Arrival-Date:   Sun Mar 28 06:20:21 PST 2004
>Closed-Date:
>Last-Modified:
>Originator:     Jean-Baptiste Quenot
>Release:        FreeBSD 5.1-CURRENT i386
>Organization:
>Environment:
System: FreeBSD watt.intra.caraldi.com 5.1-CURRENT FreeBSD 5.1-CURRENT #6: Tue Oct 14 19:03:28 CEST 2003 jbq@watt.intra.caraldi.com:/usr/obj/usr/src/sys/WATT i386

>Description:
Par 1.52 on FreeBSD does not behave as its upstream author intended.
On FreeBSD, the isspace() library function returns true for the
non-breaking space character 0xA0.  Par therefore treats that
character as ordinary white space, which is an unintended side effect.

Quoting a message from the upstream author:

--------------------------------------------------------------------------------
From: "Adam M. Costello"
Date: Tue, 2 Dec 2003 21:19:10 +0000
To: Jean-Baptiste Quenot
User-Agent: Mutt/1.5.4i

> on FreeBSD, the locales definitions include non-breaking space in the
> list of spaces, thus isspace(160) is true, and as a result all my
> nbsps are filtered out, and lines are broken on them.
>
> I noticed that the GNU libc has removed 0xA0 from spaces on purpose.
> But the BSD guys seem to have another approach, as this kind of stuff
> is "implementation specific".

That's interesting.  This was not an issue in Par 1.51, because it
didn't call setlocale(), so only ASCII characters were recognized by
isspace(), isalnum(), islower(), etc.

In par 1.52, a call to setlocale() was added so that non-ASCII letters
and digits would be recognized for the purpose of the g,B,P,Q options.
An unforeseen side effect is that non-ASCII white-space characters are
now recognized.
--------------------------------------------------------------------------------

Here is the fragment declaring SPACE and BLANK for the ISO Latin 1
locale on FreeBSD:

/*
 * Standard LOCALE_CTYPE for the ISO 8859-1 Locale
 *
 * $FreeBSD: src/share/mklocale/la_LN.ISO8859-1.src,v 1.3 2001/11/30 05:05:53 ache Exp $
 */
...
SPACE     0x09 - 0x0d ' ' 0xa0
UPPER     'A' - 'Z' 0xc0 - 0xd6 0xd8 - 0xde
XDIGIT    '0' - '9' 'a' - 'f' 'A' - 'F'
BLANK     ' ' '\t' 0xa0

>How-To-Repeat:
Set your locale to an 8-bit character set such as ISO8859-1.  Insert
non-breaking spaces into a text and run it through par: par converts
them to ordinary spaces and even wraps lines at them.

>Fix:
Apply the following patch:

--------------------------------------------------------------------------------
--- par.c.orig	Sun Mar 28 16:00:15 2004
+++ par.c	Sun Mar 28 16:04:00 2004
@@ -403,7 +403,8 @@
         }
         continue;
       }
-      if (isspace(c)) ch = ' ';
+      // Exclude non-breaking space from the class of space chars
+      if (isspace(c) && c != 0xA0) ch = ' ';
       else blank = 0;
       additem(cbuf, &ch, errmsg);
       if (*errmsg) goto rlcleanup;
--------------------------------------------------------------------------------

Thanks in advance,
-- 
Jean-Baptiste Quenot
http://caraldi.com/jbq/
>Release-Note:
>Audit-Trail:
>Unformatted:
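
For illustration only, the test that the patch adds can be pulled out
into a small helper.  This is a minimal sketch, not code from par: the
names NBSP, is_breaking_space() and normalize_spaces() are hypothetical,
and it assumes a single-byte character set such as ISO 8859-1 in which
0xA0 is the non-breaking space.

```c
#include <ctype.h>

/* Assumption: NO-BREAK SPACE in ISO 8859-1 and similar 8-bit sets. */
#define NBSP 0xA0

/*
 * Hypothetical helper: nonzero if c is white space on which a line may
 * be broken.  c must be EOF or representable as unsigned char, as
 * isspace() requires.  The result does not depend on whether the
 * current locale lists 0xA0 in its SPACE class.
 */
static int is_breaking_space(int c)
{
    return c != NBSP && isspace(c);
}

/*
 * Hypothetical helper: copy src to dst, turning every breaking
 * white-space character into a plain space and leaving NBSP untouched.
 * dst must have room for strlen(src) + 1 bytes.
 */
static void normalize_spaces(const char *src, char *dst)
{
    for (; *src != '\0'; src++, dst++) {
        unsigned char uc = (unsigned char)*src;
        *dst = is_breaking_space(uc) ? ' ' : *src;
    }
    *dst = '\0';
}
```

Because the helper rejects 0xA0 before consulting isspace(), it should
give the same answer on FreeBSD and on GNU libc, with or without a
prior call to setlocale().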