From: rflynn@acsalaska.net
To: Fernando Apesteguía
Cc: freebsd-ports@freebsd.org
Date: Thu, 19 Jan 2012 08:43:14 -0900 (AKST)
Subject: Re: Encoding question
Message-ID: <2920.46.129.107.107.1326994994.squirrel@mymail.acsalaska.net>

Hi,

> I'm trying to compile a C++ software on FreeBSD.
> While compiling, this error shows up:
>
> error: stray '\357' in program
> error: stray '\273' in program
> error: stray '\277' in program
>
> This file is reported (by file[1]) to be "UTF-8 Unicode (with BOM) C
> program text, with CRLF line terminators" while the rest of the files
> in the package are "ASCII C program text, with CRLF line terminators".
> While I can convert the file with iconv -c -f utf-8 -t ascii file >
> new_file in the post extract stage, I wonder if there is a more
> suitable way for achieving the same thing. Also I would like to avoid
> this software from depending on iconv.

You have three options:
- have it fixed upstream;
- post process on extract like above;
- post process releases and roll your own tarball which you host yourself.

Fixing upstream is by far the best solution and here's some ammunition:
The Unicode Standard does permit the BOM in UTF-8, but does not require or
recommend its use. Byte order has no meaning in UTF-8 so in UTF-8 the BOM
serves only to identify a text stream or file as UTF-8. [1]

[1] http://en.wikipedia.org/wiki/Byte_order_mark

-- 
Mel
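
For the "post process on extract" option, here is a minimal sketch of one way to drop the BOM without iconv. The three stray bytes in the compiler error (\357 \273 \277) are the UTF-8 BOM itself, so removing them from the start of the first line should be enough. The function name strip_bom and its two arguments are made up for illustration and are not part of any port; the sketch also assumes the BOM only occurs at the very beginning of the file.

```sh
#!/bin/sh
# strip_bom SRC DST (hypothetical helper): copy SRC to DST, dropping a
# leading UTF-8 BOM (bytes EF BB BF, octal 357 273 277) if one is present.
# The rest of the text, including CRLF line endings, is passed through.
strip_bom() {
    # Build the BOM as literal bytes; octal escapes are portable in printf(1).
    bom=$(printf '\357\273\277')
    # LC_ALL=C makes sed match raw bytes instead of multibyte characters.
    # The "1" address restricts the substitution to the first line only.
    LC_ALL=C sed -e "1s/^${bom}//" "$1" > "$2"
}
```

Called as strip_bom file new_file, it takes the place of the iconv command quoted above. Unlike iconv -c -t ascii, it does not discard other non-ASCII characters, so a file with non-ASCII text beyond the BOM keeps that text. It only needs sed and printf, which are both in the base system.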