From: Matthias Apitz
To: freebsd-questions@freebsd.org
Date: Thu, 29 Oct 2015 11:31:33 +0100
Subject: tr(1) and LANG=de_DE.UTF-8
Message-ID: <20151029103133.GA16882@sh4-5.1blu.de>

Hello,

I was wondering why I could not patch a byte \357 in a file with tr(1):

[guru@kant-r269739 ~]$ od -c /tmp/x
0000000    n   o   n       U   T   F   -   8  \n   n   o   n       U   T
0000020    F   -   8  \n   v   a   l   i   d       U   T   F   -   8  \n
0000040    H   e   l   l   o           W   o   r   l   d   !  \n   v   a
0000060    l   i   d       U   T   F   -   8  \n   H   e   l   l   o
0000100  357 277 277       W   o   r   l   d   !  \n
0000113

[guru@kant-r269739 ~]$ LANG=de_DE.UTF-8 tr '\357' '\000' < /tmp/x | od -c
0000000    n   o   n       U   T   F   -   8  \n   n   o   n       U   T
0000020    F   -   8  \n   v   a   l   i   d       U   T   F   -   8  \n
0000040    H   e   l   l   o           W   o   r   l   d   !  \n   v   a
0000060    l   i   d       U   T   F   -   8  \n   H   e   l   l   o
0000100  357 277 277       W   o   r   l   d   !  \n
0000113

until I changed LANG to C:

[guru@kant-r269739 ~]$ LANG=C tr '\357' '\000' < /tmp/x | od -c
0000000    n   o   n       U   T   F   -   8  \n   n   o   n       U   T
0000020    F   -   8  \n   v   a   l   i   d       U   T   F   -   8  \n
0000040    H   e   l   l   o           W   o   r   l   d   !  \n   v   a
0000060    l   i   d       U   T   F   -   8  \n   H   e   l   l   o
0000100   \0 277 277       W   o   r   l   d   !  \n
0000113

I know that the tr(1) man page contains a hint about LANG and
environ(7), but I would not have expected this to mean that I cannot
change a single byte, given as an octal value, just because \357 on its
own is not a valid UTF-8 sequence.

Any ideas/comments on this?

Thanks

	matthias

-- 
Matthias Apitz               |  /"\   ASCII Ribbon Campaign:
E-mail: guru@unixarea.de     |  \ /   - No HTML/RTF in E-mail
WWW: http://www.unixarea.de/ |   X    - No proprietary attachments
phone: +49-176-38902045      |  / \   - Respect for open standards
                             | en.wikipedia.org/wiki/ASCII_Ribbon_Campaign
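
For reference, a minimal sketch of the byte-wise workaround shown above, assuming a POSIX shell with printf, tr and od available. It sets LC_ALL rather than LANG, on the assumption that LC_ALL or LC_CTYPE might already be set in the environment, in which case they would take precedence over LANG. The function name patch_byte is hypothetical and only used for illustration.

```shell
#!/bin/sh
# patch_byte (hypothetical helper): replace every 0xEF (octal 357) byte on
# stdin with a NUL byte. LC_ALL=C asks tr to treat its input as single
# bytes instead of decoding multibyte UTF-8 characters.
patch_byte() {
    LC_ALL=C tr '\357' '\000'
}

# Sample input: the bytes EF BF BF are the UTF-8 encoding of U+FFFF, so a
# UTF-8-aware tr may see one character there rather than three bytes.
printf 'Hello \357\277\277 World!\n' | patch_byte | od -c
```

In the C locale only the first byte of the sequence is replaced and the two 277 bytes are left alone, which matches the LANG=C listing above.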