Date: Thu, 29 Oct 2015 11:31:33 +0100
From: Matthias Apitz <guru@unixarea.de>
To: freebsd-questions@freebsd.org
Subject: tr(1) and LANG=de_DE.UTF-8
Message-ID: <20151029103133.GA16882@sh4-5.1blu.de>

Hello,

I was wondering why I could not patch a byte \357 in a file with tr(1):

[guru@kant-r269739 ~]$ od -c /tmp/x
0000000 n o n U T F - 8 \n n o n U T
0000020 F - 8 \n v a l i d U T F - 8 \n
0000040 H e l l o W o r l d ! \n v a
0000060 l i d U T F - 8 \n H e l l o
0000100 357 277 277 W o r l d ! \n
0000113

[guru@kant-r269739 ~]$ LANG=de_DE.UTF-8 tr '\357' '\000' < /tmp/x | od -c
0000000 n o n U T F - 8 \n n o n U T
0000020 F - 8 \n v a l i d U T F - 8 \n
0000040 H e l l o W o r l d ! \n v a
0000060 l i d U T F - 8 \n H e l l o
0000100 357 277 277 W o r l d ! \n
0000113

It only worked once I changed LANG to C:

[guru@kant-r269739 ~]$ LANG=C tr '\357' '\000' < /tmp/x | od -c
0000000 n o n U T F - 8 \n n o n U T
0000020 F - 8 \n v a l i d U T F - 8 \n
0000040 H e l l o W o r l d ! \n v a
0000060 l i d U T F - 8 \n H e l l o
0000100 \0 277 277 W o r l d ! \n
0000113

I know that the tr(1) man page contains a hint about LANG and
environment(7), but I would not expect this to mean that I can't change
a single byte, given as an octal value, just because \357 by itself is
not valid UTF-8.

Any ideas/comments on this?

Thanks

	matthias

-- 
Matthias Apitz               |  /"\   ASCII Ribbon Campaign:
E-mail: guru@unixarea.de     |  \ /   - No HTML/RTF in E-mail
WWW: http://www.unixarea.de/ |   X    - No proprietary attachments
phone: +49-176-38902045      |  / \   - Respect for open standards
                             | en.wikipedia.org/wiki/ASCII_Ribbon_Campaign
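
For anyone who wants to reproduce the byte-level case without the original
/tmp/x, here is a minimal sketch. The scratch file and the variable names
f and patched are made up for illustration; only the tr(1) invocation
mirrors the command above. It uses LC_ALL rather than LANG because LC_ALL
takes precedence over both LC_CTYPE and LANG, so the result should not
depend on what else is set in the environment.

```shell
#!/bin/sh
# Hypothetical scratch file holding the same three bytes as the end of
# /tmp/x in the post: octal 357 277 277.
f=$(mktemp)
printf 'Hello \357\277\277 World!\n' > "$f"

# Force the C locale for tr(1) only, so that '\357' is matched as a
# single byte, and dump the result with od(1) for inspection.
patched=$(LC_ALL=C tr '\357' '\000' < "$f" | od -An -c)
echo "$patched"

rm -f "$f"
```

Setting the variable on the command line keeps the change local to that
one tr(1) process and leaves the rest of the session in its usual locale.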