From: Rainer Hurling
Date: Thu, 29 Jan 2015 17:03:21 +0100
To: Konstantin Belousov
Cc: Kubilay Kocak, Robert Simmons, freebsd-python@freebsd.org
Subject: Re: Unicode Problem
Message-ID: <54CA59C9.4090002@gwdg.de>
In-Reply-To: <20150129095328.GQ42409@kib.kiev.ua>
References: <54C9FE33.2070307@FreeBSD.org> <20150129095328.GQ42409@kib.kiev.ua>
List-Id: FreeBSD-specific Python issues

On 29.01.2015 at 10:53, Konstantin Belousov wrote:
> On Thu, Jan 29, 2015 at 08:32:35PM +1100, Kubilay Kocak wrote:
>> On 29/01/2015 6:13 PM, Robert Simmons wrote:
>>> On further inspection I've found the following:
>>>
>>> FreeBSD
>>> >>> import sys
>>> >>> print(sys.getdefaultencoding())
>>> utf-8
>>> >>> print(sys.stdout.encoding)
>>> US-ASCII
>>>
>>> MacOS X:
>>> >>> import sys
>>> >>> print(sys.getdefaultencoding())
>>> utf-8
>>> >>> print(sys.stdout.encoding)
>>> UTF-8
>>>
>>> How do I modify stdout encoding to set it to UTF-8 in FreeBSD?
>>
>> Another data point from my 9-STABLE:
>>
>> Python 3.4.2 (default, Nov 3 2014, 13:38:18)
>> [GCC 4.2.1 Compatible FreeBSD Clang 3.4.1 (tags/RELEASE_34/dot1-final
>> 208032)] on freebsd9
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> b'\xc3\xa2'.decode('utf-8')
>> '??'
>> >>> import sys
>> >>> print(sys.getdefaultencoding())
>> utf-8
>> >>> print(sys.stdout.encoding)
>> UTF-8
>> >>>
>>
>> Python 2.7.9 (default, Jan 24 2015, 20:39:40)
>> [GCC 4.2.1 Compatible FreeBSD Clang 3.4.1 (tags/RELEASE_34/dot1-final
>> 208032)] on freebsd9
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> b'\xc3\xa2'.decode('utf-8')
>> u'\xe2'
>> >>> import sys
>> >>> print(sys.getdefaultencoding())
>> ascii
>> >>> print(sys.stdout.encoding)
>> UTF-8
>> >>>

On my box, with a recent HEAD amd64, it behaves the same as for Koobs:

#locale
LANG=de_DE.UTF-8
LC_CTYPE="de_DE.UTF-8"
LC_COLLATE=C
LC_TIME="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_ALL=

#python3
Python 3.4.2 (default, Jan 11 2015, 07:51:41)
[GCC 4.2.1 Compatible FreeBSD Clang 3.5.0 (tags/RELEASE_350/final 216957)] on freebsd11
Type "help", "copyright", "credits" or "license" for more information.
### b'\xc3\xa2'.decode('utf-8')
'â'

[For Python 2.7, option UCS4 enabled]
#python
Python 2.7.9 (default, Jan 24 2015, 10:35:50)
[GCC 4.2.1 Compatible FreeBSD Clang 3.5.1 (tags/RELEASE_351/final 225668)] on freebsd11
Type "help", "copyright", "credits" or "license" for more information.
### b'\xc3\xa2'.decode('utf-8')
u'\xe2'

So there is evidently a difference between the Python versions, independent of
the locale settings?
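
To separate what the decoded string contains from how each interpreter displays
it, a check along the following lines might help. It is only a sketch for
Python 3; the helper names describe and io_settings are made up for
illustration and are not part of any library. The output is passed through
ascii() so that it also prints on a US-ASCII stdout.

```python
import locale
import sys

# The two bytes 0xC3 0xA2 are the UTF-8 encoding of U+00E2 (a with circumflex).
text = b'\xc3\xa2'.decode('utf-8')


def describe(s):
    """Return facts about s that do not depend on the terminal or locale."""
    return {
        'length': len(s),
        'code_points': [hex(ord(ch)) for ch in s],
        # Python 3 repr() keeps printable non-ASCII characters as they are.
        'repr': repr(s),
        # ascii() escapes them, much like the display Python 2 uses.
        'ascii': ascii(s),
    }


def io_settings():
    """Collect the values that were compared in this thread (hypothetical helper)."""
    return {
        'stdout_encoding': getattr(sys.stdout, 'encoding', None),
        'preferred_encoding': locale.getpreferredencoding(False),
        'default_encoding': sys.getdefaultencoding(),
    }


# ascii() keeps the report printable even when stdout cannot encode U+00E2.
print(ascii(describe(text)))
print(ascii(io_settings()))
```

If the first report shows a single code point, 0xe2, then the decoding itself
worked, and u'\xe2' in Python 2.7 would just be the escaped display of that
same character. If stdout_encoding comes back as US-ASCII, the interpreter is
probably seeing a C/POSIX locale; setting LANG or LC_ALL to a UTF-8 locale, or
setting PYTHONIOENCODING=utf-8, are the usual things to try.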