Date:      Sat,  1 Sep 2012 00:12:28 +0300 (EEST)
From:      Vitaly Magerya <vmagerya@gmail.com>
To:        <freebsd-python@freebsd.org>
Subject:   Making Python's curses module work with Unicode
Message-ID:  <20120831211231.46BFA2E9E0@smtp.tx97.net>

Hi, folks. I'm having a problem with Python's curses module: it
doesn't display Unicode text correctly. Here's a test case:

    import curses
    import locale

    locale.setlocale(locale.LC_ALL, "")

    def run(win):
        win.addstr(u"\u03c0r\u00b2".encode("utf-8"))
        win.getch()

    curses.wrapper(run)

What I expect to see (and what I do see on, e.g., Linux) is
"πr²" (that is, Greek "pi", Latin "r", superscript "2"); what I
actually get is "M-O~@rM-BM-2".

(Note that this is with LC_ALL set to "en_US.UTF-8" everywhere;
I'm seeing the same output in uxterm and putty, with and without
tmux inside of those).
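
As a rough illustration of where those particular characters come
from: the output matches what you get if every byte of the UTF-8
encoding is displayed on its own instead of being decoded as part
of a multibyte character. The helper name byte_display below is
made up for this example, and its rules are only a model that
reproduces the output I observed; it is not code taken from ncurses.

```python
def byte_display(data):
    """Render each byte separately, the way a non-wide curses build
    appears to (a hypothetical model, not ncurses code)."""
    out = []
    for b in bytearray(data):  # bytearray yields ints on Python 2 and 3
        if b < 32:
            out.append("^" + chr(b + 64))     # control characters
        elif b < 127:
            out.append(chr(b))                # printable ASCII, unchanged
        elif b == 127:
            out.append("^?")                  # DEL
        elif b < 160:
            out.append("~" + chr(b - 64))     # 0x80..0x9f
        else:
            out.append("M-" + chr(b - 128))   # high bit set: "meta" prefix
    return "".join(out)


encoded = u"\u03c0r\u00b2".encode("utf-8")    # bytes: cf 80 72 c2 b2
rendered = byte_display(encoded)
print(rendered)
```

Here 0xcf becomes "M-O", 0x80 becomes "~@", "r" passes through, and
0xc2 0xb2 become "M-B" and "M-2", which together give the string
quoted above.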

I think this happens because Python's curses module is linked
against libncurses, not libncursesw as it is on other platforms:

    $ ldd /usr/local/lib/python2.7/lib-dynload/_curses.so
    /usr/local/lib/python2.7/lib-dynload/_curses.so:
            libncurses.so.8 => /lib/libncurses.so.8 (0x801212000)
            libthr.so.3 => /lib/libthr.so.3 (0x80145f000)
            libc.so.7 => /lib/libc.so.7 (0x80084a000)

In fact, this appears to be intentional; here's a part of
lang/python27/files/patch-setup.py:

@@ -642,7 +642,7 @@
         # use the same library for the readline and curses modules.
         if 'curses' in readline_termcap_library:
             curses_library = readline_termcap_library
-        elif self.compiler.find_library_file(lib_dirs, 'ncursesw'):
+        elif self.compiler.find_library_file(lib_dirs, 'xxxncursesw'):
             curses_library = 'ncursesw'
         elif self.compiler.find_library_file(lib_dirs, 'ncurses'):
             curses_library = 'ncurses'
@@ -1246,12 +1248,13 @@
         # provided by the ncurses library.
         panel_library = 'panel'
         if curses_library.startswith('ncurses'):
-            if curses_library == 'ncursesw':
+            if curses_library == 'xxxncursesw':
                 # Bug 1464056: If _curses.so links with ncursesw,
                 # _curses_panel.so must link with panelw.
                 panel_library = 'panelw'

After some digging through commit logs, I found that this change
originated in [1] as a fix for ports/99496 [2].

Note that this patch is a no-op in 2.7 and in 3.2, so it can be
safely removed: a few lines above the first chunk of the patch,
setup.py determines which curses library readline uses
(libncurses.so on FreeBSD) and uses that itself (unless readline
uses libtinfo, in which case Python links with libncursesw).
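
A simplified sketch of that selection order, based only on the
branches visible in the patch context quoted above. The function
choose_curses_library, the "available" set, and the "wide_probe"
argument are stand-ins I made up for illustration; the real
setup.py calls self.compiler.find_library_file and has more
branches than shown here.

```python
def choose_curses_library(readline_termcap_library, available,
                          wide_probe="ncursesw"):
    """Model of the curses library choice (hypothetical helper).

    readline_termcap_library: the library readline is linked against.
    available: set of library names the build can find.
    wide_probe: the name looked up for the wide-character library;
                the FreeBSD patch changes it to 'xxxncursesw'.
    """
    if "curses" in readline_termcap_library:
        # readline already uses a curses library: reuse it as-is.
        return readline_termcap_library
    elif wide_probe in available:
        return "ncursesw"
    elif "ncurses" in available:
        return "ncurses"
    return ""


libs = {"ncurses", "ncursesw"}

# readline linked against libncurses: the probe name never matters.
unpatched = choose_curses_library("ncurses", libs)
patched = choose_curses_library("ncurses", libs, wide_probe="xxxncursesw")
print(unpatched, patched)

# readline linked against libtinfo: only here does the probe matter.
tinfo_unpatched = choose_curses_library("tinfo", libs)
tinfo_patched = choose_curses_library("tinfo", libs, wide_probe="xxxncursesw")
print(tinfo_unpatched, tinfo_patched)
```

In this model the first branch returns before the renamed probe is
ever consulted whenever readline links against libncurses, which is
why the patch changes nothing in that configuration.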

More importantly though, if I remove all that logic and simply
set "curses_library = 'ncursesw'", thus making _curses.so link
with libncursesw.so, nothing breaks: python builds and runs fine,
and my example above works as expected too. Even if I have
devel/ncurses installed (as described in ports/99496), _curses.so
still links with libncursesw.so from base and everything seems
to work.

In short, I'm proposing a patch [3] against lang/python{27,32}
that removes all that logic and makes _curses.so link with
libncursesw. As you can see from redports logs at [4], it builds
fine on all releases (I also tested python27 with devel/ncurses
installed, which worked too).

Can I have a brave committer to either accept the patch (in which
case I'll work on the same for other python versions), or point
me to whatever problems it may trigger (in which case, I'll test
for them)?

[1] http://www.freebsd.org/cgi/cvsweb.cgi/ports/lang/python24/files/Attic/patch-setup.py#rev1.8
[2] http://www.freebsd.org/cgi/query-pr.cgi?pr=99496 
[3] http://redports.org/changeset/6571
[4] http://redports.org/buildarchive/20120831205350-31320/


