From owner-freebsd-hackers Sun Oct 15 03:54:27 1995
Return-Path: owner-hackers
Received: (from root@localhost) by freefall.freebsd.org (8.6.12/8.6.6) id DAA21022 for hackers-outgoing; Sun, 15 Oct 1995 03:54:27 -0700
Received: from expo.x.org (expo.x.org [198.112.45.11]) by freefall.freebsd.org (8.6.12/8.6.6) with ESMTP id DAA21017 for ; Sun, 15 Oct 1995 03:54:25 -0700
Received: from exalt.x.org by expo.x.org id AA07462; Sun, 15 Oct 95 06:53:53 -0400
Received: from localhost by exalt.x.org id GAA03600; Sun, 15 Oct 1995 06:53:52 -0400
Message-Id: <199510151053.GAA03600@exalt.x.org>
To: hackers@freefall.FreeBSD.org
Subject: A couple problems in FreeBSD 2.1.0-950922-SNAP
Date: Sun, 15 Oct 1995 06:53:52 EST
From: "Kaleb S. KEITHLEY"
Sender: owner-hackers@FreeBSD.org
Precedence: bulk

Sorry if this shows up twice. The one I sent last night hasn't come back
yet; I don't know if it got lost in the ether.

1)

% man -k rune
EUC(4) - EUC encoding of runes
UTF2(4) - Universal character set Transformation Format encoding of runes
mbrune(3), mbrrune(3), mbmb(3) - multibyte rune support for C
setrunelocale(3), setinvalidrune(3), sgetrune(3), sputrune(3) - rune support for C
% man setrunelocale
No manual entry for setrunelocale

2) If I create a file that has extended ASCII (ISO8859-1) characters in
the name, ls always substitutes a '?' for the non-ASCII characters. Note
that ls on, e.g., SVR4 does not do this.

I looked at the source for ls and I see that the conversion occurs when
-q is specified (the default in any event). The FreeBSD ls man page says:

     -q      Force printing of non-graphic characters in file names as the
             character `?'; this is the default when output is to a terminal.

and just for reference the SVR4 man page says:

     -q      Force printing of non-printable characters in file names as
             the character question mark (?). All multibyte characters
             are considered printable.

So I think the test isprint in ls really ought to be isgraph instead.

But just fixing ls isn't enough.
The default table of character types in libc/locale/table.c isn't
populated well enough to handle the whole ISO8859-1 character set.

The following patch fixes ls, libc, and also fixes some bugs in
mklocale's lt_LN LC_CTYPE template.

*** bin/ls/util.c.orig	Sat Oct 14 15:55:03 1995
--- bin/ls/util.c	Sat Oct 14 17:02:00 1995
***************
*** 33,39 ****
   * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
   * SUCH DAMAGE.
   *
!  *	$Id: util.c,v 1.4 1994/10/09 15:25:23 ache Exp $
   */
  
  #ifndef lint
--- 33,39 ----
   * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
   * SUCH DAMAGE.
   *
!  *	$Id: util.c,v 1.4 1994/10/09 15:25:23 ache Exp mumble$
   */
  
  #ifndef lint
***************
*** 61,67 ****
  
  	while (len--) {
  		ch = *src++;
! 		*dest++ = isprint(ch) ? ch : '?';
  	}
  }
  
--- 61,67 ----
  
  	while (len--) {
  		ch = *src++;
! 		*dest++ = isgraph(ch) ? ch : '?';
  	}
  }
  
*** lib/libc/locale/table.c.orig	Sat Oct 14 16:24:22 1995
--- lib/libc/locale/table.c	Sat Oct 14 17:00:40 1995
***************
*** 35,41 ****
   */
  
  #if defined(LIBC_SCCS) && !defined(lint)
! static char sccsid[] = "@(#)table.c	8.1 (Berkeley) 6/27/93";
  #endif /* LIBC_SCCS and not lint */
  
  #include
--- 35,41 ----
   */
  
  #if defined(LIBC_SCCS) && !defined(lint)
! static char sccsid[] = "@(#)table.c	8.1 (Berkeley) 6/27/93"; mumble
  #endif /* LIBC_SCCS and not lint */
  
  #include
***************
*** 86,91 ****
--- 86,123 ----
  		_L|_R|_G|_A,	_L|_R|_G|_A,	_L|_R|_G|_A,	_L|_R|_G|_A,
  	/*78*/	_L|_R|_G|_A,	_L|_R|_G|_A,	_L|_R|_G|_A,	_P|_R|_G,
  		_P|_R|_G,	_P|_R|_G,	_P|_R|_G,	_C,
+ 	/*80*/	_C,		_C,		_C,		_C,
+ 		_C,		_C,		_C,		_C,
+ 	/*88*/	_C,		_C,		_C,		_C,
+ 		_C,		_C,		_C,		_C,
+ 	/*90*/	_C,		_C,		_C,		_C,
+ 		_C,		_C,		_C,		_C,
+ 	/*98*/	_C,		_C,		_C,		_C,
+ 		_C,		_C,		_C,		_C,
+ 	/*a0*/	_S,		_P|_R|_G,	_P|_R|_G,	_P|_R|_G,
+ 		_P|_R|_G,	_P|_R|_G,	_P|_R|_G,	_P|_R|_G,
+ 	/*a8*/	_P|_R|_G,	_P|_R|_G,	_P|_R|_G,	_P|_R|_G,
+ 		_P|_R|_G,	_P|_R|_G,	_P|_R|_G,	_P|_R|_G,
+ 	/*b0*/	_P|_R|_G,	_P|_R|_G,	_P|_R|_G,	_P|_R|_G,
+ 		_P|_R|_G,	_P|_R|_G,	_P|_R|_G,	_P|_R|_G,
+ 	/*b8*/	_P|_R|_G,	_P|_R|_G,	_P|_R|_G,	_P|_R|_G,
+ 		_P|_R|_G,	_P|_R|_G,	_P|_R|_G,	_P|_R|_G,
+ 	/*c0*/	_U|_R|_G,	_U|_R|_G|_A,	_U|_R|_G|_A,	_U|_R|_G|_A,
+ 		_U|_R|_G|_A,	_U|_R|_G|_A,	_U|_R|_G|_A,	_U|_R|_G|_A,
+ 	/*c8*/	_U|_R|_G|_A,	_U|_R|_G|_A,	_U|_R|_G|_A,	_U|_R|_G|_A,
+ 		_U|_R|_G|_A,	_U|_R|_G|_A,	_U|_R|_G|_A,	_U|_R|_G|_A,
+ 	/*d0*/	_U|_R|_G|_A,	_U|_R|_G|_A,	_U|_R|_G|_A,	_U|_R|_G|_A,
+ 		_U|_R|_G|_A,	_U|_R|_G|_A,	_U|_R|_G|_A,	_P|_R|_G,
+ 	/*d8*/	_U|_R|_G|_A,	_U|_R|_G|_A,	_U|_R|_G|_A,	_U|_R|_G|_A,
+ 		_U|_R|_G|_A,	_U|_R|_G|_A,	_U|_R|_G|_A,	_L|_R|_G|_A,
+ 	/*e0*/	_L|_R|_G|_A,	_L|_R|_G|_A,	_L|_R|_G|_A,	_L|_R|_G|_A,
+ 		_L|_R|_G|_A,	_L|_R|_G|_A,	_L|_R|_G|_A,	_L|_R|_G|_A,
+ 	/*e8*/	_L|_R|_G|_A,	_L|_R|_G|_A,	_L|_R|_G|_A,	_L|_R|_G|_A,
+ 		_L|_R|_G|_A,	_L|_R|_G|_A,	_L|_R|_G|_A,	_L|_R|_G|_A,
+ 	/*f0*/	_L|_R|_G|_A,	_L|_R|_G|_A,	_L|_R|_G|_A,	_L|_R|_G|_A,
+ 		_L|_R|_G|_A,	_L|_R|_G|_A,	_L|_R|_G|_A,	_P|_R|_G,
+ 	/*f8*/	_L|_R|_G|_A,	_L|_R|_G|_A,	_L|_R|_G|_A,	_L|_R|_G|_A,
+ 		_L|_R|_G|_A,	_L|_R|_G|_A,	_L|_R|_G|_A,	_L|_R|_G|_A
      },
      {	0x00,	0x01,	0x02,	0x03,	0x04,	0x05,	0x06,	0x07,
       	0x08,	0x09,	0x0a,	0x0b,	0x0c,	0x0d,	0x0e,	0x0f,
*** usr.bin/mklocale/data/lt_LN.ISO8859-1.orig	Sat Oct 14 16:00:43 1995
--- usr.bin/mklocale/data/lt_LN.ISO8859-1	Sat Oct 14 16:15:43 1995
***************
*** 11,22 ****
  CONTROL	0x00 - 0x1f 0x7f - 0x9f
  DIGIT		'0' - '9'
  GRAPH		0x21 - 0x7e 0xa0 - 0xff
! LOWER		'a' - 'z' 0xe0 - 0xff
! PUNCT		0x21 - 0x2f 0x3a - 0x40 0x5b - 0x60 0x7b - 0x7e 0xa1 - 0xbf
  SPACE		0x09 - 0x0d 0x20 0xa0
! UPPER		'A' - 'Z' 0xc0 - 0xde
! XDIGIT	'a' - 'f' 'A' - 'F'
! BLANK		' ' '\t' 0xa0
  PRINT		0x20 - 0x7e 0xa0 - 0xff
  # IDEOGRAM
  # SPECIAL
--- 11,23 ----
  CONTROL	0x00 - 0x1f 0x7f - 0x9f
  DIGIT		'0' - '9'
  GRAPH		0x21 - 0x7e 0xa0 - 0xff
! LOWER		'a' - 'z' 0xdf - 0xf6 0xf8 - 0xff
! PUNCT		0x21 - 0x2f 0x3a - 0x40 0x5b - 0x60 0x7b - 0x7e 0xa1 - 0xbf 0xd7 0xf7
  SPACE		0x09 - 0x0d 0x20 0xa0
! UPPER		'A' - 'Z' 0xc0 - 0xd6 0xd8 - 0xde
! XDIGIT	'0' - '9' 'a' - 'f' 'A' - 'F'
! # only one true blank in 8859-1
! BLANK		0x20
  PRINT		0x20 - 0x7e 0xa0 - 0xff
  # IDEOGRAM
  # SPECIAL
***************
*** 24,35 ****
  
  MAPLOWER	<'A' - 'Z' : 'a'>
  MAPLOWER	<'a' - 'z' : 'a'>
! MAPLOWER	<0xc0 - 0xdd : 0xe0>
! MAPLOWER	<0xe0 - 0xff : 0xe0>
  MAPUPPER	<'A' - 'Z' : 'A'>
  MAPUPPER	<'a' - 'z' : 'A'>
! MAPUPPER	<0xc0 - 0xdd : 0xc0>
! MAPUPPER	<0xe0 - 0xff : 0xc0>
  TODIGIT	<'0' - '9' : 0>
  TODIGIT	<'A' - 'F' : 10>
  TODIGIT	<'a' - 'f' : 10>
--- 25,41 ----
  
  MAPLOWER	<'A' - 'Z' : 'a'>
  MAPLOWER	<'a' - 'z' : 'a'>
! MAPLOWER	<0xc0 - 0xd6 : 0xe0>
! MAPLOWER	<0xd8 - 0xde : 0xe0>
! MAPLOWER	<0xdf - 0xf6 : 0xe0>
! MAPLOWER	<0xf8 - 0xfe : 0xe0>
! 
  MAPUPPER	<'A' - 'Z' : 'A'>
  MAPUPPER	<'a' - 'z' : 'A'>
! MAPUPPER	<0xc0 - 0xd6 : 0xc0>
! MAPUPPER	<0xd8 - 0xde : 0xc0>
! MAPUPPER	<0xdf - 0xf6 : 0xc0>
! MAPUPPER	<0xf8 - 0xfe : 0xc0>
  TODIGIT	<'0' - '9' : 0>
  TODIGIT	<'A' - 'F' : 10>
  TODIGIT	<'a' - 'f' : 10>
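
For anyone who wants to see what the isgraph test does to a file name
without rebuilding libc, here is a minimal standalone sketch. It is not
part of the patch. It assumes the GRAPH ranges from the lt_LN template
above (0x21 - 0x7e and 0xa0 - 0xff) and hard-codes them in a helper;
the names latin1_isgraph() and prcopy_latin1() are made up for this
illustration and are not libc or ls functions.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not a libc function: true for bytes in the
 * GRAPH ranges of the lt_LN template (0x21 - 0x7e, 0xa0 - 0xff).
 */
static int
latin1_isgraph(unsigned char ch)
{
	return (ch >= 0x21 && ch <= 0x7e) || ch >= 0xa0;
}

/*
 * Same shape as the copy loop in bin/ls/util.c, but using the
 * hard-coded helper above instead of the locale-dependent isgraph().
 * Copies len bytes from src to dest, replacing non-graphic bytes
 * with '?'. Does not NUL-terminate dest.
 */
static void
prcopy_latin1(const char *src, char *dest, size_t len)
{
	unsigned char ch;

	while (len--) {
		ch = (unsigned char)*src++;
		*dest++ = latin1_isgraph(ch) ? (char)ch : '?';
	}
}
```

Run over the five bytes "caf\xe9\x01", this leaves the 0xe9 (e-acute)
alone and turns the trailing 0x01 into '?'. Note that 0x20 is outside
GRAPH, so a space in a file name is also replaced under this test.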