Date:      Sun, 7 Apr 2002 20:40:13 +1000
From:      "Tim J. Robbins" <tim@robbins.dropbear.id.au>
To:        freebsd-current@FreeBSD.ORG
Cc:        "Andrey A. Chernov" <ache@nagual.pp.ru>
Subject:   Re: NetBSD sort l10n: I give up!
Message-ID:  <20020407204013.A20639@treetop.robbins.dropbear.id.au>

Here is a patch to make NetBSD's sort(1) sort by the locale's collating
order. The table should no longer be called ascii[], but I can't think of
a better name, and supplying a patch just to rename it would be pointless.

It works. It assumes that the string strxfrm() produces is the same length
as its input, which is always possible, and true on FreeBSD.
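
The assumption above can be checked separately from sort(1). What follows
is only a rough sketch, not part of the patch: build_weights() is a
hypothetical helper name, and the fallback to the raw byte value when the
result does not fit the buffer is my own assumption (the patch simply takes
the first output byte). It builds the same kind of 256-entry table and
counts the bytes whose transformed form is longer than one byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 * Fill weights[] with the first byte strxfrm() produces for each
 * one-byte string under the current LC_COLLATE setting.  Return the
 * number of bytes whose transformed form is longer than one byte,
 * i.e. the cases where a one-byte weight table loses information.
 */
static int
build_weights(unsigned char weights[256])
{
	char in[2], out[8];
	size_t n;
	int i, complicated = 0;

	assert(weights != NULL);
	in[1] = '\0';
	weights[0] = 0;		/* NUL is handled separately, as in the patch */
	for (i = 1; i < 256; i++) {
		in[0] = (char)i;
		n = strxfrm(out, in, sizeof(out));
		if (n > 1)
			complicated++;
		/*
		 * If the result did not fit, out[] is indeterminate;
		 * falling back to the raw byte is an assumption here.
		 */
		weights[i] = (n < sizeof(out)) ? (unsigned char)out[0] :
		    (unsigned char)i;
	}
	return (complicated);
}
```

A caller would run setlocale(LC_COLLATE, "") first and then call
build_weights(); a return value of zero means the one-byte-per-character
assumption holds for every byte in that locale.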

$ env LC_COLLATE=fr_FR.ISO8859-1 sort <test.fr | rs
Ète   elle
$ env LC_COLLATE=fr_FR.ISO8859-1 ./sort <test.fr | rs
Ète   elle
$ rs <test.fr
elle  Ète

Enjoy (?)


Tim


Index: init.c
===================================================================
RCS file: /home/ncvs/src/contrib/sort/init.c,v
retrieving revision 1.2
diff -u -r1.2 init.c
--- init.c	2002/04/07 00:49:00	1.2
+++ init.c	2002/04/07 10:29:59
@@ -46,6 +46,7 @@
 #endif /* not lint */
 
 #include <ctype.h>
+#include <err.h>
 #include <string.h>
 
 static void insertcol __P((struct field *));
@@ -291,8 +292,7 @@
  * Note: when sorting in forward order, to encode character zero in a key,
  * use \001\001; character 1 becomes \001\002.  In this case, character 0
  * is reserved for the field delimiter.  Analagously for -r (fld_d = 255).
- * Note: this is only good for ASCII sorting.  For different LC 's,
- * all bets are off.  See also num_init in number.c
+ * See also num_init in number.c
  */
 void
 settables(gflags)
@@ -300,8 +300,20 @@
 {
 	u_char *wts;
 	int i, incr;
+	static int warned;
+	char abuf[2], xbuf[8];
+
+	abuf[1] = '\0';
 	for (i=0; i < 256; i++) {
-		ascii[i] = i;
+		if (i != 0) {
+			*abuf = i;
+			if (strxfrm(xbuf, abuf, sizeof(xbuf)) > 1 && !warned) {
+				warnx("collating order too complicated");
+				warned = 1;
+			}
+			ascii[i] = *xbuf;
+		} else
+			ascii[i] = 0;
 		if (i > REC_D && i < 255 - REC_D+1)
 			Rascii[i] = 255 - i + 1;
 		else
