Date: Thu, 6 Jun 2002 16:03:18 +0400
From: "Andrey A. Chernov" <ache@nagual.pp.ru>
To: "Tim J. Robbins" <tjr@FreeBSD.org>
Cc: cvs-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject: Re: cvs commit: src/usr.bin/uniq uniq.c
Message-ID: <20020606120315.GB87781@nagual.pp.ru>
In-Reply-To: <20020606202942.A45282@treetop.robbins.dropbear.id.au>
References: <200206060313.g563DAi26751@freefall.freebsd.org> <20020606031545.GA83612@nagual.pp.ru> <20020606161843.A44561@treetop.robbins.dropbear.id.au> <20020606083246.GA85860@nagual.pp.ru> <20020606192402.A45186@treetop.robbins.dropbear.id.au> <20020606100352.GA86621@nagual.pp.ru> <20020606202942.A45282@treetop.robbins.dropbear.id.au>

On Thu, Jun 06, 2002 at 20:29:42 +1000, Tim J. Robbins wrote:
>
> strcoll() should not indicate that these strings are identical. If it does,
> it is incorrectly implemented. FreeBSD's strcoll() and strxfrm() are
> incorrectly implemented: strcoll("ss", "\xdf") == 0 in some locales on FreeBSD,
> but equals 1, -1 or -108 on all Solaris locales.

It seems you don't understand collation at all. Please at least read the
POSIX collation description before making any absurd statements. strcoll()
indicates that these strings are identical _only_ if the collating table
directly instructs it to do so. In the FreeBSD collate source format, I mean
this directive:

    substitute <ss> with "ss"

It means that <ss> is replaced with "ss" before any comparison happens.

> strcoll() is the correct function to use to compare sorting order of text
> strings.

And if strings have exactly the same sorting order, they are equal.

> uniq is not interested in the sort order of strings, it is interested in
> whether two lines of text are identical. If the sort utility is operating

You treat uniq as a binary compare, but that is only a subset of its
functionality. It can be used for real-life languages too, in a localized
model.

> This behaviour is simply not correct, and the bug lies in FreeBSD's old
> uniq implementation, not GNU sort.

There is another bug in GNU sort in that area, caused by some limitations of
GNU sort l10n (in some cases strings are compared character by character).
Don't make it worse by breaking uniq too.

> I shall not back this change out.

As the person responsible for the FreeBSD locale code, I insist on backing it
out. I have already stated my reasons. Please do not write to me until you
have finished reading the POSIX collating model description; I do not have
time to educate people again and again.

-- 
Andrey A. Chernov
http://ache.pp.ru/
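
Editorial sketch of the two notions of "same line" argued over in this message, offered as an illustration only and not as the code from uniq.c. The helper names same_by_collation() and same_by_bytes() are hypothetical. Whether strcoll() returns 0 for "ss" and "\xdf" depends entirely on the collating table of the active LC_COLLATE locale, so no particular result is assumed here for any named locale.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: two lines count as the same when the current
 * LC_COLLATE locale gives them the same collation order. This is the
 * localized behaviour defended in the message above.
 */
int same_by_collation(const char *a, const char *b)
{
    return strcoll(a, b) == 0;
}

/*
 * Hypothetical helper: two lines count as the same only when they are
 * byte-for-byte identical. This is the behaviour argued for in the
 * quoted text.
 */
int same_by_bytes(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* Print how both notions judge one pair of lines. */
void report_pair(const char *a, const char *b)
{
    printf("collation: %s, bytes: %s\n",
        same_by_collation(a, b) ? "same" : "different",
        same_by_bytes(a, b) ? "same" : "different");
}
```

A caller would first select a locale with setlocale(LC_COLLATE, "") and then call report_pair("ss", "\xdf"). In the "C" locale both helpers agree, because strcoll() there orders strings the same way strcmp() does; they can diverge only in a locale whose collating table maps the two strings to the same order.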
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20020606120315.GB87781>