Date:      Thu, 6 Jun 2002 16:03:18 +0400
From:      "Andrey A. Chernov" <ache@nagual.pp.ru>
To:        "Tim J. Robbins" <tjr@FreeBSD.org>
Cc:        cvs-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject:   Re: cvs commit: src/usr.bin/uniq uniq.c
Message-ID:  <20020606120315.GB87781@nagual.pp.ru>
In-Reply-To: <20020606202942.A45282@treetop.robbins.dropbear.id.au>
References:  <200206060313.g563DAi26751@freefall.freebsd.org> <20020606031545.GA83612@nagual.pp.ru> <20020606161843.A44561@treetop.robbins.dropbear.id.au> <20020606083246.GA85860@nagual.pp.ru> <20020606192402.A45186@treetop.robbins.dropbear.id.au> <20020606100352.GA86621@nagual.pp.ru> <20020606202942.A45282@treetop.robbins.dropbear.id.au>

On Thu, Jun 06, 2002 at 20:29:42 +1000, Tim J. Robbins wrote:
> 
> strcoll() should not indicate that these strings are identical. If it does,
> it is incorrectly implemented. FreeBSD's strcoll() and strxfrm() are
> incorrectly implemented: strcoll("ss", "\xdf") == 0 in some locales on FreeBSD,
> but equals 1, -1 or -108 on all Solaris locales.

It seems you don't understand collation at all. Please at least read the
POSIX collation description before making any absurd statements.  strcoll()
indicates that these strings are identical _only_ if directly instructed
by the collating table to do so.  For the FreeBSD collate source format I
mean this directive:

substitute <ss> with "ss"

It means that <ss> is replaced with "ss" before any comparison happens.
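
To illustrate the effect of such a directive, here is a minimal sketch in C.
It is not the FreeBSD collate code: the helpers expand_sharp_s() and
toy_collate_cmp() are hypothetical names, and the sketch assumes a
single-byte ISO 8859-1 encoding in which <ss> is the byte 0xDF.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy `in` to `out`, expanding the byte 0xDF
 * (sharp s in ISO 8859-1, an assumption of this sketch) to "ss".
 * Returns 0 on success, -1 if `out` is too small.
 */
int expand_sharp_s(const char *in, char *out, size_t outsz)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    if (outsz == 0)
        return -1;
    for (; *in != '\0'; in++) {
        if ((unsigned char)*in == 0xDF) {
            if (n + 2 >= outsz)     /* two bytes plus the terminator */
                return -1;
            out[n++] = 's';
            out[n++] = 's';
        } else {
            if (n + 1 >= outsz)     /* one byte plus the terminator */
                return -1;
            out[n++] = *in;
        }
    }
    out[n] = '\0';
    return 0;
}

/*
 * Toy comparison: apply the substitution first, then compare the
 * expanded strings.  Falls back to a plain byte comparison if an
 * input does not fit the fixed-size buffers.
 */
int toy_collate_cmp(const char *a, const char *b)
{
    char ea[256], eb[256];

    if (expand_sharp_s(a, ea, sizeof(ea)) != 0 ||
        expand_sharp_s(b, eb, sizeof(eb)) != 0)
        return strcmp(a, b);
    return strcmp(ea, eb);
}
```

With this toy rule, toy_collate_cmp("ss", "\xdf") returns 0 while
strcmp("ss", "\xdf") does not, which is the behaviour being discussed.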

> strcoll() is the correct function to use to compare sorting order of text
> strings.

And if strings have exactly the same sorting order, they are equal.

> uniq is not interested in the sort order of strings, it is interested in
> whether two lines of text are identical. If the sort utility is operating

You treat uniq as a binary compare, but that is only a subset of its
functionality. It can be used for real-life languages too, in a localized
model.
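
The two notions of "same line" at issue can be sketched in C as follows.
This is only an illustration, not the code in uniq.c; the function names
lines_equal_binary() and lines_equal_collate() are made up for the example.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Binary model: two lines are the same only if every byte matches. */
int lines_equal_binary(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}

/*
 * Localized model: two lines are the same if the LC_COLLATE rules of
 * the current locale give them the same sorting order.
 */
int lines_equal_collate(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcoll(a, b) == 0;
}
```

A caller would normally run setlocale(LC_COLLATE, "") first so that
strcoll() follows the user's locale. In the "C" locale both functions
agree; they can differ only where the collating table says so.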

> This behaviour is simply not correct, and the bug lies in FreeBSD's old
> uniq implementation, not GNU sort.

There is another bug in GNU sort in that area, caused by some limitations
of GNU sort l10n (in some cases strings are compared character by
character).  Don't make it worse by breaking uniq too.

> I shall not back this change out.

As the person responsible for FreeBSD locale, I insist on backing it out.

I have already stated my reasons. Please do not write to me until you have
finished reading the POSIX collating model description; I do not have time
to educate people again and again.

-- 
Andrey A. Chernov
http://ache.pp.ru/
