From: Andrey Chernov
Date: Tue, 9 Mar 2010 22:33:40 +0300
To: Bruce Evans, Jaakko Heinonen, src-committers@FreeBSD.ORG, svn-src-all@FreeBSD.ORG, svn-src-head@FreeBSD.ORG
Message-ID: <20100309193339.GA14612@nagual.pp.ru>
Subject: Re: svn commit: r204803 - head/usr.bin/uniq
In-Reply-To: <20100309175544.GA17698@zim.MIT.EDU>

On Tue, Mar 09, 2010 at 12:55:44PM -0500, David Schultz wrote:
> Actually, a question...why doesn't it suffice to simply call
> strcoll() instead of mbstowcs() followed by wcscoll()?
> I would expect that in the absence of the -i flag, none of
> this would be necessary.

strcoll() works only in single-byte character locales, which rules
out UTF-8, for example. To do what you describe (without converting
to wide chars), we would need a fast mbscoll() function (see our
join.c for its slow emulation using wide chars).

> At the very least, it would make
> sense to start with a strcmp(), and only fall back on the
> expensive conversion and collation if the strings don't
> compare equal.

From what I have seen, files fed to uniq usually have only a few
equal lines and many more unequal ones, so the extra strcmp() would
be pure overhead most of the time.

-- 
http://ache.pp.ru/
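
For readers following the thread, here is a minimal sketch of the kind of
slow wide-char emulation discussed above: convert both multibyte strings
with mbstowcs(), then compare them with wcscoll(). It is not the code from
join.c. The names mbs_to_wcs() and mbscoll_slow() are hypothetical, and the
strcmp() fallback on conversion failure is an assumption made for
illustration only.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: convert a multibyte string to a newly allocated
 * wide-character string.  Returns NULL on an invalid multibyte sequence
 * or on allocation failure.  The caller frees the result.
 */
static wchar_t *
mbs_to_wcs(const char *s)
{
	wchar_t *w;
	size_t n;

	/* With a NULL destination, mbstowcs() only counts the characters. */
	n = mbstowcs(NULL, s, 0);
	if (n == (size_t)-1)
		return (NULL);
	w = malloc((n + 1) * sizeof(*w));
	if (w == NULL)
		return (NULL);
	mbstowcs(w, s, n + 1);
	return (w);
}

/*
 * Hypothetical slow mbscoll(): collate two multibyte strings by going
 * through wide characters.  Two conversions and two allocations per
 * call are what make this expensive.
 */
static int
mbscoll_slow(const char *s1, const char *s2)
{
	wchar_t *w1, *w2;
	int ret;

	w1 = mbs_to_wcs(s1);
	w2 = mbs_to_wcs(s2);
	if (w1 == NULL || w2 == NULL)
		ret = strcmp(s1, s2);	/* assumption: byte-wise fallback */
	else
		ret = wcscoll(w1, w2);
	free(w1);
	free(w2);
	return (ret);
}
```

A caller would normally run setlocale(LC_ALL, "") first so that both the
conversion and the collation follow the user's locale. Without that call
the program stays in the "C" locale, where the comparison is expected to
follow plain code point order.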