Message-ID: <48728301.5070403@FreeBSD.org>
Date: Mon, 07 Jul 2008 22:56:33 +0200
From: Gabor Kovesdan
To: Kris Kennaway
Cc: Maxim Sobolev, Doug Barton, current@FreeBSD.org, Andrey Chernov,
    Konrad Jankowski, Diomidis Spinellis, hackers@FreeBSD.org,
    Dag-Erling Smørgrav, "Sean C. Farley", Max Khon
Subject: Re: CFT: BSD-licensed grep [Fwd: cvs commit: ports/textproc/bsdgrep Makefile distinfo]

Kris Kennaway wrote:
> Andrey Chernov wrote:
>> On Mon, Jul 07, 2008 at 10:06:31PM +0200, Kris Kennaway wrote:
>>> What regression suites do other implementations have? e.g. the GNU
>>> textutils.
>>
>> They basically have regex tests, but nothing locale specific, since
>> locale ordering is different from platform to platform (until Unicode
>> Collation Algorithm will win).
>>
>
> OK. Well at least it is a start - passing those existing regression
> tests should be a goal.

Well, it seems you have missed the first bits of the discussion. GNU
grep has some regression tests, which it doesn't pass completely itself
either. :) I've mentioned here that I used those tests to find out
which options are incompatible. Unfortunately, I have to say that BSD
grep won't pass all of them, because GNU allows some non-standard
regexes that are rejected by our libc regex library. For example, (a|)
is not standard because it has an empty subexpression.

At first, I tried to pre-edit such expressions in the code. It was ugly,
but I thought: "Ok, this code is pretty ugly, but compatibility is
important, maybe we can later revise and/or change our regexp library
and get rid of these snippets." Later, when Andrey pointed it out, I
realized that my workarounds addressed those incompatibilities but
didn't work completely; they broke compatibility in other places. Since
that was not easy to fix, I just removed them. The version that I sent
you for the portbuild test doesn't have those workarounds.

The regression tests did help, though, to fix other compatibility
issues, like return values.
All of these trivial things are supposed to be compatible now; the only
exceptions are the non-standard regexes. That's why I'm so curious about
the results. If they are unacceptable, we can try to build BSD grep with
the GNU regexp lib (it's in the tree, as Pedro F. Giffuni pointed out).
It doesn't work by just linking with that library, so it will need more
work and investigation, not to mention that GNU regex should go one
day...

Regards,
Gábor
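
For reference, the empty-subexpression case mentioned above can be
checked directly against whatever regex library a program is linked
with. This is only a minimal sketch, assuming a POSIX <regex.h>; the
helper name try_pattern() is made up for illustration and is not part
of BSD grep. Whether (a|) is accepted depends on the library in use.

```c
#include <regex.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from BSD grep): compile `pattern` as a POSIX
 * extended regex and report whether the linked regex library accepts it.
 * Returns 0 if accepted, otherwise the regcomp() error code.
 */
static int
try_pattern(const char *pattern)
{
	regex_t re;
	char errbuf[128];
	int rc;

	rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
	if (rc != 0) {
		/* Turn the error code into a human-readable message. */
		regerror(rc, &re, errbuf, sizeof(errbuf));
		printf("%-8s rejected: %s\n", pattern, errbuf);
		return (rc);
	}
	printf("%-8s accepted\n", pattern);
	regfree(&re);
	return (0);
}
```

Calling try_pattern("(a|b)") and then try_pattern("(a|)") from main()
shows how a given library treats the empty alternative: a library that
rejects it, as our libc regex does, returns a non-zero code for the
second call, while one that allows it, as GNU does, returns 0.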