From: Doug Barton
Date: Sat, 14 Aug 2010 18:12:34 -0700
To: Gabor Kovesdan
Cc: delphij@FreeBSD.org, core@FreeBSD.org, current@FreeBSD.org
Subject: Re: Official request: Please make GNU grep the default
Message-ID: <4C673F02.8000805@FreeBSD.org>
In-Reply-To: <4C66C010.3040308@FreeBSD.org>
List-Id: Discussions about the use of FreeBSD-current

On 08/14/2010 09:10, Gabor Kovesdan wrote:
> On 2010.08.13. 10:52, Roman Divacky wrote:
>> what about optimizing BSD grep instead?
>>
> [... picking one mail from the many that suggest this ...]

... and responding to your message for the same reason ...
:)

[Snipping the bit about why it's a hard problem not likely to be solved
in the next few weeks.]

> If you can make suggestions to make BSD grep faster without touching the
> regex library please do it and if we can get a performance that is
> acceptable, we can reconsider leaving it the default if nobody objects.
> I'll check Sean's suggestions and make some measures how much does that
> help.

As I posted to you privately, the results I got with JUST Sean's patch
on the test case I posted previously were:

GNU grep    Elapsed time: 2 seconds
BSD grep    Elapsed time: 31 seconds

With the more complete patch you provided me privately I was able to
shave one more second off the BSD grep case. So that's a lot better than
the 47 seconds it was previously, but still a long way to go.

I also have a new test case script which actually IS something that
portmaster does, and in fact is the ugliest and most difficult search
that it has to perform: finding an installed port based on grep'ing
+CONTENTS files for an ORIGIN pattern:

http://people.freebsd.org/~dougb/grep-time-trial-2.sh.txt

Typical times for me, with 489 ports:

GNU grep    Elapsed time: 3 seconds
BSD grep    Elapsed time: 17 seconds

(And before anyone bothers to reply saying "Use pkg_info -O for that"
I'll save you the trouble. My version is 10-20% faster. Not sure why,
don't really care.) :)

For those whose line of reasoning was, "But this is -current, so it's ok
for things to be screwed up," my response is: only to a point. In the
real world, people who don't care about performance and/or don't use
grep in interesting and imaginative ways aren't going to mind BSD grep
as the default, but they also don't provide really useful test cases.
"It works fine up to the 80th percentile" has already been demonstrated
by various pointyhat runs, etc.

Sophisticated users who DO care about performance and/or DO use grep in
interesting and creative ways will put up with the breakage for a while,
then switch their make.conf to use GNU grep, usually silently. At that
point they stop providing ANY test data at all, never mind useful data.

However, given the very small number of people who actually test
-current in the first place, the population I am really concerned about
is the group of people who casually try -current, see that "It's really
slow sometimes," don't/can't figure out why, and then get discouraged
and just stop using -current at all. Now you might reply, "Great! Good
riddance to those dilettantes!" However, I believe rather strongly that
we want to make the -current environment MORE friendly to users, even
casual users. Who do you think is actually going to test "What will
become 9.0-RELEASE" if we don't?

OTOH, leaving it in but switching the default gives those who are highly
motivated to test and/or improve it a very easy way to do so, without
causing problems for anyone else. It also makes it that much easier to
make it the default again when it IS ready for prime time.

Meanwhile, in response to everyone else, a simple question. How many
TIMES (not percentages, multiples) slower is it OK for BSD grep to be in
comparison to GNU grep and stay the default?


Doug

-- 

	Improve the effectiveness of your Internet presence with
	a domain name makeover!    http://SupersetSolutions.com/

	Computers are useless. They can only give you answers.
			-- Pablo Picasso