From: "Matthias Andree"
To: freebsd-current@freebsd.org
Subject: Re: Official request: Please make GNU grep the default
Date: Fri, 13 Aug 2010 13:09:00 +0200
In-Reply-To: <4C650B75.3020800@FreeBSD.org>
References: <4C6505A4.9060203@FreeBSD.org> <4C650B75.3020800@FreeBSD.org>

Gabor Kovesdan wrote on 2010-08-13:

> On 2010.08.13 10:43, Doug Barton wrote:
>> My reason is simple, performance.
>> While doing some portmaster work
>> recently I was regression testing some changes I made to the --index*
>> options and noticed that things were dramatically slower than the last
>> time I tested those features. Thinking that I had made a programming
>> mistake I dug into my code, and while the regexps that I was using could
>> be tuned for slightly better performance the problem was not in my code.
>> I then installed textproc/gnugrep to compare, and the differences were
>> very dramatic using a highly pessimized test case (finding a match on
>> the last line of INDEX). The script I used to test is at
>> http://people.freebsd.org/~dougb/grep-time-trial.sh.txt and a typical
>> result was:
>>
>> GNU grep
>> Elapsed time: 2 seconds
>>
>> BSD grep
>> Elapsed time: 47 seconds
>>
> Ok, I'll take care of this soon, and make GNU grep default, again with a
> knob to build BSD grep. I agree with you that we cannot allow such a big
> performance drawback, but my measurements only showed significant
> differences for very big searches and I didn't imagine that it could add
> up to such a big difference. I'm sorry for the bad decision I made in
> making it the default.

Without knowing any of the details (I am not using 9-CURRENT), Gabor, I
suggest that you check the documentation around Google's RE2 library (which
is written in C++). It contains quite a bit of information on the
performance of regexp matchers, including worst-case performance, both
directly in the RE2 documentation and indirectly through links and
references. It might be worth a read, together with profiling Doug's test
case, if he can tell you how to reproduce it.

-- 
Matthias Andree