From: Christopher Weimann
To: freebsd-hackers@freebsd.org
Date: Wed, 25 Jun 2003 12:28:37 -0400
Subject: Re: Replacing GNU grep revisited
Message-ID: <20030625000344.A54424@smtp.k12us.com>
In-Reply-To: <20030621103502.K18572@thor.farley.org>

On Sat 06/21/2003-10:55:59AM -0500, Sean Farley wrote:
>
> I have placed the patches up on Geocities¹ for others to try out. They
> get freegrep fairly close to the performance of GNU's grep. Also
> included is a small patch to regex to squeak a bit more performance out
> of it, but I am not certain if it actually helps or not.
>

There is at least one aspect of freegrep that doesn't even come close to GNU grep, fgrep.
I'll grant that this is a pretty extreme example, and perhaps not many
people are making use of fgrep, but...

GNU grep

%/usr/bin/time /usr/bin/fgrep -f /usr/share/dict/words /usr/share/games/fortune/fortunes2 > /dev/null
5.40 real 4.98 user 0.41 sys
42743 359388 2034033

freegrep

%/usr/bin/time /usr/local/bin/fgrep -f /usr/share/dict/words /usr/share/games/fortune/fortunes2 > /dev/null
990.43 real 988.61 user 1.99 sys
42743 359388 2034033

I ran both of these more than once, so it is not a fluke. After looking
at it further, it seems that freegrep does not use the Aho-Corasick
algorithm for fgrep but just uses brute force.

Just for giggles I downloaded the V7 fgrep from
http://www.tuhs.org/Archive/PDP-11/Trees/V7/usr/src/cmd/fgrep.c
to see how what I'm guessing is Aho's version would do.

%/usr/bin/time ./fgrep -f /usr/share/dict/words /usr/share/games/fortune/fortunes2 | wc
0.98 real 0.71 user 0.25 sys
42743 359388 2034033

Which pretty much squashes GNU grep. I wonder how many of the other old
utils outrun our modern ones. I guess it must be all those gotos. :)

-- 
------------------------------------------------------------
Christopher Weimann
http://www.k12usa.com
K12USA.com Cool Tools for Schools!
------------------------------------------------------------
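
To illustrate the brute-force versus Aho-Corasick point above, here is a
minimal sketch in Python. It is not taken from freegrep, GNU grep, or the
V7 fgrep.c; all function names are made up for illustration. The brute-force
version tries every pattern against every line, while the automaton version
scans each line once no matter how many patterns were loaded. Unlike a real
fgrep, this sketch simply skips empty patterns.

```python
from collections import deque

def build_automaton(patterns):
    """Build an Aho-Corasick automaton (hypothetical helper, not from any grep)."""
    goto = [{}]      # goto[state] maps a character to the next trie state
    fail = [0]       # fail[state] is the fallback state on a mismatch
    out = [False]    # out[state] is True if some pattern ends at this state

    # Phase 1: insert every pattern into a trie.
    for pat in patterns:
        if not pat:
            continue  # assumption: empty patterns are ignored in this sketch
        state = 0
        for ch in pat:
            nxt = goto[state].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto.append({})
                fail.append(0)
                out.append(False)
                goto[state][ch] = nxt
            state = nxt
        out[state] = True

    # Phase 2: breadth-first pass to compute failure links.
    queue = deque(goto[0].values())   # depth-1 states fall back to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # A state also "matches" if its fallback state matches.
            out[nxt] = out[nxt] or out[fail[nxt]]
    return goto, fail, out

def line_matches(line, automaton):
    """Return True if any pattern occurs in line; one pass over the line."""
    goto, fail, out = automaton
    state = 0
    for ch in line:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        if out[state]:
            return True
    return False

def brute_force_matches(line, patterns):
    """The slow approach: try each pattern in turn against the line."""
    return any(p in line for p in patterns if p)

def fgrep(lines, patterns):
    """Return the lines that contain at least one of the fixed patterns."""
    automaton = build_automaton(patterns)
    return [ln for ln in lines if line_matches(ln, automaton)]
```

With a word list the size of /usr/share/dict/words, the brute-force loop
does work proportional to the number of patterns for every input line,
whereas the automaton is built once and then each line costs roughly its
own length. That is consistent with the kind of gap seen in the timings
above, though the sketch says nothing about what either grep actually does
internally.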