From owner-freebsd-hackers  Fri Jul 30 19:09:24 1999
Delivered-To: freebsd-hackers@freebsd.org
Received: from smtp13.bellglobal.com (smtp13.bellglobal.com [204.101.251.52])
    by hub.freebsd.org (Postfix) with ESMTP id 9DFFA14E21
    for ; Fri, 30 Jul 1999 19:09:20 -0700 (PDT)
    (envelope-from vanderh@ecf.toronto.edu)
Received: from localhost.nowhere (Hamilton-ppp44819.sympatico.ca [206.172.76.12])
    by smtp13.bellglobal.com (8.8.5/8.8.5) with ESMTP id WAA00614;
    Fri, 30 Jul 1999 22:08:33 -0400 (EDT)
Received: (from tim@localhost)
    by localhost.nowhere (8.9.3/8.9.1) id WAA69506;
    Fri, 30 Jul 1999 22:07:26 -0400 (EDT)
    (envelope-from tim)
Date: Fri, 30 Jul 1999 22:07:26 -0400
From: Tim Vanderhoek
To: Dag-Erling Smorgrav
Cc: John-Mark Gurney, James Howard, "Daniel C. Sobral",
    freebsd-hackers@FreeBSD.ORG
Subject: Re: replacing grep(1)
Message-ID: <19990730220726.A69246@mad>
References: <19990729182229.E24296@mad> <19990729164533.36798@hydrogen.fircrest.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.95i
In-Reply-To: ; from Dag-Erling Smorgrav on Fri, Jul 30, 1999 at 03:27:20PM +0200
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Fri, Jul 30, 1999 at 03:27:20PM +0200, Dag-Erling Smorgrav wrote:
>
> Funnily, I experience a near-doubling of running time with similar
> patches.

Incidentally, it seems that it's not possible to assume that our
regex library is even anywhere in the same league as the GNU regex
library.

b$ time ./grep -E '(vt100)|(printer)' longfile > /dev/null

real    0m21.284s
user    0m22.034s
sys     0m0.083s

Now, with a profiled executable with optimization turned off, it
takes about 25 seconds.  Regardless, it appears to spend 98% of its
time in regexec(), which is good, since that's where it should be
spending time.

[I had been intending to combine multiple patterns, ultimately
combining in a '\n' to avoid the memchr() in mmopen].
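
For anyone who wants to poke at regexec() in isolation, here is a
minimal sketch of the kind of per-line matching loop that the profile
above points at: compile the ERE once with regcomp(), then call
regexec() on each line.  count_matching_lines() is a hypothetical
helper written only for illustration.  It is not code from either
grep, and it assumes the whole input already sits in one writable,
NUL-terminated buffer.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: count the lines in a writable, NUL-terminated
 * buffer that match an extended regular expression.
 * Returns -1 if the pattern does not compile.
 */
static long
count_matching_lines(const char *pattern, char *buf)
{
	regex_t re;
	long hits = 0;
	char *line, *nl;

	/* REG_NOSUB: only a yes/no answer is needed, not match offsets. */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return -1;

	for (line = buf; *line != '\0'; line = nl + 1) {
		nl = strchr(line, '\n');
		if (nl != NULL)
			*nl = '\0';	/* terminate this line for regexec() */
		if (regexec(&re, line, 0, NULL, 0) == 0)
			hits++;
		if (nl == NULL)
			break;		/* last line had no trailing newline */
		*nl = '\n';		/* restore the buffer */
	}

	regfree(&re);
	return hits;
}
```

Calling count_matching_lines("(vt100)|(printer)", buf) over a large
buffer and timing it should give a rough idea of how much of the run
time belongs to the regex library alone, independent of how grep
reads its input.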
b$ time grep '(vt100)|(printer)' longfile > /dev/null

real    0m0.267s
user    0m0.109s
sys     0m0.157s

98% * 20 = ~19...  Without an improved regex library, any mildly
complicated pattern will bring the new grep to its knees.

This could be the dfa helping GNU grep more than having a better
regexp library...  Probably both.

I wonder how well the devel/pcre port would do POSIX regular
expressions.


-- 
This is my .signature which gets appended to the end of my messages.


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message