From owner-freebsd-hackers  Tue Jul 27 21:35:59 1999
Delivered-To: freebsd-hackers@freebsd.org
Received: from po4.wam.umd.edu (po4.wam.umd.edu [128.8.10.166])
	by hub.freebsd.org (Postfix) with ESMTP id BE14C14CF5
	for <freebsd-hackers@freebsd.org>; Tue, 27 Jul 1999 21:35:50 -0700 (PDT)
	(envelope-from howardjp@wam.umd.edu)
Received: from rac9.wam.umd.edu (root@rac9.wam.umd.edu [128.8.10.149])
	by po4.wam.umd.edu (8.9.3/8.9.3) with ESMTP id AAA26439
	for <freebsd-hackers@freebsd.org>; Wed, 28 Jul 1999 00:35:47 -0400 (EDT)
Received: from rac9.wam.umd.edu (sendmail@localhost [127.0.0.1])
	by rac9.wam.umd.edu (8.9.3/8.9.3) with SMTP id AAA21806
	for <freebsd-hackers@freebsd.org>; Wed, 28 Jul 1999 00:35:46 -0400 (EDT)
Received: from localhost by rac9.wam.umd.edu (8.9.3/8.9.3) with ESMTP id AAA21802
	for <freebsd-hackers@freebsd.org>; Wed, 28 Jul 1999 00:35:46 -0400 (EDT)
X-Authentication-Warning: rac9.wam.umd.edu: howardjp owned process doing -bs
Date: Wed, 28 Jul 1999 00:35:45 -0400 (EDT)
From: James Howard <howardjp@wam.umd.edu>
To: freebsd-hackers@freebsd.org
Subject: Re: replacing grep(1)
In-Reply-To: <xzplnc3b1lq.fsf@des.follo.net>
Message-ID: <Pine.GSO.4.10.9907272344440.19477-100000@rac9.wam.umd.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Prompted by the discussion of speed, I have been looking at the
replacement grep, and it is really slow.  Even slower than I thought, and
I was already thinking it was pretty slow.

So using gprof, I have discovered that it seems to spend a whole mess of
time in grep_malloc() and free().  So I pulled all the calls to malloc
out of the main loop (I dropped the copy into ln.dat and removed the
queueing).  This still leaves us with a grep that is about 6x slower than
GNU.  Before that, it was closer to 80x slower.  After this change, gprof
says it spends around 53% of its time in procline().

Now, Tim Vanderhoek originally pointed this out and suggested that there
was a whole lot of unnecessary string copying and mallocing going on.
He was right.  The file name also gets copied back and forth and I don't
think this is required either.

Now, it seems to me that the queue could be sped up significantly using an
array-based implementation.  The queue is never going to be larger than
Bflag entries in size, so a one-time malloc at initqueue() would remove a
lot of the malloc/free combos from the program.  Good idea/bad idea?
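
A rough sketch of the array-based idea, to make the question concrete.
This is not taken from the grep source: struct lineq and the lineq_*()
functions are names made up for illustration, and "size" stands in for
Bflag.  The slot arrays are allocated once, and each slot's buffer is
only grown when a longer line arrives, so the steady state does no
malloc/free at all.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical fixed-size queue of context lines.  None of these names
 * come from the grep source; "size" plays the role of Bflag.
 */
struct lineq {
	char	**buf;		/* one reusable buffer per slot */
	size_t	 *cap;		/* allocated size of each buffer */
	size_t	 *len;		/* length of the line stored in each slot */
	size_t	  size;		/* number of slots */
	size_t	  head;		/* index of the oldest stored line */
	size_t	  count;	/* number of lines currently stored */
};

/* Allocate the slot arrays once, up front.  Returns 0 on success. */
int
lineq_init(struct lineq *q, size_t size)
{
	assert(size > 0);
	q->buf = calloc(size, sizeof(*q->buf));
	q->cap = calloc(size, sizeof(*q->cap));
	q->len = calloc(size, sizeof(*q->len));
	q->size = size;
	q->head = q->count = 0;
	if (q->buf == NULL || q->cap == NULL || q->len == NULL) {
		free(q->buf);
		free(q->cap);
		free(q->len);
		return (-1);
	}
	return (0);
}

/* Copy a line into the queue; when full, overwrite the oldest line. */
int
lineq_push(struct lineq *q, const char *line, size_t len)
{
	int full = (q->count == q->size);
	size_t i = full ? q->head : (q->head + q->count) % q->size;

	if (len > q->cap[i]) {		/* grow this slot only if needed */
		char *p = realloc(q->buf[i], len);
		if (p == NULL)
			return (-1);
		q->buf[i] = p;
		q->cap[i] = len;
	}
	if (len > 0)
		memcpy(q->buf[i], line, len);
	q->len[i] = len;
	if (full)
		q->head = (q->head + 1) % q->size;
	else
		q->count++;
	return (0);
}

/* Return the n-th oldest stored line (0 = oldest), or NULL if none. */
const char *
lineq_get(const struct lineq *q, size_t n, size_t *len)
{
	size_t i;

	if (n >= q->count)
		return (NULL);
	i = (q->head + n) % q->size;
	*len = q->len[i];
	return (q->buf[i]);
}

/* Release every slot buffer and the slot arrays themselves. */
void
lineq_free(struct lineq *q)
{
	size_t i;

	for (i = 0; i < q->size; i++)
		free(q->buf[i]);
	free(q->buf);
	free(q->cap);
	free(q->len);
}
```

The lines are stored with an explicit length rather than a terminating
NUL, since that is what fgetln hands back anyway.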

The pointer returned from fgetln is valid only until the next IO call, so
perhaps the string as returned from fgetln should be passed directly to
procline() and only copied if -B has been given.  This will speed up grep
in the common case of no leading context needing to be printed.  Trailing
context is not affected by this.
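
Something along these lines is what I have in mind.  It is only a
sketch: handle_line(), line_matches() and copies_made are invented for
the example (the real work would be done by procline()), and the
matcher here just looks for the letter 'x'.  The (line, len) pair is
whatever fgetln returned, and it is used in place unless Bflag is set.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the real matcher: "matches" if the line contains 'x'. */
static int
line_matches(const char *line, size_t len)
{
	return (memchr(line, 'x', len) != NULL);
}

/* Counts heap copies so the effect of Bflag can be observed. */
static size_t copies_made;

/*
 * Hypothetical per-line handler.  "line" points at the buffer handed
 * back by fgetln and is only valid until the next I/O call, so it is
 * matched in place.  A NUL-terminated heap copy is made only when
 * leading context was requested (Bflag > 0); the caller owns *saved.
 * Returns 1 on match, 0 on no match, -1 if the copy could not be made.
 */
int
handle_line(const char *line, size_t len, int Bflag, char **saved)
{
	assert(line != NULL && saved != NULL);
	*saved = NULL;
	if (Bflag > 0) {
		char *copy = malloc(len + 1);
		if (copy == NULL)
			return (-1);
		memcpy(copy, line, len);
		copy[len] = '\0';
		*saved = copy;
		copies_made++;
	}
	return (line_matches(line, len));
}
```

With Bflag at 0 the function never touches the allocator, which is the
common case we care about.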

Unfortunately, this does nothing for zgrep, but I have not benchmarked
zgrep yet.  I don't even want to think about it: gzfgetln() reallocs on
every byte read.
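
For comparison, the usual way around a realloc per byte is to grow the
buffer geometrically.  The sketch below is not the gzfgetln() code;
struct linebuf and linebuf_putc() are made-up names, and the starting
size of 64 is arbitrary.  Doubling the capacity means a line of n bytes
costs on the order of log2(n) reallocs instead of n.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable line buffer; zero-initialize before first use. */
struct linebuf {
	char	*data;
	size_t	 len;		/* bytes stored so far */
	size_t	 cap;		/* bytes allocated */
	size_t	 reallocs;	/* number of realloc calls, for measurement */
};

/* Append one byte, doubling the capacity whenever the buffer is full. */
int
linebuf_putc(struct linebuf *b, char c)
{
	assert(b != NULL);
	if (b->len == b->cap) {
		size_t ncap;
		char *p;

		if (b->cap > SIZE_MAX / 2)	/* doubling would overflow */
			return (-1);
		ncap = b->cap != 0 ? b->cap * 2 : 64;
		p = realloc(b->data, ncap);
		if (p == NULL)
			return (-1);
		b->data = p;
		b->cap = ncap;
		b->reallocs++;
	}
	b->data[b->len++] = c;
	return (0);
}
```

The buffer can be kept across lines by resetting len to 0, so after the
first few long lines there would be no reallocs at all.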

One thing that concerns me is the lack of consistent timings.  Using
time(1), sometimes I get real times up to 100% apart on consecutive runs.
With GNU grep, consecutive runs never differ by more than 5% or so.  What
could be causing this?

Since it will still be slower than GNU, and by quite a bit (at least 6x),
does anyone else see ways to cut some fat?

Jamie

