Date:      Tue, 10 Jun 2008 01:44:36 +1000 (EST)
From:      Ian Smith <smithi@nimnet.asn.au>
To:        Bill Campbell <freebsd@celestial.com>
Cc:        Jos Chrispijn <jos@webrz.net>, Wojciech Puchar <wojtek@wojtek.tensor.gdynia.pl>, Raphael Becker <rabe@uugrn.org>, freebsd-questions@freebsd.org
Subject:   Re: Grep Guru
Message-ID:  <Pine.BSF.3.96.1080610001213.15342A-100000@gaia.nimnet.asn.au>
In-Reply-To: <20080608231711.053E310656E1@hub.freebsd.org>

On Sun, 8 Jun 2008 16:07:12 -0700 Bill Campbell <freebsd@celestial.com> wrote:
 > On Mon, Jun 09, 2008, Raphael Becker wrote:
 > >On Sun, Jun 08, 2008 at 10:15:50PM +0200, Wojciech Puchar wrote:
 > >> find . -type f -print0|xargs -0 grep <grepoptions> <text to search>
 > >
 > >There's no more need for find | xargs
 > >
 > >Try: 
 > >
 > >find . -type f -exec grep <grepoptions> <text to search> {} \+
 > >
 > >-exec foo {} \+ behaves like xargs foo (file names are batched)
 > >-exec foo {} \; execs foo once for every file

Thanks for this kick; I'd missed or misunderstood using {} \+

 > The issue here is that find execs grep for each file found while
 > xargs batches the files.

If find(1) is to be believed, so does -exec utility [argument ...] {} +
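
A quick way to check the batching claim for yourself. This is only a sketch, assuming a POSIX sh with mktemp(1) available; the scratch directory and the file names a, b and c are made up for illustration. Each invocation of the utility prints one line, so counting lines counts invocations:

```shell
# Scratch directory with three files (hypothetical names).
tmpdir=$(mktemp -d)
touch "$tmpdir/a" "$tmpdir/b" "$tmpdir/c"

# With \; the utility is run once per file, so three lines come out.
per_file=$(find "$tmpdir" -type f -exec sh -c 'echo "$#"' sh {} \; | wc -l | tr -d ' ')

# With + the file names are batched, so one run receives all three.
batched=$(find "$tmpdir" -type f -exec sh -c 'echo "$#"' sh {} + | wc -l | tr -d ' ')

echo "once per file: $per_file runs; batched: $batched run"
rm -rf "$tmpdir"
```

With only three files everything fits in one batch; on a large tree find would split the names into as many batches as the argument-length limit requires.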

 > This is of particular importance if one wants to see the file
 > names in the output.  In relation to this, if one wants to be
 > sure that grep always generates the file name, ensure that it
 > always gets at least two files as arguments:
 > 
 > find . -type f -print0 | xargs -0 grep pattern /dev/null
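
The effect of the extra /dev/null argument can be seen in isolation. A minimal sketch, assuming a POSIX sh with mktemp(1); the file name only.txt and the pattern "needle" are hypothetical:

```shell
# Scratch directory holding a single file with one matching line.
tmpdir=$(mktemp -d)
printf 'needle\n' > "$tmpdir/only.txt"

# One file argument: grep prints just the matching line.
single=$(grep needle "$tmpdir/only.txt")

# Two file arguments (/dev/null never matches): grep prefixes the file name.
with_null=$(grep needle "$tmpdir/only.txt" /dev/null)

echo "$single"
echo "$with_null"
rm -rf "$tmpdir"
```

This matters with xargs or -exec ... {} + because the last batch may happen to contain a single file, in which case grep would otherwise drop the name.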

Another good clue.  Many ways to do anything; I've often used something like:

% find /sys/ -name "*.[chm]" -exec egrep -Hi 'CPUFREQ_[GS]ET' {} \;

which has grep print the filenames, rather than using -print with find.
I've just now run the above find, then the same with \+ instead, twice
each, and am pleased to learn that the latter method runs ~4 times faster
in real time and is even lighter on the system:

% time find /sys/ -name "*.[chm]" -exec grep -Hi 'CPUFREQ_[GS]ET' {} \;
/sys/kern/kern_cpu.c:static int cpufreq_settings_sysctl(SYSCTL_HANDLER_ARGS);
  [.. etc ..]
20.524u 46.205s 4:03.91 27.3%   79+201k 5698+0io 0pf+0w

% time find /sys/ -name "*.[chm]" -exec grep -Hi 'CPUFREQ_[GS]ET' {} \+
1.756u 3.058s 1:07.51 7.1%      81+290k 7148+0io 13pf+0w

% time find /sys/ -name "*.[chm]" -exec grep -Hi 'CPUFREQ_[GS]ET' {} \;
21.742u 44.382s 3:57.99 27.7%   79+200k 7144+0io 0pf+0w

% time find /sys/ -name "*.[chm]" -exec grep -Hi 'CPUFREQ_[GS]ET' {} \+
1.651u 3.134s 0:58.39 8.1%      75+267k 7149+0io 10pf+0w

(Ignore sloth; poor 300MHz Celeron already busy dumping /usr over nfs :)
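
For anyone wanting the ratios spelled out, the arithmetic on the four timings above can be sketched with awk. The figures are copied from the time output (elapsed times converted to seconds); the variable name ratios is just for illustration:

```shell
# Elapsed (wall-clock) and CPU (user + system) ratios, \; run over \+ run,
# for the first and second pair of timings respectively.
ratios=$(awk 'BEGIN {
    printf "elapsed: %.1fx %.1fx\n", 243.91 / 67.51, 237.99 / 58.39
    printf "cpu: %.1fx %.1fx\n", (20.524 + 46.205) / (1.756 + 3.058), (21.742 + 44.382) / (1.651 + 3.134)
}')
echo "$ratios"
```

So the elapsed-time gain is roughly 3.6 to 4.1 times, while the CPU time spent drops by a much larger factor, which is where "lighter on the system" shows up.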

 > FWIW, I have learned about gnu-grep's -r option reading this
 > thread, which I had not noticed previously.  I guess that just
 > goes to show that old habits die hard :-).

When you're on a good thing :) but always plenty new tricks to learn.

cheers, Ian



