Date:      Mon, 28 Aug 2006 14:05:25 -0500
From:      "Rick C. Petty" <rick-freebsd@kiwi-computer.com>
To:        Mike Meyer <mwm@mired.org>
Cc:        freebsd-hackers@FreeBSD.ORG
Subject:   Re: A handy utility (at least for me)
Message-ID:  <20060828190525.GA35217@megan.kiwi-computer.com>
In-Reply-To: <17651.13192.130964.315409@bhuda.mired.org>
References:  <20060826225350.GA20172@megan.kiwi-computer.com> <200608281618.k7SGIwWh065261@lurza.secnetix.de> <20060828164218.GA34151@megan.kiwi-computer.com> <17651.13192.130964.315409@bhuda.mired.org>

On Mon, Aug 28, 2006 at 02:18:48PM -0400, Mike Meyer wrote:
> 
> If echo is a shell built in, then it works just fine, and the xargs
> ensures that you don't try passing too many arguments to rm.

Ah!  I was mistaken;  I didn't think about builtins not going through
exec, so the argument-length limit doesn't apply to them.

> > Also I don't see how your example is any more efficient than find-- you're
> > just making the shell do the work instead of find.
> 
> Find will check *every file* in *every directory* to see if it's named
> "work" or not. The shell version won't make that test on the first two
> levels of directories; it just expands them.

Forgot about those as well.  I retract my previous suggestion, now in favor
of:

find /usr/ports -depth 3 -prune -o -type d -name work -prune -print -delete

This prevents find from descending into the files directories, and it
avoids the extra process overhead.
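
A caveat worth checking against find(1) before running the command above:
as I read the man page, -delete implies depth-first traversal, under which
-prune has no effect, and it cannot remove a non-empty directory.  Also,
because "-depth 3 -prune" comes first, everything at depth 3 (work included)
satisfies that branch and never reaches the -name test.  Below is a minimal
sketch of one alternative, assuming -mindepth/-maxdepth are available and
using a hypothetical scratch tree rather than the real /usr/ports:

```shell
#!/bin/sh
# Hypothetical scratch tree standing in for /usr/ports
# (assumed layout: <root>/<category>/<port>/work).
PORTS=$(mktemp -d)
mkdir -p "$PORTS/editors/vim/work/src" "$PORTS/editors/vim/files/work"
touch "$PORTS/editors/vim/work/src/main.c" "$PORTS/editors/vim/Makefile"

# -mindepth 3 / -maxdepth 3 limit matches to <category>/<port>/work and keep
# find from descending below that level, so files/ is never entered.
# rm -rf is used because the work directories are not empty.
find "$PORTS" -mindepth 3 -maxdepth 3 -type d -name work -exec rm -rf {} +
```

This still starts rm, but with "{} +" it is started once per batch of
matches rather than once per directory.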

> And now you get into the question of what "efficient" means. Either
> process is going to spend most of its time waiting on the disk. With
> the find, nothing else is happening while that's going on. With
> multiple processes, there's a possibility that one can be working
> while the other is waiting on the disk, so it might take more CPU time
> while taking less wall clock time. Which is more efficient?  [NB: This
> is grossly oversimplified, but you get the general idea.]

In general I would agree with you.  But in this case, either the shell is
doing a loop over readdir() and applying its glob internally or find is
doing the loop over readdir() and applying its glob via fnmatch(3).  In
either case, the CPU time should be relatively similar.  In the find case,
find makes the syscall itself, whereas the shell writes the list to the
xargs process through a pipe; xargs then has to malloc/memcpy the lines and
start at least one other process, which finally makes the syscall.  To me,
this sounds like a lot more CPU time.
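
To make the two paths concrete, here is a small sketch that runs both
selections against a hypothetical scratch tree (not the real /usr/ports) and
only lists the matches instead of removing them.  The -mindepth/-maxdepth
form is my stand-in for the find side, not the exact command from this
thread.

```shell
#!/bin/sh
# Hypothetical scratch tree: two ports with a work directory each, plus one
# "work" nested a level deeper that neither method should select.
PORTS=$(mktemp -d)
for p in editors/vim shells/zsh; do
    mkdir -p "$PORTS/$p/work" "$PORTS/$p/files"
done
mkdir -p "$PORTS/shells/zsh/files/work"

# Shell path: the glob is expanded inside the shell, echo (a builtin) writes
# the list into a pipe, and xargs starts a separate process per argument.
by_glob=$(echo "$PORTS"/*/*/work | xargs -n 1 echo | sort)

# find path: a single process walks the tree and matches names itself.
by_find=$(find "$PORTS" -mindepth 3 -maxdepth 3 -type d -name work | sort)

# Both variables now hold the list of directories each method would act on.
printf 'glob+xargs:\n%s\nfind:\n%s\n' "$by_glob" "$by_find"
```

Replacing the listing with the real removal, and timing each variant, would
give the kind of comparison discussed here.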

I'm not convinced find would take any longer wallclock time.  It would be
interesting to see some stats on both methods, with and without the
benefit of FreeBSD's filesystem caching mechanisms.

Regardless, the find command is certainly not faster to type for some
people, and really what's important is how much operator time is spent.
One nice thing about unix is that there's more than one way to skin a
cat(1), pardon the pun.  So use what you feel more comfortable using!

Three cheers for free unix,

-- Rick C. Petty


