Date: Mon, 28 Aug 2006 14:05:25 -0500
From: "Rick C. Petty" <rick-freebsd@kiwi-computer.com>
To: Mike Meyer <mwm@mired.org>
Cc: freebsd-hackers@FreeBSD.ORG
Subject: Re: A handy utility (at least for me)
Message-ID: <20060828190525.GA35217@megan.kiwi-computer.com>
In-Reply-To: <17651.13192.130964.315409@bhuda.mired.org>
References: <20060826225350.GA20172@megan.kiwi-computer.com>
 <200608281618.k7SGIwWh065261@lurza.secnetix.de>
 <20060828164218.GA34151@megan.kiwi-computer.com>
 <17651.13192.130964.315409@bhuda.mired.org>

On Mon, Aug 28, 2006 at 02:18:48PM -0400, Mike Meyer wrote:
>
> If echo is a shell built in, then it works just fine, and the xargs
> ensures that you don't try passing too many arguments to rm.

Ah!  I was mistaken; I didn't think about builtins not requiring argument
passing.

> > Also I don't see how your example is any more efficient than find-- you're
> > just making the shell do the work instead of find.
>
> Find will check *every file* in *every directory* to see if it's named
> "work" or not. The shell version won't make that test on the first two
> levels of directories; it just expands them.

Forgot about those as well.  I retract my previous suggestion, now in favor
of:

find /usr/ports -depth 3 -prune -o -type d -name work -prune -print -delete

This will prevent find from descending into the files directories.  This
method doesn't have the extra process overhead.

> And now you get into the question of what "efficient" means. Either
> process is going to spend most of its time waiting on the disk. With
> the find, nothing else is happening while that's going on. With
> multiple processes, there's a possibility that one can be working
> while the other is waiting on the disk, so it might take more CPU time
> while taking less wall clock time. Which is more efficient? [NB: This
> is grossly oversimplified, but you get the general idea.]

In general I would agree with you.  But in this case, either the shell is
doing a loop over readdir() and applying its glob internally, or find is
doing the loop over readdir() and applying its glob via regexec(3).  In
either case, the CPU time should be relatively similar.  In the find case, a
syscall is applied directly, whereas the shell sends the paths to the xargs
process through a pipe; xargs then has to malloc/memcpy the lines and start
at least one other process, which in turn applies the syscall.  To me, this
sounds like a lot more CPU time.  I'm not convinced find would take any
longer in wall clock time.

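For readers who want to try the two approaches side by side, here is a minimal sketch that runs both on a scratch tree instead of /usr/ports. Several things in it are assumptions rather than anything stated in this message: the glob pipeline is a reconstruction of the shell version being discussed (it is not quoted here), the port names and the helper functions make_tree, glob_clean and find_clean are made up for illustration, `-mindepth 3 -maxdepth 3` stands in for `-depth 3 -prune` so the sketch also runs with find implementations that lack the numeric form of `-depth`, and `-exec rm -rf {} +` stands in for `-delete`, which only removes empty directories.

```shell
#!/bin/sh
# Sketch only: compare a shell-glob cleanup with a find cleanup on a
# throwaway tree shaped like a ports tree (<category>/<port>/work).

# Build a small scratch tree; the port names are hypothetical.
make_tree() {
    for port in editors/vim shells/zsh; do
        mkdir -p "$1/$port/work/stage" "$1/$port/files"
    done
}

# Shell-glob approach (assumed form): the shell expands the first two
# directory levels itself, and xargs batches the paths for rm.
# Paths containing whitespace would need extra care here.
glob_clean() {
    printf '%s\n' "$1"/*/*/work | xargs rm -rf
}

# find approach: only entries exactly three levels down are tested, so
# find never descends into files/ or into work/ itself.
find_clean() {
    find "$1" -mindepth 3 -maxdepth 3 -type d -name work -exec rm -rf {} +
}

tree_a=$(mktemp -d "${TMPDIR:-/tmp}/ports-a.XXXXXX")
tree_b=$(mktemp -d "${TMPDIR:-/tmp}/ports-b.XXXXXX")
make_tree "$tree_a"
make_tree "$tree_b"

glob_clean "$tree_a"
find_clean "$tree_b"

# Count the work directories that survived in each tree.
left_a=$(find "$tree_a" -type d -name work | wc -l | tr -d ' ')
left_b=$(find "$tree_b" -type d -name work | wc -l | tr -d ' ')
echo "glob approach: $left_a work dirs left"
echo "find approach: $left_b work dirs left"
```

Wrapping each call in time(1) on a real ports tree, once with a cold cache and once with a warm one, would give a rough version of the comparison discussed in this thread.
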
It would be interesting to see some stats on both methods, with and without
the benefit of FreeBSD's filesystem caching mechanisms.

Regardless, the find command is certainly not faster to type for some
people, and really what's important is how much operator time is spent.  One
nice thing about unix is that there's more than one way to skin a cat(1),
pardon the pun.  So use what you feel more comfortable using!

Three cheers for free unix,

-- Rick C. Petty