Date: Wed, 4 Nov 2009 06:05:44 +0300
From: Eygene Ryabinkin
To: Mel Flynn
Cc: freebsd-hackers@freebsd.org
Reply-To: rea-fbsd@codelabs.ru
In-Reply-To: <200911032122.28905.mel.flynn+fbsd.hackers@mailing.thruhere.net>
Subject: Re: Issue with grep -i (on i386 only?)

Mel, good day.

Tue, Nov 03, 2009 at 09:22:28PM +0100, Mel Flynn wrote:
> So on the laptop I modified the testscript as it is attached now and
> while there is still a significant delay, the wallclock time is less
> then half, when the expression is rewritten with the same meaning:
> =>>> 16777216
>  =>>> fgrep
>         0.04 real         0.03 user         0.00 sys
>         0.05 real         0.03 user         0.01 sys
>         0.02 real         0.00 user         0.00 sys
>  =>>> pcregrep
>         0.26 real         0.21 user         0.02 sys
>         0.26 real         0.22 user         0.02 sys
>         0.44 real         0.35 user         0.01 sys
>  =>>> grep
>         0.04 real         0.04 user         0.00 sys
>         4.45 real         4.15 user         0.01 sys
>         2.00 real         1.81 user         0.00 sys <-- [fF][Oo][Oo]

I just ran a quick test on 8.0-RC2/i386 with a very old Athlon
processor:
-----
=>>> 16777216
 =>>> fgrep
        0,09 real         0,04 user         0,05 sys
        0,18 real         0,06 user         0,03 sys
        0,05 real         0,01 user         0,04 sys
 =>>> pcregrep
        0,47 real         0,29 user         0,07 sys
        0,52 real         0,33 user         0,07 sys
        0,77 real         0,45 user         0,03 sys
 =>>> grep
        0,09 real         0,08 user         0,01 sys
        0,10 real         0,04 user         0,05 sys
        0,23 real         0,12 user         0,03 sys
-----

The pattern for the plain 'grep' is stable: the first and second variants
('foo' and '-i foo') always take the same time to within 0.01 second, and
the last variant ('[fF][Oo][Oo]') is about 2x slower. I tried sizes up
to 64M -- the pattern stays. I see the same on amd64, so in my case
there is no difference in behaviour between the two. So, maybe, the
problem isn't 32 vs 64 bits but lies somewhere else.

> attached a little test script for grep's -i performance.

Some notes about the script, especially if it (or some variant of it)
will be included in the testing framework.

> #!/bin/sh
> # vim: ts=4 sw=4 noet tw=78 ai
>
> PCREGREP=`which pcregrep`
> BSDGREP=`which bsdgrep`
> [ -n ${PCREGREP} ] && PCREGREP=`basename ${PCREGREP}`
> [ -n ${BSDGREP} ] && BSDGREP=`basename ${BSDGREP}`

You'll want '[ -n "${PCREGREP}" ] && ...' (with the variable quoted) to
really achieve the kind of test you wanted. Without the quotes an empty
variable leaves '[ -n ]', which is always true.

> if [ ! -x /usr/bin/jot ]; then
>     echo "Need jot"
>     exit 1
> fi
> if [ ! -x /usr/bin/rs ]; then
>     echo "Need rs"
>     exit 1
> fi

This is probably better written as
-----
for prog in jot rs; do
    if [ -z "`which "$prog"`" ]; then
        echo "Need $prog"
        exit 1
    fi
done
-----
because the code further down invokes 'jot' and 'rs' unqualified, so
they are looked up in PATH and not necessarily in /usr/bin.

> for b in ${BYTES}; do
>     TMPFILE=`mktemp -t ${me}`
>     if [ ! -f ${TMPFILE} ]; then
>         echo Can\'t create tmp files in ${TMPDIR:="/tmp"}
>         exit 2
>     fi
>     jot -r -c ${b} a z |rs -g 0 20 > ${TMPFILE}
>     echo "=>>> ${b}"
>     for prog in fgrep ${PCREGREP} ${BSDGREP} grep ; do
>         echo " =>>> ${prog}"
>         /usr/bin/time ${prog} foo ${TMPFILE} >/dev/null
>         /usr/bin/time ${prog} -i foo ${TMPFILE} >/dev/null
>         /usr/bin/time ${prog} '[fF][Oo][Oo]' ${TMPFILE} >/dev/null
>     done
>     rm ${TMPFILE}
> done

It is most likely better to create the temporary file only once and to
install a trap that removes it -- this handles the case when the user
presses ^C during the execution: the temporary file should be cleaned up
in this case too. The code is simple:
-----
TMPFILE=`mktemp -t "${me}"`
if [ ! -f "${TMPFILE}" ]; then
    echo "Can't create tmp file in ${TMPDIR:=/tmp}"
    exit 2
fi
trap 'rm -f "${TMPFILE}"' 0 1 2 3 15
-----

I'm attaching a modified version with a bonus -- 'K' and 'M' size
prefixes: it was boring to type so many digits while I was playing with
sizes ;))
-- 
Eygene
 _                ___       _.--.   #
 \`.|\..----...-'`   `-._.-'_.-'`   #  Remember that it is hard
 /  ' `         ,       __.--'      #  to read the on-line manual
 )/' _/     \   `-_,   /            #  while single-stepping the kernel.
 `-'" `"\_  ,_.-;_.-\_ ',  fsc/as   #
     _.-'_./   {_.'   ; /           #    -- FreeBSD Developers handbook
    {_.-``-'         {_/            #
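
For reference, the notes above can be combined roughly as follows. This
is only a minimal sketch and not the attached script, which is not part
of this text. The helper names (need_progs, optional_prog,
size_to_bytes) and the script name 'grepbench' are made up for
illustration. It also swaps in a few things that are assumptions of the
sketch: 'command -v' is used in place of 'which', mktemp gets an
explicit XXXXXX template so that it works with both BSD and GNU mktemp,
'K' and 'M' are taken to mean multiples of 1024 (so that 16M gives the
16777216 seen in the output above), and the signal trap calls 'exit' so
that the script stops after ^C and the cleanup runs from the exit trap.

```sh
#!/bin/sh
# Sketch only: hypothetical helpers, not the script attached to the mail.

me=grepbench    # hypothetical name, used for the temporary file

# need_progs: report the first required program that is not in PATH.
need_progs() {
    for prog in "$@"; do
        if ! command -v "$prog" >/dev/null 2>&1; then
            echo "Need $prog"
            return 1
        fi
    done
    return 0
}

# optional_prog: print the program name only if it is found in PATH,
# so that a missing pcregrep or bsdgrep simply drops out of the list.
optional_prog() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1"
    fi
}

# size_to_bytes: accept sizes such as 4K or 16M (assumed 1024-based).
size_to_bytes() {
    case "$1" in
    *K) echo $(( ${1%K} * 1024 )) ;;
    *M) echo $(( ${1%M} * 1024 * 1024 )) ;;
    *)  echo "$1" ;;
    esac
}

# Create the temporary file once and make sure it is always removed.
TMPFILE=`mktemp "${TMPDIR:-/tmp}/${me}.XXXXXX"` || {
    echo "Can't create tmp file in ${TMPDIR:-/tmp}"
    exit 2
}
trap 'rm -f "${TMPFILE}"' 0
trap 'exit 1' 1 2 3 15

# The benchmark itself runs only when sizes are given as arguments.
[ $# -eq 0 ] || need_progs jot rs || exit 1
for size in "$@"; do
    b=`size_to_bytes "${size}"`
    jot -r -c "${b}" a z | rs -g 0 20 > "${TMPFILE}"
    echo "=>>> ${b}"
    for prog in fgrep `optional_prog pcregrep` `optional_prog bsdgrep` grep; do
        echo " =>>> ${prog}"
        /usr/bin/time "${prog}" foo "${TMPFILE}" >/dev/null
        /usr/bin/time "${prog}" -i foo "${TMPFILE}" >/dev/null
        /usr/bin/time "${prog}" '[fF][Oo][Oo]' "${TMPFILE}" >/dev/null
    done
done
```

Saved under a name of your choice, it would be run as, e.g.,
'sh grepbench.sh 16M 64M'; without arguments it only creates and removes
the temporary file.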