Date: Tue, 3 Nov 2009 21:22:28 +0100
From: Mel Flynn <mel.flynn+fbsd.hackers@mailing.thruhere.net>
To: freebsd-hackers@freebsd.org
Subject: Issue with grep -i (on i386 only?)
Message-ID: <200911032122.28905.mel.flynn%2Bfbsd.hackers@mailing.thruhere.net>

Hi,

Attached is a little test script for grep's -i performance. I tried a few
different machines, and the 64-bit 7.2 machine I could steal doesn't seem to
be affected and outperforms pcregrep. On i386 machines, grep -i is
significantly slower:

i386, 7.2-STABLE of Sep 8, load averages: 0.00, 0.02, 0.00
Mem: 336M Active, 442M Inact, 217M Wired, 38M Cache, 112M Buf, 198M Free
dev.cpu.0.freq: 2992 (Intel P-IV HTT enabled)
16Meg file result:
=>>> 16777216
 =>>> fgrep
        0.04 real         0.02 user         0.01 sys
        0.04 real         0.03 user         0.01 sys
 =>>> pcregrep
        0.21 real         0.19 user         0.02 sys
        0.21 real         0.20 user         0.00 sys
 =>>> grep
        0.04 real         0.02 user         0.01 sys << not -i
        3.64 real         3.61 user         0.01 sys << -i

i386, 8.0-RC1 FreeBSD 8.0-RC1 #15 r197337M, load averages: 1.61, 1.35, 1.12
Mem: 920M Active, 87M Inact, 215M Wired, 69M Cache, 112M Buf, 195M Free
dev.cpu.0.freq: 1733 (Intel dual core laptop)
16Meg file result:
=>>> 16777216
 =>>> fgrep
        0.04 real         0.02 user         0.01 sys
        0.05 real         0.04 user         0.00 sys
 =>>> pcregrep
        0.26 real         0.23 user         0.01 sys
        0.29 real         0.24 user         0.00 sys
 =>>> grep
        0.04 real         0.04 user         0.00 sys
        4.73 real         4.15 user         0.01 sys

amd64, 7.2-RELEASE-p4 #1 r198384M, load averages: 0.00, 0.00, 0.00
Mem: 115M Active, 182M Inact, 264M Wired, 101M Cache, 213M Buf, 1311M Free
CPU: Dual-Core AMD Opteron(tm) Processor 2210 (1800.08-MHz K8-class CPU)
64Meg file result:
=>>> 67108864
 =>>> fgrep
        0.18 real         0.13 user         0.04 sys
        0.19 real         0.17 user         0.02 sys
 =>>> pcregrep
        0.89 real         0.85 user         0.03 sys
        0.98 real         0.92 user         0.06 sys
 =>>> grep
        0.18 real         0.16 user         0.01 sys
        0.19 real         0.16 user         0.03 sys

So on the laptop I modified the test script to the version attached now, which
adds a third run per program with the expression rewritten to the same meaning
('[fF][Oo][Oo]' instead of -i foo). There is still a significant delay, but
the wallclock time is less than half that of -i:

=>>> 16777216
 =>>> fgrep
        0.04 real         0.03 user         0.00 sys
        0.05 real         0.03 user         0.01 sys
        0.02 real         0.00 user         0.00 sys
 =>>> pcregrep
        0.26 real         0.21 user         0.02 sys
        0.26 real         0.22 user         0.02 sys
        0.44 real         0.35 user         0.01 sys
 =>>> grep
        0.04 real         0.04 user         0.00 sys
        4.45 real         4.15 user         0.01 sys
        2.00 real         1.81 user         0.00 sys <-- [fF][Oo][Oo]

So it looks to me that, while there is a problem with case-insensitive
comparison, just rewriting the expression is an optimization grep could
perform.

Either way, with the new text tools being written (done?), is this problem
being attacked, not fixable due to specifications, or not considered an issue?
Any PRs needed, or any I missed? Patches to try?

[And it just occurred to me bsdgrep is in ports]:
 =>>> bsdgrep
        0.93 real         0.74 user         0.00 sys
        4.80 real         4.33 user         0.02 sys
        4.97 real         4.34 user         0.01 sys

So here the optimization does not fly.
-- 
Mel

Attachment: test.sh.txt

#!/bin/sh
# vim: ts=4 sw=4 noet tw=78 ai

# Optional competitors; empty when not installed.
PCREGREP=`which pcregrep`
BSDGREP=`which bsdgrep`
# Quote the variables: an unquoted empty value makes [ -n ] always true.
[ -n "${PCREGREP}" ] && PCREGREP=`basename "${PCREGREP}"`
[ -n "${BSDGREP}" ] && BSDGREP=`basename "${BSDGREP}"`
me=`basename "$0"`

# File sizes to test, in bytes (1 MB up to 16 MB).
BYTES="1048576 2097152 4194304 8388608 16777216"

if [ ! -x /usr/bin/jot ]; then
	echo "Need jot"
	exit 1
fi
if [ ! -x /usr/bin/rs ]; then
	echo "Need rs"
	exit 1
fi

for b in ${BYTES}; do
	TMPFILE=`mktemp -t "${me}"`
	if [ ! -f "${TMPFILE}" ]; then
		echo "Can't create tmp files in ${TMPDIR:="/tmp"}"
		exit 2
	fi
	# Fill the file with ${b} random lowercase letters, 20 per line.
	jot -r -c ${b} a z | rs -g 0 20 > "${TMPFILE}"
	echo "=>>> ${b}"
	for prog in fgrep ${PCREGREP} ${BSDGREP} grep; do
		echo " =>>> ${prog}"
		# Three runs: plain, -i, and -i rewritten as bracket expressions.
		/usr/bin/time ${prog} foo "${TMPFILE}" >/dev/null
		/usr/bin/time ${prog} -i foo "${TMPFILE}" >/dev/null
		/usr/bin/time ${prog} '[fF][Oo][Oo]' "${TMPFILE}" >/dev/null
	done
	rm "${TMPFILE}"
done
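
The rewrite measured in the message (`-i foo` expressed as `[fF][Oo][Oo]`) can be done mechanically for simple patterns. The sketch below is only an illustration of that idea, not something grep or the attached script does: `ci_bracket` is a hypothetical helper name, and it assumes the pattern is a plain word with no regex metacharacters.

```sh
#!/bin/sh
# Hypothetical helper (not part of grep or test.sh): turn a plain word into
# the equivalent case-insensitive bracket expression.
# Assumption: the input contains no regex metacharacters.
ci_bracket() {
	word=$1
	out=
	while [ -n "$word" ]; do
		rest=${word#?}            # everything after the first character
		c=${word%"$rest"}         # the first character itself
		lower=$(printf '%s' "$c" | tr '[:upper:]' '[:lower:]')
		upper=$(printf '%s' "$c" | tr '[:lower:]' '[:upper:]')
		if [ "$lower" != "$upper" ]; then
			out="${out}[${lower}${upper}]"   # letter: match both cases
		else
			out="${out}${c}"                 # non-letter: copy unchanged
		fi
		word=$rest
	done
	printf '%s\n' "$out"
}

ci_bracket foo    # prints [fF][oO][oO]
```

With this in place, `grep "$(ci_bracket foo)" file` should select the same lines as `grep -i foo file` for such simple words, which makes it easy to time both forms against each other on other patterns.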