Date:      Wed, 4 Nov 2009 08:12:20 +0000
From:      Alex Burke <alexjeffburke@gmail.com>
To:        rea-fbsd@codelabs.ru
Cc:        freebsd-hackers@freebsd.org, Mel Flynn <mel.flynn+fbsd.hackers@mailing.thruhere.net>
Subject:   Re: Issue with grep -i (on i386 only?)
Message-ID:  <a8b8bb510911040012v360d0416o83866a0478805ff9@mail.gmail.com>
In-Reply-To: <T7DtLCOP0cwiekv/7ybsxY3l5dQ@7ANLw7WpNQUEViOFvqmcIRbmcl4>
References:  <200911032122.28905.mel.flynn%2Bfbsd.hackers@mailing.thruhere.net> <T7DtLCOP0cwiekv/7ybsxY3l5dQ@7ANLw7WpNQUEViOFvqmcIRbmcl4>

On Wednesday, November 4, 2009, Eygene Ryabinkin <rea-fbsd@codelabs.ru> wrote:
> Mel, good day.
>
> Tue, Nov 03, 2009 at 09:22:28PM +0100, Mel Flynn wrote:
>> So on the laptop I modified the testscript as it is attached now and
>> while there is still a significant delay, the wallclock time is less
>> than half, when the expression is rewritten with the same meaning:
>> =>>> 16777216
>>     =>>> fgrep
>>         0.04 real         0.03 user         0.00 sys
>>         0.05 real         0.03 user         0.01 sys
>>         0.02 real         0.00 user         0.00 sys
>>     =>>> pcregrep
>>         0.26 real         0.21 user         0.02 sys
>>         0.26 real         0.22 user         0.02 sys
>>         0.44 real         0.35 user         0.01 sys
>>     =>>> grep
>>         0.04 real         0.04 user         0.00 sys
>>         4.45 real         4.15 user         0.01 sys
>>         2.00 real         1.81 user         0.00 sys <-- [fF][Oo][Oo]
>
> I just ran a quick test on 8.0-RC2/i386 with a very old Athlon processor:
> -----
> =>>> 16777216
>     =>>> fgrep
>         0,09 real         0,04 user         0,05 sys
>         0,18 real         0,06 user         0,03 sys
>         0,05 real         0,01 user         0,04 sys
>     =>>> pcregrep
>         0,47 real         0,29 user         0,07 sys
>         0,52 real         0,33 user         0,07 sys
>         0,77 real         0,45 user         0,03 sys
>     =>>> grep
>         0,09 real         0,08 user         0,01 sys
>         0,10 real         0,04 user         0,05 sys
>         0,23 real         0,12 user         0,03 sys
> -----
> The pattern for plain 'grep' is stable: the first and second variants
> always take the same time to within 0.01 second, and the last variant
> is about 2x slower.
>
> I tried sizes up to 64M -- the pattern stays the same.  The same
> holds for amd64, so in my case I don't see any difference in
> behaviour.  So maybe the problem isn't 32 vs 64 bits but lies
> somewhere else.
>
>> attached a little test script for grep's -i performance.
>
> Some notes about the script, especially if it (or some variant of it)
> will be included in the testing framework.
>
>> #!/bin/sh
>> # vim: ts=4 sw=4 noet tw=78 ai
>>
>> PCREGREP=`which pcregrep`
>> BSDGREP=`which bsdgrep`
>> [ -n ${PCREGREP} ] && PCREGREP=`basename ${PCREGREP}`
>> [ -n ${BSDGREP} ] && BSDGREP=`basename ${BSDGREP}`
>
> You'll want '[ -n "${PCREGREP}" ] && ...' (with quoted variable) to
> really achieve the kind of test you wanted.
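
A minimal sketch of why the quoting matters. EMPTY is a hypothetical
variable standing in for an empty `which` result; it is not part of
the original script.

```sh
#!/bin/sh
# EMPTY is a hypothetical stand-in for the result of a failed `which`.
EMPTY=""

# Unquoted: the empty expansion vanishes, leaving `[ -n ]`.  With a
# single argument, test only checks that the argument is a non-empty
# string, and "-n" is non-empty, so this branch is taken.
if [ -n ${EMPTY} ]; then unquoted="true"; else unquoted="false"; fi

# Quoted: this is `[ -n "" ]`, which is false as intended.
if [ -n "${EMPTY}" ]; then quoted="true"; else quoted="false"; fi

echo "unquoted=${unquoted} quoted=${quoted}"
```

With the unquoted form the `basename` call would therefore run even
when the program was not found.
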
>
>> if [ ! -x /usr/bin/jot ]; then
>>       echo "Need jot"
>>       exit 1
>> fi
>> if [ ! -x /usr/bin/rs ]; then
>>       echo "Need rs"
>>       exit 1
>> fi
>
> This is probably better written as
> -----
> for prog in jot rs; do
>         if [ -z "`which "$prog"`" ]; then
>                 echo "Need $prog"
>                 exit 1
>         fi
> done
> -----
> because the rest of the script invokes 'jot' and 'rs' unqualified,
> so the check should find them the same way (via PATH).
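
The same check could also be sketched with the POSIX `command -v`
builtin instead of `which`, assuming the shell in use provides it.
need_prog is a hypothetical helper name, not something from the
attached script.

```sh
#!/bin/sh
# need_prog is a hypothetical helper: it succeeds only if every named
# program can be found through PATH, and names the first one missing.
need_prog() {
	for prog in "$@"; do
		if ! command -v "$prog" >/dev/null 2>&1; then
			echo "Need $prog" >&2
			return 1
		fi
	done
	return 0
}

# In the test script the call would be:  need_prog jot rs || exit 1
need_prog sh && echo "sh found"
```
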
>
>> for b in ${BYTES}; do
>>       TMPFILE=`mktemp -t ${me}`
>>       if [ ! -f ${TMPFILE} ]; then
>>               echo Can\'t create tmp files in ${TMPDIR:="/tmp"}
>>               exit 2
>>       fi
>>       jot -r -c ${b} a z |rs -g 0 20 > ${TMPFILE}
>>       echo "=>>> ${b}"
>>       for prog in fgrep ${PCREGREP} ${BSDGREP} grep ; do
>>               echo "    =>>> ${prog}"
>>               /usr/bin/time ${prog} foo ${TMPFILE} >/dev/null
>>               /usr/bin/time ${prog} -i foo ${TMPFILE} >/dev/null
>>               /usr/bin/time ${prog} '[fF][Oo][Oo]' ${TMPFILE} >/dev/null
>>       done
>>       rm ${TMPFILE}
>> done
>
> It is most likely better to create the temporary file only once and
> to install a trap that removes it on exit and on signals -- this
> handles the case where the user presses ^C during execution, when
> the temporary file would otherwise be left behind.  The code is simple:
> -----
> TMPFILE=`mktemp -t ${me}`
> if [ ! -f ${TMPFILE} ]; then
>         echo "Can't create tmp file in ${TMPDIR:=/tmp}"
>         exit 2
> fi
> fi
> trap 'rm -f "${TMPFILE}"' 0 1 2 3 15
> -----
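
A self-contained sketch of the same cleanup idea, under a few
assumptions: `me` is set to a made-up name, the explicit
`mktemp` template replaces `-t ${me}` only so that the example also
runs with a GNU mktemp, and run_with_tmpfile is a hypothetical wrapper
that exists just to make the effect observable.  The signal trap here
calls `exit`, so an interrupted run stops rather than continuing after
the file has been removed.

```sh
#!/bin/sh
me="greptest"	# hypothetical name; the original script sets ${me} itself

# The body is a subshell, so the traps below apply only to this run.
run_with_tmpfile() (
	TMPFILE=`mktemp "${TMPDIR:-/tmp}/${me}.XXXXXX"` || exit 2
	# Remove the file whenever this subshell exits ...
	trap 'rm -f "${TMPFILE}"' EXIT
	# ... and turn the usual signals into an ordinary exit, so the
	# EXIT trap runs and the run stops instead of carrying on.
	trap 'exit 1' HUP INT QUIT TERM

	echo "some data" > "${TMPFILE}"	# stand-in for the jot | rs output
	echo "${TMPFILE}"		# report the name for the check below
)

used=`run_with_tmpfile`
[ -e "${used}" ] || echo "temporary file was removed"
```
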
>
> Attaching modified version with the bonus -- 'K' and 'M' size prefixes:
> it was boring to specify many digits when I had played with sizes ;))
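
The attachment itself is not reproduced here, so the following is only
a guess at how such suffixes might be handled.  to_bytes is a
hypothetical helper, and treating K as 1024 and M as 1024*1024 is an
assumption.

```sh
#!/bin/sh
# to_bytes is a hypothetical helper; the attached script may do this
# differently.  It accepts a plain number or a number followed by K or
# M, assumed here to mean multiples of 1024 and 1024*1024.
to_bytes() {
	case "$1" in
	*[Kk])	echo $(( ${1%[Kk]} * 1024 )) ;;
	*[Mm])	echo $(( ${1%[Mm]} * 1024 * 1024 )) ;;
	*)	echo "$1" ;;
	esac
}

to_bytes 16M	# prints 16777216, the size used in the runs above
to_bytes 64K	# prints 65536
```
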
> --
> Eygene
>  _                ___       _.--.   #
>  \`.|\..----...-'`   `-._.-'_.-'`   #  Remember that it is hard
>  /  ' `         ,       __.--'      #  to read the on-line manual
>  )/' _/     \   `-_,   /            #  while single-stepping the kernel.
>  `-'" `"\_  ,_.-;_.-\_ ',  fsc/as   #
>       _.-'_./   {_.'   ; /           #    -- FreeBSD Developers handbook
>      {_.-``-'         {_/            #
>


