Date:      Sun, 2 Jun 2013 18:25:10 +0100
From:      Chris Rees <crees@FreeBSD.org>
To:        florent+FreeBSD-hackers@peterschmitt.fr
Cc:        "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject:   Re: sed query
Message-ID:  <CADLo83_EdAxBtGv5wM-ZmbTxxs66yLzAiW%2Bo6bmunY6q0DZ=wQ@mail.gmail.com>
In-Reply-To: <51AB7DCD.90104@peterschmitt.fr>
References:  <CADLo838JALaTwdSjy%2BV0JMHkbz1mD%2BezOq7a=dRzeNaSeUrDEg@mail.gmail.com> <20130602124127.6c3a847ea5ddb116a69d4814@yahoo.es> <CADLo838H6ikpO8mSh%2Bu6caZTRYfm56qMgq6%2Bzn2BJoVXZHSNdg@mail.gmail.com> <51AB7DCD.90104@peterschmitt.fr>

On 2 June 2013 18:15, Florent Peterschmitt <florent@peterschmitt.fr> wrote:
> On 02/06/2013 14:16, Chris Rees wrote:
>> On 2 June 2013 11:41, Eduardo Morras <emorrasg@yahoo.es> wrote:
>>> On Fri, 31 May 2013 15:01:59 +0100
>>> Chris Rees <utisoft@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I think I've discovered a strange behaviour of sed perhaps triggered
>>>> by the length of a regex passed to it.  I noticed that a certain
>>>> expression I passed took a very long time, and suspected the usual
>>>> backtracking loop, so I started trimming it... and discovered this:
>>>>
>>>> [crees@pegasus]~% time sed -ne "s,^BitchX-[0-9][^|]*[\|]/usr/por,,"
>>>> /var/db/pkg/INDEX-9
>>>> 4.699u 0.007s 0:04.70 99.7% 40+2733k 0+0io 0pf+0w
>>>> [crees@pegasus]~% time sed -ne "s,^BitchX-[0-9][^|]*[\|]/usr/po,,"
>>>> /var/db/pkg/INDEX-9
>>>> 0.042u 0.000s 0:00.04 100.0% 48+3216k 0+0io 0pf+0w
>>>>
>>>> I've looked at the code, and can't from a brief glance figure out why
>>>> a slightly longer regex makes such a difference-- does it start to
>>>> split it?
>>>
>>> Perhaps second one uses memory cache data? Run both twice and show us
>>> the second times.
>>>
>>
>> Nope, same.
>>
>> [crees@pegasus]~% time sed -ne "s,^BitchX-[0-9][^|]*[\|]/usr/por,,"
>> /var/db/pkg/INDEX-9
>> 4.703u 0.007s 0:04.85 96.9% 40+2732k 210+0io 0pf+0w
>> [crees@pegasus]~% time sed -ne "s,^BitchX-[0-9][^|]*[\|]/usr/por,,"
>> /var/db/pkg/INDEX-9
>> 4.748u 0.007s 0:04.75 99.7% 40+2732k 0+0io 0pf+0w
>>
>> I also get the same on head;
>>
>> [crees@medusa]~% time sed -ne "s,^BitchX-[0-9][^|]*[\|]/usr/por,,"
>> /var/db/pkg/INDEX-10
>> 7.813u 0.015s 0:07.96 98.2% 40+183k 0+0io 0pf+0w
>> [crees@medusa]~% time sed -ne "s,^BitchX-[0-9][^|]*[\|]/usr/po,,"
>> /var/db/pkg/INDEX-10
>> 0.070u 0.000s 0:00.07 100.0% 45+205k 0+0io 0pf+0w
>> [crees@medusa]~% uname -a
>> FreeBSD medusa 10.0-CURRENT FreeBSD 10.0-CURRENT #0 r250009: Thu May
>> 30 10:11:16 BST 2013     root@medusa:/usr/obj/usr/src/sys/MEDUSA
>> amd64
>>
>> Chris
>
> Yes I tried too on -current. And I tried also on GNU/Linux and there
> isn't this problem. Is it gnu or bsd sed ?
>

BSD sed. GNU sed doesn't show this:

[crees@pegasus]/usr/ports/textproc/gsed% time gsed -ne
"s,^BitchX-[0-9][^|]*[\|]/usr/por,," /var/db/pkg/INDEX-9
0.019u 0.009s 0:00.04 25.0% 408+6132k 1+0io 2pf+0w

Chris
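
For anyone without an INDEX file to hand, the comparison above could be approximated with something like the sketch below. It is only an illustration: the sample data is synthetic rather than the real /var/db/pkg/INDEX-9, the helper names make_index and count_matches are made up here, and a "p" flag is added to the substitution so that matches can be counted (the commands in this thread print nothing). It only confirms that both expressions match the same line; whether a file this small shows the slowdown is not something this thread establishes.

```shell
#!/bin/sh
# Sketch only: synthetic data, not the real /var/db/pkg/INDEX-9.

make_index() {
    # Hypothetical helper: write $1 INDEX-like lines plus one BitchX entry to stdout.
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        printf 'pkg%d-1.0|/usr/ports/misc/pkg%d|/usr/local|comment\n' "$i" "$i"
        i=$((i + 1))
    done
    printf 'BitchX-1.2|/usr/ports/irc/bitchx|/usr/local|IRC client\n'
}

count_matches() {
    # Hypothetical helper: $1 is the regex, $2 the file.
    # The added "p" flag prints substituted lines so they can be counted.
    sed -n "s,$1,,p" "$2" | wc -l | tr -d ' '
}

tmp=$(mktemp)
make_index 1000 > "$tmp"

# The longer and shorter expressions from the thread.
for re in '^BitchX-[0-9][^|]*[\|]/usr/por' '^BitchX-[0-9][^|]*[\|]/usr/po'; do
    printf '%s matches for %s\n' "$(count_matches "$re" "$tmp")" "$re"
done

rm -f "$tmp"
```

To compare timings, run the sed commands under time as in the transcripts above, against a file large enough to make the difference visible.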


