Date:      Mon, 09 May 2011 02:37:10 +0100
From:      Gabor Kovesdan <gabor@kovesdan.org>
To:        Bakul Shah <bakul@bitblocks.com>
Cc:        "Pedro F. Giffuni" <giffunip@yahoo.com>, hackers@FreeBSD.org, Brooks Davis <brooks@freebsd.org>
Subject:   Re: [RFC] Replacing our regex implementation
Message-ID:  <4DC74546.1060902@kovesdan.org>
In-Reply-To: <20110509011709.5455CB827@mail.bitblocks.com>
References:  <4DC7356C.20905@kovesdan.org> <20110509011709.5455CB827@mail.bitblocks.com>

On 09-05-2011 02:17, Bakul Shah wrote:
> As per the following URLs re2 is much faster than TRE (on the
> benchmarks they ran):
>
> http://lh3lh3.users.sourceforge.net/reb.shtml
> http://sljit.sourceforge.net/regex_perf.html
>
> re2 is in C++ & has a PCRE API, while TRE is in C & has a
> POSIX API.  Both have BSD copyright. Is it worth considering
> making re2 posix compliant?
Is it wchar-clean and is it actively maintained? C++ is rather disliked
in the base system and I'm not very skilled in it, so atm I
couldn't promise to use re2 instead of TRE. And anyway, can C++ go into
libc? According to POSIX, the regex code has to be there. But let's see
what others say... If we happen to use re2 later, the extensions that I
talked about in points 2 and 3 would still be useful.

Anyway, according to some earlier rough measurements, TRE seems to be slower
in small matching tasks but scales well. These tests seem to compare
only short runs with the same regex. It remains to be seen how they compare,
e.g. if you grep the whole ports tree with the same pattern. If the
matching scales well once the pattern is compiled, that's more important
than the overall result for such short tasks, imho.
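
To illustrate the kind of measurement meant here, a rough sketch follows that
times the one-time compile separately from the repeated matching, using only
the POSIX regcomp()/regexec() API. The names bench_result and bench_pattern
are made up for this example and are not part of TRE, re2 or libc; clock()
is also a coarse timer, so treat the numbers as indicative only.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical result record for this sketch; not part of any library. */
struct bench_result {
    double compile_secs;  /* CPU time spent in regcomp() */
    double match_secs;    /* CPU time spent in all regexec() calls */
    size_t hits;          /* number of successful matches */
    int    error;         /* regcomp() error code, 0 on success */
};

static double elapsed(clock_t start, clock_t end)
{
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/*
 * Compile `pattern` once, then match it against every line `rounds` times.
 * A large `rounds` (or a large input) makes the compile cost negligible,
 * which is the case a grep over a big tree is in.
 */
struct bench_result bench_pattern(const char *pattern,
                                  const char *const *lines,
                                  size_t nlines, size_t rounds)
{
    struct bench_result r = { 0.0, 0.0, 0, 0 };
    regex_t re;

    assert(pattern != NULL && lines != NULL);

    clock_t t0 = clock();
    r.error = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
    clock_t t1 = clock();
    if (r.error != 0)
        return r;               /* nothing to free on failed compile */
    r.compile_secs = elapsed(t0, t1);

    for (size_t k = 0; k < rounds; k++)
        for (size_t i = 0; i < nlines; i++)
            if (regexec(&re, lines[i], 0, NULL, 0) == 0)
                r.hits++;
    clock_t t2 = clock();
    r.match_secs = elapsed(t1, t2);

    regfree(&re);
    return r;
}
```

Calling bench_pattern() with a small and a large value of rounds for the same
pattern should show whether the match phase dominates once the input grows,
which is the comparison the short benchmarks do not make.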

Gabor
