Date: Mon, 09 May 2011 17:15:18 -0700
From: Bakul Shah <bakul@bitblocks.com>
To: David Schultz <das@FreeBSD.ORG>
Cc: Gabor Kovesdan <gabor@kovesdan.org>, "Pedro F. Giffuni" <giffunip@yahoo.com>, hackers@FreeBSD.ORG, Brooks Davis <brooks@FreeBSD.ORG>, Zhihao Yuan <lichray@gmail.com>
Subject: Re: [RFC] Replacing our regex implementation
Message-ID: <20110510001518.2C855B827@mail.bitblocks.com>
In-Reply-To: Your message of "Mon, 09 May 2011 17:51:46 EDT." <20110509215146.GA18135@zim.MIT.EDU>
References: <4DC7356C.20905@kovesdan.org> <20110509011709.5455CB827@mail.bitblocks.com> <4DC74546.1060902@kovesdan.org> <20110509014938.EE292B827@mail.bitblocks.com> <BANLkTim-T4m=jUfXT_wFAv3n=H6QG2N1iQ@mail.gmail.com> <20110509061334.A62EAB827@mail.bitblocks.com> <20110509215146.GA18135@zim.MIT.EDU>

On Mon, 09 May 2011 17:51:46 EDT David Schultz <das@FreeBSD.ORG> wrote:
> On Sun, May 08, 2011, Bakul Shah wrote:
> > On Sun, 08 May 2011 21:35:04 CDT Zhihao Yuan <lichray@gmail.com> wrote:
> > > 1. This lib accepts many popular grammars (PCRE, POSIX, vim, etc.),
> > > but it does not allow you to change the mode.
> > > http://code.google.com/p/re2/source/browse/re2/re2.h
> >
> > The mode is decided when an RE2 object is instantiated so this
> > is ok. You can certainly instantiate multiple objects with
> > different options if so desired.
> >
> > > 2. It focuses on speed and features, not stability and standardization.
> >
> > Look at the open issues. Seems stable enough to me. re2 has a
> > posix only mode. It also does unicode.

s/posix only mode/posix only mode as well/

> > > 3. It uses C++. We seldom accepts C++ code in base system, and does
> > > not accept it in libc.
> >
> > This is the show stopper.
>
> Use of C++ is a clear show-stopper if it introduces new runtime
> requirements, e.g., dependencies on STL or exceptions. Aside from
> that, however, I can't think of any fundamental, technical reasons
> why a component of libc couldn't be written in C++. (Perhaps the
> toolchain maintainers could name some, and they'd be the best
> authority on the matter.) You can expect some resistance
> regardless, however, so make sure the technical merits of RE2 are
> worth the trouble.

Ok, I just verified there are no additional runtime requirements
by running a simple test, where I added a C wrapper around an
RE2 C++ call, compiled it with c++, then compiled the client C
code with cc, and linked everything with cc. This works (tested
on x86_64, under 8.1).

I do think RE2 is very well done (see swtch.com/~rsc/regexp
articles); it is actively maintained and has a battery of pretty
exhaustive tests. Seems TRE's author also likes re2:

    http://hackerboss.com/is-your-regex-matcher-up-to-snuff/

So if we want to consider this, it is a real possibility.

> IIRC, some of the prior discussions on using more C++ in the base
> system got derailed by tangents on multiple inheritance, operator
> overloading, misfeatures of STL, and what subset of C++ ought to
> be considered kosher in FreeBSD. You don't have to get involved
> in any of that because you'd only be proposing to import a
> self-contained third-party library.

Indeed; we would just use it via a C wrapper API. But I can see
someone thinking this is the camel's nose in the tent :-)
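
For illustration only, the C wrapper approach described above might look
roughly like the sketch below. This is not the test code from the
message. The Matcher class is a hypothetical stand-in for the wrapped
C++ class (it does a plain substring search, so the file builds without
RE2 installed), and the names cre, cre_new, cre_partial_match and
cre_free are invented for this example.

```cpp
// wrapper.cc: C++ side of a C-callable wrapper (illustrative sketch).
#include <cstdlib>
#include <cstring>
#include <new>      // placement new only

// Hypothetical stand-in for the C++ class being wrapped (RE2 in the
// thread). It matches a literal substring, not a regular expression.
class Matcher {
public:
    explicit Matcher(const char *pattern) : pattern_(pattern) {}
    bool PartialMatch(const char *text) const {
        return std::strstr(text, pattern_) != nullptr;
    }
private:
    const char *pattern_;   // borrowed: the caller keeps it alive
};

// Opaque handle. C code only ever sees "struct cre *".
struct cre {
    Matcher m;
    explicit cre(const char *pattern) : m(pattern) {}
};

// Create a handle; returns NULL on bad input or allocation failure.
// malloc plus placement new avoids operator new and its exceptions.
extern "C" cre *cre_new(const char *pattern) {
    if (pattern == nullptr) return nullptr;
    void *mem = std::malloc(sizeof(cre));
    if (mem == nullptr) return nullptr;
    return new (mem) cre(pattern);
}

// Returns 1 if text contains a match, 0 otherwise (including NULL input).
extern "C" int cre_partial_match(const cre *re, const char *text) {
    if (re == nullptr || text == nullptr) return 0;
    return re->m.PartialMatch(text) ? 1 : 0;
}

// Destroy a handle created by cre_new; NULL is accepted.
extern "C" void cre_free(cre *re) {
    if (re == nullptr) return;
    re->~cre();
    std::free(re);
}
```

The client C code would declare "typedef struct cre cre;" together with
the three function prototypes and never include a C++ header. No
exception or C++ type crosses the boundary, which is the property the
quoted concern about runtime requirements turns on.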