From owner-freebsd-hackers Tue Feb 4 21:50:51 1997
Return-Path:
Received: (from root@localhost)
        by freefall.freebsd.org (8.8.5/8.8.5) id VAA05508
        for hackers-outgoing; Tue, 4 Feb 1997 21:50:51 -0800 (PST)
Received: from dg-rtp.dg.com (dg-rtp.rtp.dg.com [128.222.1.2])
        by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id VAA05503
        for ; Tue, 4 Feb 1997 21:50:48 -0800 (PST)
Received: by dg-rtp.dg.com (5.4R3.10/dg-rtp-v02)
        id AA07471; Wed, 5 Feb 1997 00:50:12 -0500
Received: from ponds by dg-rtp.dg.com.rtp.dg.com; Wed, 5 Feb 1997 00:50 EST
Received: from lakes.water.net (lakes [10.0.0.3])
        by ponds.water.net (8.8.3/8.7.3) with ESMTP id VAA25506;
        Tue, 4 Feb 1997 21:43:06 -0500 (EST)
Received: (from rivers@localhost)
        by lakes.water.net (8.8.3/8.6.9) id VAA23395;
        Tue, 4 Feb 1997 21:47:32 -0500 (EST)
Date: Tue, 4 Feb 1997 21:47:32 -0500 (EST)
From: Thomas David Rivers
Message-Id: <199702050247.VAA23395@lakes.water.net>
To: patrick@xinside.com, ponds!freebsd.org!freebsd-hackers
Subject: Re: [Fwd: freebsd performance.]
Content-Type: text
Sender: owner-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

Patrick Giagnocavo writes:
> Julian Elischer writes:
>
> > The posix regex library is VERY VERY slow.
> >
> > I have a program that uses a large regex to parse some input.
> >
> > I have a version in perl and a version in C++ using the freebsd posix
> > regex library.
> >
> > The perl version is 100X faster that the C++ version.
> >
> > gprof on the C++ version shows 99% of the spend in:
> >
> >  91.53     46.17    46.17  1152366     0.04     0.04  lstep
> >   6.52     49.46     3.29    98497     0.03     0.47  lslow
>
> I am surprised that some of our more erudite members on the list have
> not jumped on this. So, a definitely non-erudite person will.
> ...
>
> There are two different 'engines' - NFA (nondeterministic finite
> automaton) and DFA (deterministic finite automaton). Perl is
> 'traditional NFA' according to Mr. Friedl, while POSIX leans towards
> DFA-like behavior in all cases (always returns 'leftmost-longest' that
> matches - perl returns I believe the first part that matches).

Ah, but you should remember that all NFAs are convertible to DFAs.

>
> Also, Perl does not 'do' POSIX IIRC; so the results can actually be
> different when using the same regex string. Perl does however have
> some very powerful features for its regex - covered in the book.

This could be part of the reason for a performance difference: if
they are not performing the same operation, you're comparing apples
and oranges.

        - Dave Rivers -
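
Note on the NFA-to-DFA remark above: the usual textbook way to do that
conversion is the subset construction, where each DFA state stands for a
set of NFA states. What follows is only a minimal sketch of that idea in
Python. It is not how the FreeBSD regex library or Perl work internally.
The function names, the dictionary layout, and the small hand-built NFA
for (a|b)*abb are all assumptions made up for illustration.

```python
from collections import deque

def epsilon_closure(states, eps):
    """All NFA states reachable from `states` using only epsilon moves."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in eps.get(s, ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return frozenset(seen)

def nfa_to_dfa(start, accepting, delta, eps, alphabet):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    d_start = epsilon_closure({start}, eps)
    d_delta, d_accept = {}, set()
    queue, seen = deque([d_start]), {d_start}
    while queue:
        cur = queue.popleft()
        if cur & accepting:          # contains at least one accepting NFA state
            d_accept.add(cur)
        for ch in alphabet:
            moved = set()
            for s in cur:
                moved |= delta.get((s, ch), set())
            nxt = epsilon_closure(moved, eps)
            d_delta[(cur, ch)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return d_start, d_accept, d_delta

def dfa_accepts(dfa, text):
    """Run the DFA over `text`; one table lookup per character, no backtracking."""
    state, accept, delta = dfa
    for ch in text:
        state = delta[(state, ch)]   # KeyError if ch is outside the alphabet
    return state in accept

# Hypothetical hand-built NFA for (a|b)*abb: state 0 loops on a and b,
# and the path 0 -a-> 1 -b-> 2 -b-> 3 recognises the trailing "abb".
NFA_DELTA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
DFA = nfa_to_dfa(0, {3}, NFA_DELTA, {}, "ab")

if __name__ == "__main__":
    for word in ("abb", "aababb", "ab", ""):
        print(repr(word), dfa_accepts(DFA, word))
```

The catch is that the number of DFA states can grow exponentially in the
number of NFA states in the worst case, so "convertible" does not by
itself mean "cheap to convert".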