From: cracauer@wavehh.hanse.de (Martin Cracauer)
Date: Thu, 30 Jan 97 11:10:09 +0100
Message-Id: <9701301010.AA20685@wavehh.hanse.de>
To: julian@whistle.COM
Cc: Freebsd-hackers@freebsd.org, jason@idiom.com
Subject: Re: [Fwd: freebsd performance.]
Newsgroups: hanse-ml.freebsd.hackers
References: <32EFD2E2.167EB0E7@whistle.com>
Reply-To: cracauer@wavehh.hanse.de
Sender: owner-hackers@freebsd.org

>From: Jason Venner
>The posix regex library is VERY VERY slow.
>I have a program that uses a large regex to parse some input.
>I have a version in perl and a version in C++ using the freebsd posix
>regex library.
>The perl version is 100X faster that the C++ version.
>gprof on the C++ version shows 99% of the spend in:
>  91.53     46.17    46.17  1152366     0.04     0.04  lstep
>   6.52     49.46     3.29    98497     0.03     0.47  lslow

Adding to the explanations about different kinds of regular expressions
by Patrick Giagnocavo, I'd like to note that I found *BSD's regex
support to be quite fast. I benchmarked Spencer's code against various
versions of the GNU regex code and found Spencer's library to be about
30% faster than the GNU code from GNU-sed-2.x. The GNU
sed-3.00/regex-1.0 code didn't even get through my benchmarks without
dumping core.
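
A benchmark of this kind against the POSIX regex interface can be
sketched roughly as follows. This is only an illustration, not the
benchmark referred to above: the helper name count_matches, its
parameters and the choice of clock() for timing are assumptions made
for the example. Only regcomp(), regexec() and regfree() are the real
POSIX calls.

```c
#include <regex.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper (not from the original benchmark): compile
 * `pattern` as a POSIX extended regex, run it `rounds` times over every
 * string in `lines`, and return the number of matching lines in one
 * round, or -1 if the pattern does not compile.  The CPU time spent in
 * the matching loop is stored in *seconds when seconds is non-NULL.
 */
long count_matches(const char *pattern, const char *const *lines,
                   size_t nlines, int rounds, double *seconds)
{
    regex_t re;
    long matches = 0;

    /* REG_NOSUB: we only need match/no match, not submatch offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    clock_t start = clock();
    for (int r = 0; r < rounds; r++) {
        matches = 0;
        for (size_t i = 0; i < nlines; i++) {
            if (regexec(&re, lines[i], 0, NULL, 0) == 0)
                matches++;
        }
    }
    clock_t end = clock();

    regfree(&re);
    if (seconds != NULL)
        *seconds = (double)(end - start) / CLOCKS_PER_SEC;
    return matches;
}
```

Called with the lines of a test file and a large enough number of
rounds, the reported time can be compared between regex libraries that
offer the same interface. Compiling the pattern once outside the timed
loop keeps regcomp() cost out of the measurement.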

I didn't take perl's library into account because I found no easy way
to use it from C. I'd say it is a valid goal to make perl's regex
library available as a C library (maybe someone already did), for the
different regex syntax in any case and maybe for performance as well.

I'd like to second the request for a simplified test case that shows
the difference so we can investigate the problem. Jason, do you copy?
Can you provide a regex and a test file?

My guess is that something easy to fix is being triggered. After all,
perl's regex code is derived from Spencer's, too.

Martin
-- 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Martin_Cracauer@wavehh.hanse.de http://cracauer.cons.org Fax.: +4940 5228536
"As far as I'm concerned, if something is so complicated that you can't
explain it in 10 seconds, then it's probably not worth knowing anyway"- Calvin