From: Sean Chittenden
Date: Sun, 21 Nov 2004 10:06:03 -0800
To: des@des.no (Dag-Erling Smørgrav)
Cc: freebsd-arch@freebsd.org
Subject: Re: libregex library
Message-Id: <01E8B7B2-3BE8-11D9-905D-000A95C705DC@chittenden.org>
References: <16795.57534.19299.407779@piglet.timing.com>

>> Has there been any thought given to moving to the modified Henry
>> Spencer regex library used in NetBSD & OpenBSD's libc?
>
> des@dwp ~% head -3 /usr/src/lib/libc/regex/COPYRIGHT
> Copyright 1992, 1993, 1994 Henry Spencer. All rights reserved.
> This software is not subject to any license of the American Telephone
> and Telegraph Company or of the Regents of the University of
> California.

I think what Ben was referring to is that Spencer has released an
updated version of his regexp library that doesn't penalize
wide-character locales. I believe our current one performs terribly on
everything but single-byte character sets, whereas the newer Spencer
library performs as well as one could hope with wide characters.

The PostgreSQL group did some testing and found Spencer's library to be
the fastest wide-character regexp engine while still maintaining very
good performance for single-byte character sets. You'll have to check
the PostgreSQL archives for details: it's been two years since that
change was committed to their tree.

-sc

-- 
Sean Chittenden