From: Kuang-che Wu
To: FreeBSD-gnats-submit@FreeBSD.org
Date: Sat, 4 Sep 2004 17:39:33 +0800 (CST)
Message-Id: <200409040939.i849dXYC096862@kcwu.homeip.net>
Subject: bin/71367: regex multibyte support is really slow
X-Send-Pr-Version: 3.113

>Number:         71367
>Category:       bin
>Synopsis:       regex multibyte support is really slow
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Sep 04 09:40:09 GMT 2004
>Closed-Date:
>Last-Modified:
>Originator:     Kuang-che Wu
>Release:        FreeBSD 6.0-CURRENT i386
>Organization:
>Environment:
System: FreeBSD kcwu.homeip.net 6.0-CURRENT FreeBSD 6.0-CURRENT #0: Sat Sep 4 05:33:38 CST 2004 root@kcwu.homeip.net:/usr/obj/usr/src/sys/DESKTOP i386
CPU: AMD Athlon(tm) XP 2000+ (1665.59-MHz 686-class CPU)

>Description:
	In a UTF-8 locale, using regex with the flags REG_EXTENDED|REG_ICASE
	and the pattern [[:alnum:]] is unacceptably slow. The test program
	below handles a single 60-byte string and takes about 7.5 seconds
	of user time.

>How-To-Repeat:
	$ cc -O -pipe re.c -o re
	$ time ./re
	        7.65 real         7.51 user         0.06 sys

	#include <locale.h>
	#include <regex.h>
	#include <stdio.h>

	/* One UTF-8 character: the three-byte encoding of U+6162. */
	#define WORD 0xe6,0x85,0xa2

	int main(void)
	{
	    regex_t re;
	    /* Twenty copies of WORD followed by a terminating NUL. */
	    char string[1024]={
		WORD, WORD, WORD, WORD, WORD, WORD, WORD, WORD, WORD, WORD,
		WORD, WORD, WORD, WORD, WORD, WORD, WORD, WORD, WORD, WORD,
		0
	    };

	    /* Select a multibyte (UTF-8) character type locale. */
	    if(setlocale(LC_CTYPE,"zh_TW.UTF-8")==NULL)
		return 1;
	    /* Compile a case-insensitive bracket expression with a character class. */
	    if(regcomp(&re,"[[:alnum:]]",REG_EXTENDED|REG_ICASE)!=0)
		return 2;
	    if(regexec(&re,string,0,NULL,0)==0)
		printf("matched\n");
	    return 0;
	}

>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
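
The report gives only the total run time, so it does not show whether the cost lies in regcomp() or in regexec(), or how much of it comes from the UTF-8 locale. The following is a minimal sketch of one way to narrow that down, not part of the original report: it times the two calls separately with clock(). The struct regex_timing and the function time_regex() are hypothetical names introduced here for illustration, and the status codes 1 and 2 simply mirror the exit codes of the test program.

```c
#include <locale.h>
#include <regex.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical result type: CPU time spent in each regex phase. */
struct regex_timing {
    int status;             /* 0 = ok, 1 = setlocale failed, 2 = regcomp failed */
    int matched;            /* 1 if regexec() reported a match, else 0 */
    double compile_seconds; /* CPU time spent inside regcomp() */
    double exec_seconds;    /* CPU time spent inside regexec() */
};

/*
 * Hypothetical helper: switch LC_CTYPE to locale_name, then time
 * regcomp() and regexec() separately for the given pattern and subject.
 * Note that the locale change is process-wide and is not undone here.
 */
struct regex_timing
time_regex(const char *locale_name, const char *pattern, int cflags,
    const char *subject)
{
    struct regex_timing t = { 0, 0, 0.0, 0.0 };
    regex_t re;
    clock_t start;

    if (setlocale(LC_CTYPE, locale_name) == NULL) {
        t.status = 1;
        return t;
    }

    start = clock();
    if (regcomp(&re, pattern, cflags) != 0) {
        t.status = 2;
        return t;
    }
    t.compile_seconds = (double)(clock() - start) / CLOCKS_PER_SEC;

    start = clock();
    t.matched = (regexec(&re, subject, 0, NULL, 0) == 0);
    t.exec_seconds = (double)(clock() - start) / CLOCKS_PER_SEC;

    regfree(&re);
    return t;
}
```

Calling time_regex() once with "C" and once with "zh_TW.UTF-8", using the same pattern, flags and string as the test program, and printing both fields of each result should show which phase and which locale account for the delay. Repeating the comparison without REG_ICASE would show whether that flag is a factor.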