From: Gabor Kovesdan
Date: Sat, 28 May 2011 07:51:24 +0100
To: soc-status@freebsd.org
Subject: regex status report #1
Message-ID: <4DE09B6C.1050209@kovesdan.org>

Hi,

I've run some tests with TRE and have so far found two incompatibilities
with the base regex:

1, Curly-bracketed repetition expressions may omit the lower bound, in
   which case 0 is inferred. This behaviour is more permissive, so it
   probably won't be a problem.
2, The REG_STARTEND flag, which can be used with regexec(), was missing.
   This has now been implemented.

I've got TRE built inside libc and it works well. It has a literal
matching mode, which is very efficient. However, under some conditions it
underperforms our base regex. I'm looking into this now, and I would like
to improve the performance a bit before I publish a patch for testing.

Gabor
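
For reference, here is a minimal sketch of how a caller might use
REG_STARTEND with regexec() to search only part of a buffer. The helper
name search_range is hypothetical, and the sketch assumes a <regex.h> that
provides the REG_STARTEND extension, as the BSD libc does. It is not taken
from the TRE patch.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: search only the byte range [start, end) of buf.
 * Returns the offset of the first match, counted from the beginning of
 * buf, or -1 if there is no match or the pattern fails to compile.
 * Assumes <regex.h> defines the REG_STARTEND extension.
 */
static long
search_range(const char *pattern, const char *buf, size_t start, size_t end)
{
	regex_t re;
	regmatch_t m[1];
	long off = -1;

	assert(start <= end);

	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return (-1);

	/* With REG_STARTEND, pmatch[0] carries the input range on entry. */
	m[0].rm_so = (regoff_t)start;
	m[0].rm_eo = (regoff_t)end;

	/* On success, pmatch[0] is overwritten with the match offsets. */
	if (regexec(&re, buf, 1, m, REG_STARTEND) == 0)
		off = (long)m[0].rm_so;

	regfree(&re);
	return (off);
}
```

A caller such as grep could use a helper like this to continue searching
a line after a previous match, or to search a buffer that is not
NUL-terminated, without copying the data first.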