From: Gabor Kovesdan
Date: Mon, 01 Aug 2011 01:06:44 +0100
To: soc-status@freebsd.org
Subject: regex status report #10
Message-ID: <4E35EE14.6060403@FreeBSD.org>

Hi,

I reworked the fixed-string matching code quite a bit, and it now seems to run correctly without segfaults. I also made some cleanups and added support for REG_ICASE, which had been missing so far.
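
For illustration, here is a minimal sketch of quick search (Sunday's algorithm) with an optional case-insensitive mode, roughly what REG_ICASE asks of a fixed-string matcher. This is not the code from the patch: qs_search() and fold() are names made up for this example, and it handles single bytes only (no wide-character or multibyte support).

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper, not part of TRE: fold a byte when icase is set. */
static unsigned char
fold(unsigned char c, int icase)
{
	return (icase ? (unsigned char)tolower(c) : c);
}

/*
 * Quick search (Sunday's algorithm).  Returns the offset of the first
 * match of pat in text, or -1 if there is none.
 */
static long
qs_search(const char *text, size_t n, const char *pat, size_t m, int icase)
{
	size_t shift[UCHAR_MAX + 1];
	size_t i, j, pos;

	if (m == 0)
		return (0);
	if (m > n)
		return (-1);

	/* Default shift: jump past the byte just after the window. */
	for (i = 0; i <= UCHAR_MAX; i++)
		shift[i] = m + 1;
	/* Later occurrences overwrite earlier ones, so the rightmost wins. */
	for (i = 0; i < m; i++)
		shift[fold((unsigned char)pat[i], icase)] = m - i;

	for (pos = 0; pos + m <= n; ) {
		/* Compare the window against the pattern byte by byte. */
		for (j = 0; j < m; j++)
			if (fold((unsigned char)text[pos + j], icase) !=
			    fold((unsigned char)pat[j], icase))
				break;
		if (j == m)
			return ((long)pos);
		if (pos + m >= n)
			break;	/* no byte after the window to shift on */
		/* Shift based on the byte just past the current window. */
		pos += shift[fold((unsigned char)text[pos + m], icase)];
	}
	return (-1);
}
```

The shift table is built from the folded pattern bytes, so the same table serves both modes. Boyer-Moore differs mainly in comparing from the right end of the window and combining bad-character and good-suffix shifts.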
In its current state, it runs significantly faster on fixed-string patterns than the unpatched TRE. It still uses the quick search algorithm; I am now experimenting with Boyer-Moore to get even more out of it. It is important to get this right because it will also be the foundation of the heuristic matching, which is the next major step.

I'm testing the performance with BSD grep, but grep may have bottlenecks of its own, so it may be necessary to look at it as well (and useful, since the ultimate goal is to get rid of the GNU bits).

I arrived back in Hungary yesterday from my internship in Portugal, so I'll probably make a bit less progress during the next week until I settle in here again, but I'll try my best.

Gabor