From owner-freebsd-hackers Thu Oct 4 9: 4:21 2001 Delivered-To: freebsd-hackers@freebsd.org Received: from avocet.mail.pas.earthlink.net (avocet.mail.pas.earthlink.net [207.217.121.50]) by hub.freebsd.org (Postfix) with ESMTP id C2EE437B405 for ; Thu, 4 Oct 2001 09:04:19 -0700 (PDT) Received: from pool0208.cvx4-bradley.dialup.earthlink.net ([209.178.146.208] helo=sm.socccd.cc.ca.us) by avocet.mail.pas.earthlink.net with esmtp (Exim 3.32 #2) id 15pAyk-0003eJ-00; Thu, 04 Oct 2001 09:04:07 -0700 Message-ID: <3BBC8937.44A77CF9@sm.socccd.cc.ca.us> Date: Thu, 04 Oct 2001 09:07:19 -0700 From: Farooq Mela X-Mailer: Mozilla 4.76 [en] (X11; U; FreeBSD 4.4-STABLE i386) X-Accept-Language: en MIME-Version: 1.0 To: "Andrew L. Neporada" Cc: Bakul Shah , freebsd-hackers@freebsd.org Subject: Re: current strstr(3) implementation is slow References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-hackers@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG "Andrew L. Neporada" wrote: > to Algorithms" book (AKA CLR) by MIT Press, more precisely "Knuth, Morris > and Pratt algorithm". Hmm, sounds good, much better than naive string matching algorithm. However I think it would be a good idea to check if the length of the pattern <= architecture's word size (i386 is 32, alpha is 64, etc), and if so use shift-or matching [Baeza-Yates & Gonnet, "A New Approach to Text Searching," CACM 35, Oct. 1992, 74-82]. Shift-or matching only requires an array of 256 * word size bits which can be on the stack, but KMP matching needs O(M) which must be malloc'ed. Shift-or is usually considerably faster than KMP algorithm, see paper for details. -- farooq To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message