From owner-svn-src-stable-9@FreeBSD.ORG Sat May 4 11:21:42 2013
Message-ID: <5184ED7E.3040703@freebsd.org>
Date: Sat, 04 May 2013 15:14:06 +0400
From: Andrey Chernov
To: Sergey Kandaurov
Cc: svn-src-stable@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, svn-src-stable-9@freebsd.org, Jilles Tjoelker
Subject: Re: svn commit: r250215 - stable/9/lib/libc/locale
References: <201305031552.r43FqiPN024580@svn.freebsd.org> <5183E899.4000503@freebsd.org> <20130503195540.GA52657@stack.nl>
List-Id: SVN commit messages for only the 9-stable src tree

On 04.05.2013 0:48, Sergey Kandaurov wrote:
> On 3 May 2013 23:55, Jilles Tjoelker wrote:
>> Some sort of perfect hashing can also be an option, although it makes it
>> harder to add new properties or adds a build dependency on gperf(1) that
>> we would like to get rid of.
> I hacked a bit on wctype. Speaking about speed, it shows about 1-3.5x
> improvement over the previous fast version (before r250215).
>
> Time spend for 2097152 wctype() calls for each of wctype property
>            current      previous     mine
> alnum      0.090554676  0.035821210  0.033270579
> alpha      0.172074310  0.052461036  0.044916572
> blank      0.261109989  0.055735281  0.036682745
> cntrl      0.357318986  0.069249831  0.038292782
> digit      0.436381530  0.094194364  0.039249005
> graph      0.540954812  0.085580099  0.043331460
> lower      0.618306476  0.095665215  0.044070399
> print      0.707443135  0.132559305  0.048216097
> punct      0.788922052  0.142809109  0.062871432
> space      0.888263108  0.150516644  0.054086142
> upper      0.966903461  0.173593592  0.054027834
> xdigit     0.406611275  0.201614227  0.060695939
> ideogram   0.439763499  0.239640723  0.068566486
> special    0.523128094  0.249156298  0.099278051
> phonogram  0.564975870  0.260972651  0.135751471
> rune       0.637392247  0.235195497  0.064093971
>
> Index: locale/wctype.c
> ===================================================================
> --- locale/wctype.c	(revision 250217)
> +++ locale/wctype.c	(working copy)
> @@ -74,6 +74,9 @@
>  		"special\0"	/* BSD extension */
>  		"phonogram\0"	/* BSD extension */
>  		"rune\0";	/* BSD extension */
> +	static const size_t propnamlen[] = {
> +		5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 8, 7, 9, 4, 0
> +	};
>  	static const wctype_t propmasks[] = {
>  		_CTYPE_A|_CTYPE_D,
>  		_CTYPE_A,
> @@ -92,16 +95,17 @@
>  		_CTYPE_Q,
>  		0xFFFFFF00L
>  	};
> -	size_t len1, len2;
> +	const size_t *len2;
>  	const char *p;
>  	const wctype_t *q;
>  
> -	len1 = strlen(property);
>  	q = propmasks;
> -	for (p = propnames; (len2 = strlen(p)) != 0; p += len2 + 1) {
> -		if (len1 == len2 && memcmp(property, p, len1) == 0)
> +	len2 = propnamlen;
> +	for (p = propnames; *len2 != 0; ) {
> +		if (property[0] == p[0] && strcmp(property, p) == 0)
>  			return (*q);
> -		q++;
> +		p += *len2 + 1;
> +		q++; len2++;
>  	}
>  
>  	return (0UL);
>

This version looks better.
IMHO adding full hashing here would be overkill, but a much simpler
speedup trick is still unused: sort the property names by character
code and break out of the loop as soon as strcmp() reports a table
entry that sorts after the requested name.

BTW, I haven't run tests or looked at the asm output to be sure, but it
seems property[0] == p[0] is unneeded, because almost every compiler
tries to inline strcmp().

-- 
http://ache.vniz.net/
bitcoin:13fGiNutKNHcVSsgtGQ7bQ5kgUKgEQHn7N
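
The sorted-table idea above could look roughly like the sketch below. It
is only an illustration under stated assumptions, not the code in
wctype.c: struct prop, props[] and lookup_prop() are hypothetical names,
and the mask values are placeholders rather than the real _CTYPE_*
constants. The names are kept in strcmp() order, so the scan can stop at
the first entry that sorts after the requested property.

```c
#include <string.h>

/* Hypothetical name/mask pair; the masks below are placeholders only. */
struct prop {
	const char	*name;
	unsigned long	 mask;
};

/* Entries must stay sorted in strcmp() order for the early exit to be valid. */
static const struct prop props[] = {
	{ "alnum",	0x0001UL },
	{ "alpha",	0x0002UL },
	{ "blank",	0x0004UL },
	{ "cntrl",	0x0008UL },
	{ "digit",	0x0010UL },
	{ "graph",	0x0020UL },
	{ "ideogram",	0x0040UL },
	{ "lower",	0x0080UL },
	{ "phonogram",	0x0100UL },
	{ "print",	0x0200UL },
	{ "punct",	0x0400UL },
	{ "rune",	0x0800UL },
	{ "space",	0x1000UL },
	{ "special",	0x2000UL },
	{ "upper",	0x4000UL },
	{ "xdigit",	0x8000UL },
};

/* Linear scan that gives up once the table has passed the requested name. */
static unsigned long
lookup_prop(const char *property)
{
	size_t i;
	int cmp;

	for (i = 0; i < sizeof(props) / sizeof(props[0]); i++) {
		cmp = strcmp(property, props[i].name);
		if (cmp == 0)
			return (props[i].mask);
		if (cmp < 0)
			break;	/* all remaining entries sort even later */
	}
	return (0UL);
}
```

Keeping each name next to its mask means that re-sorting the table
cannot silently misalign the two, which is a risk when separate
propnames and propmasks arrays are reordered by hand. Whether the early
exit actually beats the patched loop would still need to be measured.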