From: John-Mark Gurney
Date: Fri, 28 Feb 2014 10:18:05 -0800
To: Peter Jeremy
Subject: Re: small kernel kernel option...
Message-ID: <20140228181804.GQ47921@funkthat.com>
References: <20140226214816.GB92037@funkthat.com> <20140228114224.GE2705@server.rulingia.com>
In-Reply-To: <20140228114224.GE2705@server.rulingia.com>
Cc: arch@freebsd.org

Peter Jeremy wrote this message on Fri, Feb 28, 2014 at 22:42 +1100:
> On 2014-Feb-26 13:48:16 -0800, John-Mark Gurney wrote:
> >I'm about to commit a change to sha256 to speed it up, but the cost
> >of that speed up is an increase in code/data size from just under 1k
> >to almost 9k (as measured on amd64)... this increase is from unrolling
> >a loop..
>
> Out of interest, how much of a speedup and what CPU/compiler
> combinations did you test your change on?  I ask because several years
> ago, I tried about 7 different SHA-256 implementations (basically, all
> the C ones I could easily find in FreeBSD and ports I had installed,
> as well as one I tweaked myself) across a range of CPUs and compilers.
> I found that not only was there a very wide variation in speed between
> implementations but that the best on one CPU often ran quite poorly on
> another and unrolling loops didn't necessarily help.

I did not do an exhaustive search.. I only benchmarked the two easy
ones, the one from libmd and the kernel one...  I ran my tests on an
A10-5700@3.4GHz, Core i7@2GHz (though under MacOSX) and an
Opteron-4228 HE@2.8GHz...  All tests were on amd64...

There were a few people who also ran the tests for me but I don't
remember what processors they ran on..  In all the cases we saw an
improvement, and mostly saw a ~20% improvement from using cperciva's
libmd version rather than the kernel version...  These results were
verified w/ ministat...

Part of the reason I didn't do an exhaustive search is that many
implementations (OpenSSL, NSS) are very difficult to extract w/o major
work..
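For readers who have not used ministat, the kind of two-sample comparison
it performs can be sketched roughly as below.  This is a minimal Python
sketch, not ministat's code: the function name compare_runs is made up
for illustration, and the timing values are hypothetical placeholders,
not measurements from these tests.  It assumes a pooled-variance
Student's t statistic, which is the usual way to compare two small sets
of benchmark runs.

```python
import math
import statistics


def compare_runs(baseline, candidate):
    """Compare two timing samples (hypothetical helper, not part of ministat).

    Returns (percent_change, t_statistic, degrees_of_freedom), where
    percent_change is the change in mean relative to the baseline.
    """
    n1, n2 = len(baseline), len(candidate)
    m1, m2 = statistics.mean(baseline), statistics.mean(candidate)
    # Sample variances (n - 1 in the denominator).
    v1, v2 = statistics.variance(baseline), statistics.variance(candidate)

    # Pooled variance assumes both samples have similar spread.
    dof = n1 + n2 - 2
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / dof

    t_stat = (m2 - m1) / math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    percent_change = (m2 - m1) / m1 * 100.0
    return percent_change, t_stat, dof


# Hypothetical wall-clock times in seconds, for illustration only.
kernel_times = [10.0, 10.2, 9.8, 10.1, 9.9]
libmd_times = [8.0, 8.1, 7.9, 8.2, 7.8]

pct, t_stat, dof = compare_runs(kernel_times, libmd_times)
print(f"change in mean: {pct:.1f}%  t = {t_stat:.2f}  dof = {dof}")
```

A difference is normally treated as real only when the magnitude of the
t statistic exceeds the critical value for the chosen confidence level
and degrees of freedom (about 2.306 for 95% with 8 degrees of freedom);
otherwise the two sets of runs cannot be told apart.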
If you'd like to run my test suite, you can download it from:
https://www.funkthat.com/~jmg/sha256.test.unr.tgz

Just run the gennumbers script and wait for a while... it'll compile,
validate and run the tests..  The tarball also includes tests for
various degrees of loop unrolling, which exposed some weird timing
behavior... enough that the only sensible options are fully unrolled
or completely rolled, with no partial unrolling..

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."