Date: Thu, 14 May 2015 00:30:24 -0700
From: John-Mark Gurney
To: David Chisnall
Cc: Poul-Henning Kamp, Baptiste Daroussin, current@FreeBSD.org
Subject: Re: Increase BUFSIZ to 8192
Message-ID: <20150514073024.GW37063@funkthat.com>
References: <20150511230635.GA46991@ivaldir.etoilebsd.net> <20150512032307.GP37063@funkthat.com> <14994.1431412293@critter.freebsd.dk> <20150513080342.GE37063@funkthat.com>
List-Id: Discussions about the use of FreeBSD-current

David Chisnall wrote this message on Wed, May 13, 2015 at 09:27 +0100:
> On 13 May 2015, at 09:03, John-Mark Gurney wrote:
> >
> > Poul-Henning Kamp wrote this message on Tue, May 12, 2015 at 06:31 +0000:
> >> --------
> >> In message <20150512032307.GP37063@funkthat.com>, John-Mark Gurney writes:
> >>
> >>> Also, you'd probably see even better performance by increasing the
> >>> size to 64k, [...]
> >>
> >> easy:
> >>     8K on 32bit
> >>     64k on 64bit
> >
> > Sounds good to me...  Just for people who care...  I did a quick set of
> > benchmarks on sha256..  This is using my preliminary patch to use sse4
> > optimized sha256...  But this should be the same for others...
> >
> > The numbers in ministat output are the time in seconds it takes my
> > 3.4GHz AMD A10-5700 APU running HEAD to process a 512MB file, so lower
> > numbers are better..  I've processed them into easier to read format:
> > BUFSIZ: 145MB/sec
> > 8k:     193MB/sec
> > 16k:    198MB/sec
> > 64k:    202MB/sec
> > 128k:   202MB/sec
> > -t:     211MB/sec
>
> It looks like most of the benefit is gained at 16KB.  Did you try
> running the benchmark with something else running at the same time to
> see if there is any advantage in trashing the caches a bit less (simple
> case, what happens if you run two instances of the same benchmark at
> once)?
>
> I suspect that you're about right anyway - I recently did some tests
> while playing with JavaScript FFI generation with a multithreaded
> process JavaScript environment calling out to OpenSSL to do SHA
> calculations and having each of 8 threads reading in 128KB chunks gave
> the fastest performance (Core i7, 4 cores + hyperthreading), with only
> a negligible gain over 64KB.  In all cases, the JavaScript
> implementation was significantly faster than the openssl tool, which
> used 8KB buffers.

Just in case anyone else wants to know how to run benchmarks themselves..

Go into /usr/src/lib/libmd, edit mdXhl.c, and change the occurrence of
BUFSIZ to what you want to test, say 64*1024, run:
    make all && make install

and then you can run programs like sha256 -t, or:
    for i in `jot 5 1`; do /usr/bin/time sha256 test.file ; done 2> XXX.times

Where test.file is populated maybe like:
    dd if=/dev/urandom of=test.file bs=1m count=512

Then run:
    ministat XXX.times YYY.times

to compare multiple results...

Happy benchmarking!

-- 
  John-Mark Gurney                              Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
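
For anyone repeating the steps above, here is a rough sketch (not part of the original message) of two small sh helpers: one wraps the timing loop, the other converts a wall-clock time into the MB/sec form used in the quoted table. The function names mb_per_sec and bench are made up for illustration, and the plain while loop stands in for `jot` so the sketch does not depend on it; sha256 and /usr/bin/time are the same commands used in the recipe.

```shell
#!/bin/sh
# Hypothetical helpers for the benchmark recipe; names are illustrative only.

# mb_per_sec SIZE_MB SECONDS
# Print throughput in MB/sec, rounded to the nearest integer.
mb_per_sec() {
    awk -v mb="$1" -v s="$2" 'BEGIN {
        if (s <= 0) exit 1          # refuse a zero or negative time
        printf "%.0f\n", mb / s
    }'
}

# bench FILE RUNS
# Hash FILE with sha256 RUNS times under /usr/bin/time.
# time(1) writes its report to stderr, so the caller redirects stderr
# to collect the timings; the digest itself is discarded.
bench() {
    i=0
    while [ "$i" -lt "$2" ]; do
        /usr/bin/time sha256 "$1" > /dev/null
        i=$((i + 1))
    done
}
```

Assuming the functions are loaded into the current shell, `bench test.file 5 2> XXX.times` collects five timings in the same way as the one-liner, and `mb_per_sec 512 2.65` prints 193 (2.65 seconds is an invented example time for a 512MB file, not a measured result).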