Date: Tue, 27 Aug 2013 19:27:28 -0700
From: John-Mark Gurney
To: Ollivier Robert
Subject: Re: security/openssl speed issues
Message-ID: <20130828022728.GR29777@funkthat.com>
In-Reply-To: <20130827153205.GA48196@roberto02-aw.eurocontrol.fr>
References: <20130827153205.GA48196@roberto02-aw.eurocontrol.fr>
Cc: dinoex@freebsd.org, freebsd-security@freebsd.org, freebsd-ports@freebsd.org
List-Id: "Security issues [members-only posting]"

Ollivier Robert wrote this message on Tue, Aug 27, 2013 at 17:32 +0200:
> As I got a new machine with the AES-NI crypto extensions, I'm getting
> interested with it and as you may have seen, I've already merged into
> stable/9 two changesets for AES-NI support in GELI & cryptodev.
>
> Now, I'm trying to measure the impact of said AES extensions, I
> stumbled on a very weird difference in behaviour between our base
> system openssl and the one in ports.
>
> /usr/bin/openssl:
> OpenSSL 0.9.8y 5 Feb 2013
>
> /usr/local/bin/openssl:
> OpenSSL 1.0.1e 11 Feb 2013
>
> The one in base is not supposed to have cryptodev (and aesni) support
> at all as it was added apparently in 1.0.1.  Fine.
>
> 1. Trying to run both on a machine without the AES-NI extensions, I
> should have similar results in running speed tests but:
>
> 1181 [17:18] roberto@centre:/usr/ports> /usr/bin/openssl speed aes-256-cbc

This is not a very good way to run the tests...  It turns out that if
you run it this way, it will NOT go through the OpenSSL EVP system, and
will instead call a slow AES function directly, which is not the one
that gets used in real applications...

To use the EVP system, and see what performance OpenSSH and others will
get, use openssl speed -evp aes-256-cbc

Now if you have cryptodev+aesni loaded and my aesni patches applied,
you'll see something like this:
$ openssl speed -evp aes-256-cbc
Doing aes-256-cbc for 3s on 16 size blocks: 928863 aes-256-cbc's in 0.20s
Doing aes-256-cbc for 3s on 64 size blocks: 880075 aes-256-cbc's in 0.28s
Doing aes-256-cbc for 3s on 256 size blocks: 775018 aes-256-cbc's in 0.20s
Doing aes-256-cbc for 3s on 1024 size blocks: 490425 aes-256-cbc's in 0.09s
Doing aes-256-cbc for 3s on 8192 size blocks: 102189 aes-256-cbc's in 0.05s
OpenSSL 1.0.1e-freebsd 11 Feb 2013
built on: Tue Aug 27 11:52:46 PDT 2013
options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: cc -O
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc      76092.46k   200265.96k  1015831.59k  5843725.96k 17858822.14k

Man, 17GB/sec!  Impressive!  Except not...  Notice the times above: it
only took .05s to do it.  That's because by default, openssl speed only
counts user CPU time, not real time, and since it's now properly using
cryptodev, not much CPU time is spent in the process...

The undocumented -elapsed option to the rescue:
$ openssl speed -elapsed -evp aes-256-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 962245 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 64 size blocks: 918001 aes-256-cbc's in 3.02s
Doing aes-256-cbc for 3s on 256 size blocks: 799186 aes-256-cbc's in 3.05s
Doing aes-256-cbc for 3s on 1024 size blocks: 496954 aes-256-cbc's in 3.02s
Doing aes-256-cbc for 3s on 8192 size blocks: 102473 aes-256-cbc's in 3.00s
OpenSSL 1.0.1e-freebsd 11 Feb 2013
built on: Tue Aug 27 11:52:46 PDT 2013
options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: cc -O
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc       5118.64k    19432.21k    67148.02k   168312.03k   279819.61k

This gives somewhat more reasonable results...  And if you also specify
-decrypt, you'll see it's faster, because w/ my aesni patches it
pipelines cbc decrypt:
$ openssl speed -elapsed -decrypt -evp aes-256-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 941128 aes-256-cbc's in 3.02s
Doing aes-256-cbc for 3s on 64 size blocks: 950875 aes-256-cbc's in 3.03s
Doing aes-256-cbc for 3s on 256 size blocks: 922503 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 750362 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 208038 aes-256-cbc's in 3.03s
OpenSSL 1.0.1e-freebsd 11 Feb 2013
built on: Tue Aug 27 11:52:46 PDT 2013
options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: cc -O
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc       4993.34k    20076.21k    78720.26k   256123.56k   562225.91k

So, the good news is that cryptodev does properly work for OpenSSL...

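To see where the inflated 17GB/sec figure comes from, the summary numbers can be recomputed from the "Doing ..." lines: block size times count, divided by the reported time. Below is a minimal Python sketch of that arithmetic. The helper names (parse_speed_line, throughput_k) are made up for illustration and are not part of OpenSSL, and because the printed times are rounded to two decimals, the recomputed values only approximate the table.

```python
import re

# Matches progress lines such as:
#   Doing aes-256-cbc for 3s on 8192 size blocks: 102473 aes-256-cbc's in 3.00s
LINE_RE = re.compile(
    r"Doing (?P<name>\S+) for \d+s on (?P<size>\d+) size blocks: "
    r"(?P<count>\d+) \S+ in (?P<secs>[\d.]+)s"
)

def parse_speed_line(line):
    """Return (block_size, count, seconds) from one 'Doing ...' line."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError("not an openssl speed progress line: %r" % line)
    return int(m.group("size")), int(m.group("count")), float(m.group("secs"))

def throughput_k(block_size, count, seconds):
    """Throughput in 1000s of bytes per second, as in the summary table."""
    return block_size * count / seconds / 1000.0

# 8192-byte rows from the two cryptodev runs quoted in this message.
user_line = "Doing aes-256-cbc for 3s on 8192 size blocks: 102189 aes-256-cbc's in 0.05s"
real_line = "Doing aes-256-cbc for 3s on 8192 size blocks: 102473 aes-256-cbc's in 3.00s"

user_k = throughput_k(*parse_speed_line(user_line))
real_k = throughput_k(*parse_speed_line(real_line))
print("user time:    %.2fk" % user_k)
print("elapsed time: %.2fk" % real_k)
```

The elapsed-time row reproduces the 279819.61k entry in the table. The user-time row comes out near 16.7 million k, the same order as the printed 17858822.14k (the gap is the rounding of 0.05s). Nearly the same amount of work is being divided by about one sixtieth of the time.
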
And even better news (if you unload cryptodev), OpenSSL does work with
AESNI:
$ openssl speed -evp aes-256-cbc
Doing aes-256-cbc for 3s on 16 size blocks: 56809161 aes-256-cbc's in 3.02s
Doing aes-256-cbc for 3s on 64 size blocks: 15626726 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 256 size blocks: 4289939 aes-256-cbc's in 3.04s
Doing aes-256-cbc for 3s on 1024 size blocks: 1090334 aes-256-cbc's in 3.02s
Doing aes-256-cbc for 3s on 8192 size blocks: 136649 aes-256-cbc's in 2.99s
OpenSSL 1.0.1e-freebsd 11 Feb 2013
built on: Tue Aug 27 11:52:46 PDT 2013
options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: cc -O
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc     300633.49k   332504.26k   361369.46k   370239.01k   374117.13k

and decrypt is even faster:
$ openssl speed -decrypt -evp aes-256-cbc
Doing aes-256-cbc for 3s on 16 size blocks: 61585473 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 60938441 aes-256-cbc's in 3.02s
Doing aes-256-cbc for 3s on 256 size blocks: 34632152 aes-256-cbc's in 3.01s
Doing aes-256-cbc for 3s on 1024 size blocks: 9779213 aes-256-cbc's in 3.03s
Doing aes-256-cbc for 3s on 8192 size blocks: 1320658 aes-256-cbc's in 3.05s
OpenSSL 1.0.1e-freebsd 11 Feb 2013
built on: Tue Aug 27 11:52:46 PDT 2013
options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: cc -O
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc     328455.86k  1289942.40k  2947600.93k  3303559.29k  3550795.60k

Just for people wondering what CPU:
CPU: AMD A10-5700 APU with Radeon(tm) HD Graphics    (3393.89-MHz K8-class CPU)

I guess now we need to figure out how to teach OpenSSL to use AES-NI
natively even when /dev/crypto is available... but at least we did solve
the (non-)issue of bad OpenSSL performance...

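One way to read the cryptodev (-elapsed) and native AES-NI encrypt tables side by side is to divide one by the other for each block size. The sketch below is plain arithmetic on the figures quoted above, with variable names of my own choosing. My reading that the gap at small blocks comes from a fixed per-request cost of going through /dev/crypto is an assumption, not something measured in this thread.

```python
# Encrypt throughput in 1000s of bytes/sec, copied from the tables above.
block_sizes = [16, 64, 256, 1024, 8192]
cryptodev_k = [5118.64, 19432.21, 67148.02, 168312.03, 279819.61]   # -elapsed run
native_k = [300633.49, 332504.26, 361369.46, 370239.01, 374117.13]  # cryptodev unloaded

# Ratio of native AES-NI throughput to cryptodev throughput per block size.
ratios = {size: n / c for size, n, c in zip(block_sizes, native_k, cryptodev_k)}

for size in block_sizes:
    print("%5d bytes: native is %.1fx cryptodev" % (size, ratios[size]))
```

With these numbers the native path is roughly 59 times faster at 16-byte blocks, and the advantage shrinks to about 1.3 times at 8192-byte blocks.
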
I will submit a patch to OpenSSL to not make the documentation of the
-elapsed option dependent on defines...

-- 
  John-Mark Gurney                              Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."