Date: Mon, 21 Oct 2013 11:53:08 -0700
From: John-Mark Gurney
To: Poul-Henning Kamp
Cc: Andre Oppermann , Mark R V Murray , freebsd-arch@freebsd.org
Subject: Re: always load aesni or load it when cpu supports it
Message-ID: <20131021185308.GA56872@funkthat.com>
In-Reply-To: <37803.1382380956@critter.freebsd.dk>

Poul-Henning Kamp wrote this message on Mon, Oct 21, 2013 at 18:42 +0000:
> In message <20131021183658.GY56872@funkthat.com>, John-Mark Gurney writes:
> >Poul-Henning Kamp wrote this message on Mon, Oct 21, 2013 at 18:32 +0000:
>
> >Clearly you didn't completely read my first email, so you're proposing
> >that we ALWAYS use software AES and never use AES-NI?  At least in the
> >context of my email, that is what the above statement says...
>
> No, what I'm saying is that we should offer two APIs: One synchronous
> and one for IPSEC and any other async usage (personally I can't think
> of any but...)

Ahh, that is very different from disagreeing w/ my size split between
the two APIs...  Thanks for stating your disagreement w/ my original
proposal...

> Those APIs should do whatever is fastest, for the request it gets.

Except that it isn't that simple...  AES-NI isn't free in the kernel:
we have to dump the FPU context and do other work, which means that for
single-block AES it's probably faster to do pure software than to do
the FPU work necessary to use AES-NI...

Also, my proposal was about how to get us closer to the end goal w/o
breaking the entire kernel...

> We do *not* want to pollute all crypto-using code with heuristics to
> guess which API to call for which request and when.

Except that the code using the API knows what it is likely to do: is it
just one block here and there, or tons of large blocks?

You're the one who wants a "simple synchronous" API, except that we
can't have both a performant implementation and a simple synchronous
API...  We get the performance by moving the expensive parts outside
the inner loop... which can only be achieved by a more complicated
API...

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
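
The point about moving the expensive parts outside the inner loop can be
sketched in userland C.  This is only an illustration under stated
assumptions, not FreeBSD kernel code: fpu_enter()/fpu_leave(),
simple_encrypt() and the session_*() functions are hypothetical names,
the FPU cost is simulated by a counter, and the "cipher" is a
placeholder XOR rather than AES.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 16

/* Counts simulated FPU context saves; stands in for the real cost. */
static unsigned long fpu_saves;

/* Hypothetical stand-ins for saving/restoring FPU state in the kernel. */
static void fpu_enter(void) { fpu_saves++; }
static void fpu_leave(void) { /* a real restore would happen here */ }

/* Placeholder transform (XOR with the key).  NOT AES; illustration only. */
static void
toy_block(const uint8_t *key, const uint8_t *in, uint8_t *out)
{
	for (size_t i = 0; i < BLOCK_SIZE; i++)
		out[i] = in[i] ^ key[i];
}

/* "Simple synchronous" API: every call pays for the FPU context. */
static void
simple_encrypt(const uint8_t *key, const uint8_t *in, uint8_t *out)
{
	fpu_enter();
	toy_block(key, in, out);
	fpu_leave();
}

/* More complicated API: the caller brackets a whole batch of blocks. */
struct session {
	const uint8_t *key;
	int active;
};

static void
session_begin(struct session *s, const uint8_t *key)
{
	s->key = key;
	s->active = 1;
	fpu_enter();		/* expensive part, paid once per batch */
}

static void
session_encrypt(struct session *s, const uint8_t *in, uint8_t *out)
{
	assert(s->active);	/* must be called between begin and end */
	toy_block(s->key, in, out);
}

static void
session_end(struct session *s)
{
	fpu_leave();
	s->active = 0;
}

/* Encrypt n blocks with the simple API; return the FPU saves it cost. */
static unsigned long
encrypt_simple_n(const uint8_t *key, const uint8_t *in, uint8_t *out, size_t n)
{
	unsigned long before = fpu_saves;

	for (size_t i = 0; i < n; i++)
		simple_encrypt(key, in + i * BLOCK_SIZE, out + i * BLOCK_SIZE);
	return (fpu_saves - before);
}

/* Encrypt n blocks inside one session; return the FPU saves it cost. */
static unsigned long
encrypt_session_n(const uint8_t *key, const uint8_t *in, uint8_t *out, size_t n)
{
	unsigned long before = fpu_saves;
	struct session s;

	session_begin(&s, key);
	for (size_t i = 0; i < n; i++)
		session_encrypt(&s, in + i * BLOCK_SIZE, out + i * BLOCK_SIZE);
	session_end(&s);
	return (fpu_saves - before);
}
```

In this sketch, n blocks cost n simulated context saves through the
simple API and one through the session API, with identical output.  For
a single block the two cost the same, so the session API only pays off
when the caller knows it has many blocks to process.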