From: Konstantin Belousov
To: freebsd-arch@FreeBSD.org
Date: Sat, 20 Oct 2012 08:48:47 +0300
Subject: Re: using SSE2 in kernel C code (improving AES-NI module)
Message-ID: <20121020054847.GB35915@deviant.kiev.zoral.com.ua>
In-Reply-To: <20121019233833.GS1967@funkthat.com>

On Fri, Oct 19, 2012 at 04:38:33PM -0700, John-Mark Gurney wrote:
> So, the AES-NI module already uses SSE2 instructions, but it does so
> only in assembly.  I have improved the performance of the AES-NI
> module's implementation, but this involves me using additional SSE2
> instructions.
>
> In order to keep my sanity, I did part of the new code in C using
> gcc native types and xmmintrin.h, but we do not support this header in
> the kernel.  This means we cannot simply add the new code to the
> kernel...
>
> Any good ideas on how to integrate this code into the kernel build?
>
> I have used the trick of producing assembly of the C file with gcc -S,
> and then compiling the assembly into the kernel, but I'm not sure if
> that's the best way, and even if it is the best, how I'd do the
> generation as part of the kernel build...  Or would it be ok to commit
> both, and require a regeneration each time the C file is updated?
>
> In my testing in userland w/o the opencrypto framework overhead, the old
> code would only get about ~250MB/sec.  With the new code I get
> ~2200MB/sec...
>
> Sample code:
> static inline __m128i
> xts_crank_lfsr(__m128i inp)
> {
> 	const __m128i alphamask = _mm_set_epi32(1, 1, 1, AES_XTS_ALPHA);
> 	__m128i xtweak, ret;
>
> 	/* set up xor mask */
> 	xtweak = _mm_shuffle_epi32(inp, 0x93);
> 	xtweak = _mm_srai_epi32(xtweak, 31);
> 	xtweak &= alphamask;
>
> 	/* next term */
> 	ret = _mm_slli_epi32(inp, 1);
> 	ret ^= xtweak;
>
> 	return ret;
> }

The current structure of the aes-ni driver is partly dictated by the
issue you noted.  We cannot use SSE intrinsics in the kernel, and huge
inline assembler fragments are hard to write.

I prefer to keep the optimized code in separate, hand-written .S files.
If needed, I can help you with the transition.  I would need the full
patch in order to rewrite the code.
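
For reference while porting, here is a minimal scalar sketch of what the
quoted xts_crank_lfsr() appears to compute: a 128-bit left shift by one
bit, with the carried-out top bit folded back into the low byte.  It
assumes AES_XTS_ALPHA is 0x87 (the usual XTS reduction constant; the
value is not given in this thread) and that the tweak is stored
little-endian, as in the SSE2 version.  The names xts_crank_lfsr_scalar
and xts_crank_lfsr_check are hypothetical and not part of the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: standard XTS constant for x^128 + x^7 + x^2 + x + 1. */
#define AES_XTS_ALPHA	0x87

/*
 * Advance the XTS tweak by one block: shift the 128-bit little-endian
 * value left by one bit, and if a bit falls off the top, XOR the
 * reduction constant into the lowest byte.
 */
static void
xts_crank_lfsr_scalar(uint8_t tweak[16])
{
	unsigned int carry = 0, next;
	int i;

	for (i = 0; i < 16; i++) {
		next = tweak[i] >> 7;		/* bit moving to next byte */
		tweak[i] = (uint8_t)((tweak[i] << 1) | carry);
		carry = next;
	}
	if (carry)
		tweak[0] ^= AES_XTS_ALPHA;
}

/* Two hand-checkable cases: a plain shift and a wrap-around. */
static void
xts_crank_lfsr_check(void)
{
	uint8_t t[16] = { 1 };

	xts_crank_lfsr_scalar(t);
	assert(t[0] == 2);

	memset(t, 0, sizeof(t));
	t[15] = 0x80;			/* top bit of the 128-bit value */
	xts_crank_lfsr_scalar(t);
	assert(t[0] == AES_XTS_ALPHA && t[15] == 0);
}
```

A version like this needs no SSE state and could serve as a comparison
point for a hand-written .S routine, though it would be much slower
than the SSE2 code.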