From: Allan Jude
Date: Sat, 1 Jul 2017 21:18:06 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r320554 - in head: lib/libmd sys/modules/crypto
Message-Id: <201707012118.v61LI61I018108@repo.freebsd.org>

Author: allanjude
Date: Sat Jul 1 21:18:06 2017
New Revision: 320554
URL:
https://svnweb.freebsd.org/changeset/base/320554

Log:
  Increase loop unrolling for skein hashes

  This patch was inspired by an opposite change made to shrink the code for the
  boot loader.

  On my i7-4770, it increases the skein1024 speed from 470 to 550 MB/s

  Reviewed by: sbruno
  MFC after: 1 month
  Sponsored by: ScaleEngine Inc.
  Differential Revision: https://reviews.freebsd.org/D7824

Modified:
  head/lib/libmd/Makefile
  head/sys/modules/crypto/Makefile

Modified: head/lib/libmd/Makefile
==============================================================================
--- head/lib/libmd/Makefile Sat Jul 1 20:25:22 2017 (r320553)
+++ head/lib/libmd/Makefile Sat Jul 1 21:18:06 2017 (r320554)
@@ -88,6 +88,8 @@ sys/md5.h: ${SRCTOP}/sys/${.TARGET} .NOMETA
 CFLAGS+= -I${.CURDIR} -I${SRCTOP}/sys/crypto/sha2
 CFLAGS+= -I${SRCTOP}/sys/crypto/skein
 CFLAGS+= -DWEAK_REFS
+# unroll the 256 and 512 loops, half unroll the 1024
+CFLAGS+= -DSKEIN_LOOP=995
 .PATH: ${.CURDIR}/${MACHINE_ARCH} ${SRCTOP}/sys/crypto/sha2
 .PATH: ${SRCTOP}/sys/crypto/skein ${SRCTOP}/sys/crypto/skein/${MACHINE_ARCH}

@@ -101,6 +103,8 @@ CFLAGS+= -DRMD160_ASM
 .endif
 .if exists(${MACHINE_ARCH}/skein_block_asm.s)
 AFLAGS += --strip-local-absolute
+# Fully unroll all loops in the assembly optimized version
+AFLAGS+= --defsym SKEIN_LOOP=0
 SRCS+= skein_block_asm.s
 CFLAGS+= -DSKEIN_ASM -DSKEIN_USE_ASM=1792 # list of block functions to replace with assembly: 256+512+1024 = 1792
 .endif

Modified: head/sys/modules/crypto/Makefile
==============================================================================
--- head/sys/modules/crypto/Makefile Sat Jul 1 20:25:22 2017 (r320553)
+++ head/sys/modules/crypto/Makefile Sat Jul 1 21:18:06 2017 (r320554)
@@ -19,11 +19,15 @@ SRCS += camellia.c camellia-api.c
 SRCS += des_ecb.c des_enc.c des_setkey.c
 SRCS += sha1.c sha256c.c sha512c.c
 SRCS += skein.c skein_block.c
+# unroll the 256 and 512 loops, half unroll the 1024
+CFLAGS+= -DSKEIN_LOOP=995
 .if exists(${MACHINE_ARCH}/skein_block_asm.s)
 .PATH: ${SRCTOP}/sys/crypto/skein/${MACHINE_ARCH}
 SRCS += skein_block_asm.s
 CFLAGS += -DSKEIN_ASM -DSKEIN_USE_ASM=1792 # list of block functions to replace with assembly: 256+512+1024 = 1792
 ACFLAGS += -DELF -Wa,--noexecstack
+# Fully unroll all loops in the assembly optimized version
+AFLAGS+= --defsym SKEIN_LOOP=0
 .endif
 SRCS += siphash.c
 SRCS += gmac.c gfmult.c
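
A note on reading SKEIN_LOOP=995, since the Makefile comments only summarize it. As far as I recall, the Skein reference block code treats the value as three decimal digits, one each for the 256-, 512- and 1024-bit block functions, where 0 means fully unrolled and any other digit is the number of 8-round groups handled per loop pass. The sketch below restates that reading in standalone C. The macro and function names are illustrative and not taken from the tree, and the decoding itself is an assumption to verify against sys/crypto/skein/skein_block.c.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Assumed decoding of SKEIN_LOOP: hundreds digit -> Skein-256,
 * tens digit -> Skein-512, ones digit -> Skein-1024.
 * These macro names are illustrative, not the ones used in the tree.
 */
#define UNROLL_DIGIT_256(loop)   (((loop) / 100) % 10)
#define UNROLL_DIGIT_512(loop)   (((loop) / 10) % 10)
#define UNROLL_DIGIT_1024(loop)  ((loop) % 10)

/* Threefish round counts: 72 for the 256- and 512-bit variants, 80 for 1024. */
enum { ROUNDS_256 = 72, ROUNDS_512 = 72, ROUNDS_1024 = 80 };

/*
 * Hypothetical helper: number of passes through the round loop.
 * digit == 0 is taken to mean "fully unrolled" (straight-line code, one pass).
 * Otherwise each pass covers `digit` groups of 8 rounds. Returns -1 when the
 * digit does not evenly divide the number of 8-round groups.
 */
static int
loop_passes(int rounds, int digit)
{
	int groups = rounds / 8;

	assert(digit >= 0 && digit <= 9);
	if (digit == 0)
		return (1);
	if (groups % digit != 0)
		return (-1);
	return (groups / digit);
}

/* Print the assumed interpretation of a SKEIN_LOOP value. */
static void
describe_skein_loop(int loop)
{
	printf("SKEIN_LOOP=%03d: 256 -> %d, 512 -> %d, 1024 -> %d pass(es)\n",
	    loop,
	    loop_passes(ROUNDS_256, UNROLL_DIGIT_256(loop)),
	    loop_passes(ROUNDS_512, UNROLL_DIGIT_512(loop)),
	    loop_passes(ROUNDS_1024, UNROLL_DIGIT_1024(loop)));
}
```

Under this reading, describe_skein_loop(995) reports one pass for the 256 and 512 variants and two passes for 1024, which matches "unroll the 256 and 512 loops, half unroll the 1024". describe_skein_loop(0) reports one pass everywhere, matching the fully unrolled assembly setting.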