Date: Tue, 23 Oct 2012 00:04:17 -0700
From: John-Mark Gurney <jmg@funkthat.com>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: freebsd-arch@freebsd.org
Subject: Re: using SSE2 in kernel C code (improving AES-NI module)
Message-ID: <20121023070417.GD1563@funkthat.com>
In-Reply-To: <20121021061011.GG35915@deviant.kiev.zoral.com.ua>
References: <20121019233833.GS1967@funkthat.com> <20121020054847.GB35915@deviant.kiev.zoral.com.ua> <20121020171124.GU1967@funkthat.com> <CAGE5yCoM92rU7Ca7C7_x=3vXW%2BqO9Zc0uQhPURuMbstPDvq9yg@mail.gmail.com> <20121021024726.GA1563@funkthat.com> <20121021061011.GG35915@deviant.kiev.zoral.com.ua>

Konstantin Belousov wrote this message on Sun, Oct 21, 2012 at 09:10 +0300:
> On Sat, Oct 20, 2012 at 07:47:26PM -0700, John-Mark Gurney wrote:
> > Peter Wemm wrote this message on Sat, Oct 20, 2012 at 11:10 -0700:
> > > Or, another option.. do something like genassym or the many other
> > > kernel build tools. aicasm builds and runs a userland tool to
> > > generate something to build into the kernel. With sufficient
> > > cross-contamination safeguards I wonder if something similar might be
> > > able to be done here.
> >
> > Well, looks like I may this working... Turns out I can't name the file
> > .s otherwise config puts it in SFILES which causes all sorts of problems..
> > So, I went w/ .nos, does any one else have any suggestions?
> >
> > how does this look to people:
> > aesni_wrap2.nos optional aesni \
> >     dependency "$S/crypto/aesni/aesni_wrap2.c" \
> >     compile-with "${CC} -O3 -fPIC -S -o aesni_wrap2.nos $S/crypto/aesni/aesni_wrap2.c" \
> >     no-obj no-implicit-rule before-depend \
> >     clean "aesni_wrap2.nos"
> > aesni_wrap2.o optional aesni \
> >     dependency "aesni_wrap2.nos" \
> >     compile-with "${NORMAL_S} aesni_wrap2.nos" \
> >     no-implicit-rule \
> >     clean "aesni_wrap2.o"
> >
> > We'll have to do something similar in the module Makefile, but that is
> > easier...
> >
> > Also, I thought we had a better way to note that some devices depend
> > upon others than just throwing a depend error... If you include aesni
> > w/o crypto, you get error about missing cryptodev_if.h...
> >
> Hm, if such thing is possible, why do you need to compile through the
> .S at all ? All you need is to specify the special compiling flags,
> including -msse and -msse2.

Thanks, I managed to get it down to one...

> Note, you shall not need -fPIC, at least for amd64. I would suggest to use
> -O2, as well as to try to honour the -g settings.

If I don't do -fpic I get:
aesni_wrap2.o:(.eh_frame+0x20): relocation truncated to fit: R_X86_64_32 against `.text'

when linking the kernel...
If you can explain to me how to get rid of this error, I'll do it..

> Most likely, you can put the ${CFLAGS} on the command line, followed
> by -msse -msse2.

I can't use CFLAGS because it removes access to the xmmintrin.h header
file...

It looks like an option is to use:
-fpic ${COPTFLAGS:C/^-O2$/-O3/} ${DEBUG}

In my testing, -O2 is significantly slower, hence the bump to -O3:
x O2.txt
+ O3.txt
    N           Min           Max        Median           Avg        Stddev
x  20     1741.3491      1754.987     1752.9267     1751.5602     3.5616947
+  20      2223.217     2244.4501     2242.7028     2240.3183     5.7020691
Difference at 95.0% confidence
        488.758 +/- 3.04271
        27.9042% +/- 0.173715%
        (Student's t, pooled s = 4.75391)

Those are MB/sec...

Index: files.amd64
===================================================================
--- files.amd64 (revision 241041)
+++ files.amd64 (working copy)
@@ -137,6 +137,11 @@
 crypto/aesni/aeskeys_amd64.S optional aesni
 crypto/aesni/aesni.c optional aesni
 crypto/aesni/aesni_wrap.c optional aesni
+aesni_wrap2.o optional aesni \
+    dependency "$S/crypto/aesni/aesni_wrap2.c" \
+    compile-with "${CC} -c -fpic ${COPTFLAGS:C/^-O2$/-O3/} ${DEBUG} -o aesni_wrap2.o $S/crypto/aesni/aesni_wrap2.c" \
+    no-implicit-rule \
+    clean "aesni_wrap2.o"
 crypto/blowfish/bf_enc.c optional crypto | ipsec
 crypto/des/des_enc.c optional crypto | ipsec | netsmb
 crypto/via/padlock.c optional padlock

I still need to fix up i386, and will let people review a full patch to
address both arches before committing...

--
  John-Mark Gurney                              Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."