Date: Wed, 14 Jan 2015 10:21:24 +0300
From: rozhuk.im@gmail.com
To: "'John-Mark Gurney'" <jmg@funkthat.com>
Cc: freebsd-hackers@freebsd.org, freebsd-geom@freebsd.org
Subject: RE: ChaCha8/12/20 and GEOM ELI tests
Message-ID: <54b618f6.43ac700a.3509.2eae@mx.google.com>
In-Reply-To: <20150112233411.GP1949@funkthat.com>
References: <54b33bfa.e31b980a.3e5d.ffffc823@mx.google.com> <20150112072249.GM1949@funkthat.com> <54b43144.2d08980a.437b.0f8f@mx.google.com> <20150112233411.GP1949@funkthat.com>
I've updated the patch: XTS mode is deleted, and ChaCha/XChaCha are added to GELI.
http://netlab.linkpc.net/download/software/FreeBSD/patches/chacha.patch

> > > Also, where are the man page diffs?  They might have explained the
> > > difference between the two, and explained why two versions of chacha
> > > are needed...
> >
> > No man page diffs.
>
> You need to document the new defines in crypto(9), and document the
> various parameters in crypto(7)...  Yes, not all modes are documented
> in crypto(7), but going forward, at a minimum we need to document new
> additions...
>
> I'll admit I didn't document the other algorithms as I'm not as familiar
> w/ those as the ones that I worked on...

Agree.

> > Man pages do not explain the difference between AES-CBC and AES-XTS...
>
> True, but CBC and XTS (which includes a reference to the standard) are
> a lot more searchable/common knowledge than xchacha..  google thinks you
> mean chacha, and xchacha just turns up a bunch of people on various
> networks...  Not until you search on xchacha crypto do you get a
> relevant page...  Also, wikipedia doesn't have an entry for xchacha,
> nor does the chacha (cipher) page list it...  So, when documenting
> xchacha in crypto(7), include a link to the description/standard...

Agree.

> > > Is there a reason you decided to write your own ChaCha
> > > implementation instead of using one of the standard ones?  Did you
> > > run performance tests between your implementation and others?
> >
> > Reference ChaCha and reference (FreeBSD) XTS (4k sector):
> > ChaCha8-XTS-256   = 199518722 bytes/sec
> > ChaCha12-XTS-256  = 179029849 bytes/sec
> > ChaCha20-XTS-256  = 149447317 bytes/sec
> > XChaCha8-XTS-256  = 195675728 bytes/sec
> > XChaCha12-XTS-256 = 175790196 bytes/sec
> > XChaCha20-XTS-256 = 147939263 bytes/sec
>
> So, you're seeing a 33%-50% improvement, good to hear...
>
> Also, do you publish this implementation somewhere?  If so, it'd be
> helpful to include a url to where up to date versions can be
> obtained...
>
> If you don't plan on publishing/maintaining it outside of FreeBSD, then
> we need to unifdef out the Windows parts of it for our tree...

On my own site:
http://www.netlab.linkpc.net/download/software/SDK/core/include/chacha.h
(working copy)

This code is not FreeBSD kernel specific; I also test it under Windows
(32-bit) and FreeBSD user space.
geli (user space) also uses this code to encrypt/decrypt the
password/metadata.

> > This is the reference version adapted for use in /dev/crypto.
> > chacha_block_unaligneg() - the reference version of data block
> > processing.
> > Macros are used for readability.
> > chacha_block_aligned() - the same, but works on aligned data.
>
> Please use the macro __NO_STRICT_ALIGNMENT to decide if special work is
> necessary to handle the alignment...

I already use the ALIGNED_POINTER() macro.

> What is the CHACHA_X64 macro for?  If that is to detect LP64 platforms,
> please use the macro __LP64__ to decide this...  Have you done
> performance evaluations on 32bit arches to make sure double rounds
> aren't a benefit there too?

__LP64__ - done.
I ran the self-test on a 32-bit system; all tests passed, with no speed
degradation.

> Use the byteorder(9) macros to encode/decode integers instead of
> rolling your own (U8TO32_LITTLE and U32TO8_LITTLE)...  Turns out
> compilers aren't good at optimizing this type of code, and platforms
> may have assembly optimized versions for these...

1. U8TO32_LITTLE / U32TO8_LITTLE can read/write unaligned data.
   Can htonl() handle unaligned input on arm?
2. On LE systems no conversion is required.

> > To increase speed, data is processed 4/8 bytes at a time instead of
> > one byte at a time.
> > The data in the context is 8-byte aligned.
> > To increase security, all data, including temporaries, is kept in a
> > context that is filled with zeros on completion of the work.
>
> Please use the function explicit_bzero that is available for all of
> these instead of creating your own..

explicit_bzero() is available only in FreeBSD kernel space.
I use bzero() in chacha_zerokey() / xchacha_zerokey(), like all the other
***_zerokey() functions in this file.

> > Final:
> >
> > struct aes_xts_ctx {
> > 	rijndael_ctx key1;
> > 	rijndael_ctx key2;
> > 	uint64_t tweak[(AES_XTS_BLOCKSIZE / sizeof(uint64_t))];
> > };
> >
> > void
> > aes_xts_reinit(caddr_t key, u_int8_t *iv)
> > {
> > 	struct aes_xts_ctx *ctx = (struct aes_xts_ctx *)key;
> >
> > 	/*
> > 	 * Prepare tweak as E_k2(IV). IV is specified as LE representation
> > 	 * of a 64-bit block number which we allow to be passed in directly.
> > 	 */
> > 	if (ALIGNED_POINTER(iv, uint64_t)) {
> > 		ctx->tweak[0] = (*((uint64_t*)(void*)iv));
> > 	} else {
> > 		bcopy(iv, ctx->tweak, sizeof(uint64_t));
> > 	}
> > 	/* Convert to LE. */
> > 	ctx->tweak[0] = htole64(ctx->tweak[0]);
>
> Hmm... this line bothers me..  I'll need to spend more time reading up
> to decide if it is buggy or not...  Is ctx->tweak in host order? or LE
> order?  I believe it's supposed to be LE order, as it gets passed
> directly to _encrypt..  I'm also not sure if the original code is BE
> clean, which is part of my problem...

I hope to see a 10x optimized version soon :)
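
On the unaligned read/write question (U8TO32_LITTLE / U32TO8_LITTLE versus
byteorder(9)), here is a minimal sketch of byte-wise little-endian
load/store helpers. They give the same result on LE and BE hosts and never
do a wide load through a possibly misaligned pointer. The names load32_le(),
store32_le() and le_roundtrip_check() are hypothetical and are not taken
from the patch. As far as I know, byteorder(9) also documents
le32dec()/le32enc() in <sys/endian.h> for the same byte-stream use, but
treat that as an assumption to check against the man page.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers (not from chacha.h): decode/encode a 32-bit
 * little-endian value one byte at a time, so the pointer may be unaligned
 * and the result does not depend on host byte order.
 */
static inline uint32_t
load32_le(const void *p)
{
	const uint8_t *b = p;

	return ((uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	    ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24));
}

static inline void
store32_le(void *p, uint32_t v)
{
	uint8_t *b = p;

	b[0] = (uint8_t)(v & 0xff);
	b[1] = (uint8_t)((v >> 8) & 0xff);
	b[2] = (uint8_t)((v >> 16) & 0xff);
	b[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Store at an odd offset, check the byte layout, then read it back. */
static inline int
le_roundtrip_check(void)
{
	uint8_t buf[8] = { 0 };

	store32_le(buf + 1, 0x12345678u);
	assert(buf[1] == 0x78 && buf[4] == 0x12);
	return (load32_le(buf + 1) == 0x12345678u);
}
```

On strict-alignment platforms this avoids the fault a direct uint32_t
dereference could cause; on LE hosts without alignment restrictions a
compiler may or may not collapse it into a single load, so any speed claim
would need to be measured.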