Date: Mon, 27 Feb 2017 15:01:41 +0100
From: Ed Schouten <ed@nuxi.nl>
To: Andriy Gapon <avg@freebsd.org>
Cc: src-committers <src-committers@freebsd.org>, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-10@freebsd.org
Subject: Re: svn commit: r314335 - stable/10/sys/crypto/sha2
Message-ID: <CABh_MK==nm1rpy0STaECBVF-fT%2B5C5BapDWjkOOqAEeszOBknw@mail.gmail.com>
In-Reply-To: <201702271305.v1RD5HOi077424@repo.freebsd.org>
References: <201702271305.v1RD5HOi077424@repo.freebsd.org>

Hi Andriy,

2017-02-27 14:05 GMT+01:00 Andriy Gapon <avg@freebsd.org>:
> +/* Message schedule computation */
> +#define MSCH(W, ii, i) \
> +    W[i + ii + 16] = s1(W[i + ii + 14]) + W[i + ii + 9] + s0(W[i + ii + 1]) + W[i + ii]

[snip]

>      uint32_t W[64];

[snip]

> +    for (i = 0; i < 64; i += 16) {
> +        RNDr(S, W, 1, i);
> +        RNDr(S, W, 2, i);
> +        RNDr(S, W, 3, i);
> +        RNDr(S, W, 4, i);
> +        RNDr(S, W, 5, i);
> +        RNDr(S, W, 6, i);
> +        RNDr(S, W, 7, i);
> +        RNDr(S, W, 8, i);
> +        RNDr(S, W, 9, i);
> +        RNDr(S, W, 10, i);
> +        RNDr(S, W, 11, i);
> +        RNDr(S, W, 12, i);
> +        RNDr(S, W, 13, i);
> +        RNDr(S, W, 14, i);
> +        RNDr(S, W, 15, i);
> +
> +        if (i == 48)
> +            break;
> +        MSCH(W, 0, i);
> +        MSCH(W, 1, i);
> +        MSCH(W, 2, i);
> +        MSCH(W, 3, i);
> +        MSCH(W, 4, i);
> +        MSCH(W, 5, i);
> +        MSCH(W, 6, i);
> +        MSCH(W, 7, i);
> +        MSCH(W, 8, i);
> +        MSCH(W, 9, i);
> +        MSCH(W, 10, i);
> +        MSCH(W, 11, i);
> +        MSCH(W, 12, i);
> +        MSCH(W, 13, i);
> +        MSCH(W, 14, i);
> +        MSCH(W, 15, i);
> +    }

Something interesting that I noticed some time ago when comparing the
various SHA-{256,512} implementations: there is no need to store the
entire extended message in W. During every iteration of this loop,
RNDr() and MSCH() never go more than 16 elements back. Say you were to
modify MSCH() to something like this:

> +#define MSCH(W, ii) \
> +    W[ii] += s1(W[(ii + 14) % 16]) + W[(ii + 9) % 16] + s0(W[(ii + 1) % 16])

It would then compute the next chunk of the extended message in place.
RNDr() must then be adjusted to use W[i] instead of W[i + ii], of
course. W then only needs to hold 16 elements instead of 64 (SHA-256)
or 80 (SHA-512).

-- 
Ed Schouten <ed@nuxi.nl>
Nuxi, 's-Hertogenbosch, the Netherlands
KvK-nr.: 62051717
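
For illustration, the in-place schedule described in the message can be sketched as a standalone SHA-256 compression function in plain C. This is only a sketch and not the FreeBSD code: the names sha256_block_rolling() and sha256_abc_matches() are hypothetical, and the rounds are written as a simple loop rather than with the unrolled RNDr()/MSCH() macros. It assumes the schedule word is updated just before the round that consumes it, which yields the same values as updating all 16 words after each group of 16 rounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* SHA-256 round constants (FIPS 180-4). */
static const uint32_t K[64] = {
    0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
    0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
    0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
    0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
    0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
    0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
    0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
    0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2
};

static uint32_t rotr(uint32_t x, unsigned n) { return (x >> n) | (x << (32 - n)); }
static uint32_t big_s0(uint32_t x) { return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22); }
static uint32_t big_s1(uint32_t x) { return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25); }
static uint32_t small_s0(uint32_t x) { return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3); }
static uint32_t small_s1(uint32_t x) { return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10); }

/* Process one 64-byte block using a 16-word rolling message schedule. */
void sha256_block_rolling(uint32_t state[8], const uint8_t block[64])
{
    uint32_t W[16], v[8], t1, t2;
    int t, j;

    assert(state != NULL && block != NULL);

    /* Load the block as 16 big-endian 32-bit words. */
    for (j = 0; j < 16; j++)
        W[j] = (uint32_t)block[4 * j] << 24 | (uint32_t)block[4 * j + 1] << 16 |
               (uint32_t)block[4 * j + 2] << 8 | (uint32_t)block[4 * j + 3];
    for (j = 0; j < 8; j++)
        v[j] = state[j];

    for (t = 0; t < 64; t++) {
        j = t % 16;
        /* From round 16 on, slot j still holds W[t - 16]; update it in place. */
        if (t >= 16)
            W[j] += small_s1(W[(j + 14) % 16]) + W[(j + 9) % 16] +
                    small_s0(W[(j + 1) % 16]);
        t1 = v[7] + big_s1(v[4]) + ((v[4] & v[5]) ^ (~v[4] & v[6])) + K[t] + W[j];
        t2 = big_s0(v[0]) + ((v[0] & v[1]) ^ (v[0] & v[2]) ^ (v[1] & v[2]));
        v[7] = v[6]; v[6] = v[5]; v[5] = v[4]; v[4] = v[3] + t1;
        v[3] = v[2]; v[2] = v[1]; v[1] = v[0]; v[0] = t1 + t2;
    }
    for (j = 0; j < 8; j++)
        state[j] += v[j];
}

/* Hash the single padded block for "abc"; return 1 if it matches the FIPS 180-4 digest. */
int sha256_abc_matches(void)
{
    static const uint32_t expect[8] = {
        0xba7816bf, 0x8f01cfea, 0x414140de, 0x5dae2223,
        0xb00361a3, 0x96177a9c, 0xb410ff61, 0xf20015ad
    };
    uint32_t st[8] = {
        0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
        0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19
    };
    uint8_t block[64] = { 'a', 'b', 'c', 0x80 };  /* message, then the 0x80 pad byte */
    int j;

    block[63] = 24;  /* message length in bits, big-endian in the last 8 bytes */
    sha256_block_rolling(st, block);
    for (j = 0; j < 8; j++)
        if (st[j] != expect[j])
            return 0;
    return 1;
}
```

The indices (j + 14) % 16, (j + 9) % 16 and (j + 1) % 16 correspond to W[t - 2], W[t - 7] and W[t - 15] in the usual 64-word formulation, so the working buffer shrinks from 256 bytes to 64 bytes at the cost of a few modulo operations, which disappear once the loop is unrolled with constant indices.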