Date: Tue, 16 Jan 2018 10:50:47 +0100 (CET)
From: Emeric POUPON <emeric.poupon@stormshield.eu>
To: John Baldwin <jhb@freebsd.org>
Cc: arch@freebsd.org
Subject: Re: Ranting about OCF / crypto(9)
Message-ID: <973472132.3159693.1516096247163.JavaMail.zimbra@stormshield.eu>
In-Reply-To: <3790717.UIxaijsHl3@ralph.baldwin.cx>
References: <3790717.UIxaijsHl3@ralph.baldwin.cx>
Hello,

> - We need to not treat accelerated software (e.g. AES-NI) as a
>   hardware interface. Right now OCF's model of priorities when
>   trying to choose a backend driver for a session only has two
>   "levels" software vs hardware and aesni(4) (and the ARMv8 variant)
>   are lumped into the hardware bucket so that they have precedence
>   over the "dumb" software implementation. However, the accelerated
>   software algorithms do need some of the same support features of
>   the "dumb" software implementation (such as being scheduled on a
>   thread pool to use CPU cycles) that are not needed by other "hardware"
>   engines. OCF needs to understand this distinction.
>
> - Somewhat related, we should try to use accelerated software when
>   possible (e.g. AES-CBC with SHA) doesn't use AES-NI unless the
>   CPU supports accelerated SHA. Ideally for this case we'd still
>   use AES-NI for the AES portion along with the software SHA
>   implementation (and we'd do it one pass over the data rather than
>   two when possible).

Indeed, it would make sense to extend the software driver to make use of
available software acceleration. For IPsec, this would make it possible to
accelerate the encryption part without accelerating the authentication
part, which is still a very common use case.
Actually, we have some patches that do this; maybe it would make sense to
try to distribute them? This would require a significant amount of work,
though.

> This is all I could think of today. What do other folks think?

Well, the batch mode and its queue are questionable.
When several hardware drivers are in use, having a single process dispatch
the crypto jobs to all the drivers and compute the CRYPTO_HINT_MORE flag
sounds inefficient. Maybe we would need a dedicated queue/thread per driver
if we really want the batch mode to be effective?
Furthermore, hardware drivers often already manage internal job queues.

I guess the only benefit of the batch mode would be to allow a lot of
crypto requests to be queued in the framework, so that consumers do not
have to deal with the requests they fail to enqueue?
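
To make the per-driver queue idea a bit more concrete, here is a minimal
userspace sketch. It is not OCF code: every name in it (sketch_drvq,
sketch_enqueue, sketch_drain, SKETCH_HINT_MORE) is hypothetical, and
SKETCH_HINT_MORE merely stands in for CRYPTO_HINT_MORE. The only point is
that once each driver owns its queue, the "more jobs follow" hint can be
derived from that queue's own depth instead of by a central dispatcher.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this is the OCF API. */
#define SKETCH_HINT_MORE 0x1    /* stands in for CRYPTO_HINT_MORE */
#define SKETCH_QLEN      8      /* arbitrary per-driver queue depth */

struct sketch_job {
	int id;
	int hint_seen;          /* hint the driver was given for this job */
};

/* One FIFO ring per driver, drained by that driver's own thread. */
struct sketch_drvq {
	struct sketch_job *slots[SKETCH_QLEN];
	size_t head;            /* index of the oldest queued job */
	size_t count;           /* number of queued jobs */
	int (*process)(struct sketch_job *job, int hint);
};

/* Queue a job for this driver; returns -1 when the ring is full. */
int
sketch_enqueue(struct sketch_drvq *q, struct sketch_job *job)
{
	if (q->count == SKETCH_QLEN)
		return (-1);
	q->slots[(q->head + q->count) % SKETCH_QLEN] = job;
	q->count++;
	return (0);
}

/*
 * Hand queued jobs to the driver in FIFO order. The "more" hint is set
 * only when another job is already waiting in this same queue. A job the
 * driver rejects stays at the head of the queue and draining stops.
 * Returns the number of jobs the driver accepted.
 */
size_t
sketch_drain(struct sketch_drvq *q)
{
	size_t done = 0;

	assert(q->process != NULL);
	while (q->count > 0) {
		struct sketch_job *job = q->slots[q->head];
		int hint = (q->count > 1) ? SKETCH_HINT_MORE : 0;

		if (q->process(job, hint) != 0)
			break;
		q->head = (q->head + 1) % SKETCH_QLEN;
		q->count--;
		done++;
	}
	return (done);
}

/* Toy "driver" that only records the hint it received. */
int
sketch_record_hint(struct sketch_job *job, int hint)
{
	job->hint_seen = hint;
	return (0);
}
```

With three jobs queued on one such queue and sketch_record_hint as the
driver callback, the first two jobs are processed with the hint set and
the last one without it. In a real framework each queue would also need
its own lock and a worker thread, which this sketch leaves out.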