Date:      Tue, 16 Jan 2018 10:50:47 +0100 (CET)
From:      Emeric POUPON <emeric.poupon@stormshield.eu>
To:        John Baldwin <jhb@freebsd.org>
Cc:        arch@freebsd.org
Subject:   Re: Ranting about OCF / crypto(9)
Message-ID:  <973472132.3159693.1516096247163.JavaMail.zimbra@stormshield.eu>
In-Reply-To: <3790717.UIxaijsHl3@ralph.baldwin.cx>
References:  <3790717.UIxaijsHl3@ralph.baldwin.cx>

Hello,

> - We need to not treat accelerated software (e.g. AES-NI) as a
>  hardware interface.  Right now OCF's model of priorities when
>  trying to choose a backend driver for a session only has two
>  "levels" software vs hardware and aesni(4) (and the ARMv8 variant)
>  are lumped into the hardware bucket so that they have precedence
>  over the "dumb" software implementation.  However, the accelerated
>  software algorithms do need some of the same support features of
>  the "dumb" software implementation (such as being scheduled on a
>  thread pool to use CPU cycles) that are not needed by other "hardware"
>  engines.  OCF needs to understand this distinction.
>
> - Somewhat related, we should try to use accelerated software when
>  possible (e.g. AES-CBC with SHA) doesn't use AES-NI unless the
>  CPU supports accelerated SHA.  Ideally for this case we'd still
>  use AES-NI for the AES portion along with the software SHA
>  implementation (and we'd do it one pass over the data rather than
>  two when possible).

Indeed, it would make sense to extend the software driver to make use
of available software acceleration. For IPsec, this would make it
possible to accelerate the encryption part without accelerating the
authentication part, which is still a very common use case.
Actually, we have some patches that do that; maybe it would make sense
to try to distribute them? This would require quite a significant amount
of work, though.
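
To make the idea concrete, here is a minimal userland sketch of choosing
an implementation per transform rather than per session. It is not taken
from the patches mentioned above, and every name in it (impl_kind,
cpu_caps, session_plan, plan_session) is hypothetical, not an existing
OCF identifier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical implementation kinds; not actual OCF identifiers. */
enum impl_kind { IMPL_NONE, IMPL_SOFT_PLAIN, IMPL_SOFT_ACCEL };

/* Hypothetical CPU capabilities, assumed to be probed once at attach. */
struct cpu_caps {
	bool has_aes;	/* e.g. AES-NI is available */
	bool has_sha;	/* e.g. SHA extensions are available */
};

/* Hypothetical per-session plan: one independent choice per transform. */
struct session_plan {
	enum impl_kind cipher;
	enum impl_kind auth;
};

/* Pick the accelerated code path when available, plain software otherwise. */
static enum impl_kind
pick(bool wanted, bool accel_available)
{
	if (!wanted)
		return (IMPL_NONE);
	return (accel_available ? IMPL_SOFT_ACCEL : IMPL_SOFT_PLAIN);
}

/*
 * Decide cipher and auth separately, so that missing SHA acceleration
 * does not prevent the AES part from using the accelerated code.
 */
static struct session_plan
plan_session(const struct cpu_caps *caps, bool want_aes, bool want_sha)
{
	struct session_plan p;

	assert(caps != NULL);
	p.cipher = pick(want_aes, caps->has_aes);
	p.auth = pick(want_sha, caps->has_sha);
	return (p);
}
```

With has_aes set and has_sha clear, an AES-CBC + SHA session would get
the accelerated cipher and the plain software hash.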
 
> This is all I could think of today.  What do other folks think?

Well, the batch mode and its queue are questionable. Indeed, when using
several hardware drivers, having a single process dispatch the
crypto jobs to the drivers and calculate the CRYPTO_HINT_MORE flag
sounds inefficient. Maybe we would need a dedicated queue/thread per
driver if we really want the batch mode to be effective?
Furthermore, hardware drivers often already manage internal queues for
jobs. I guess the only benefit of the batch mode would be to allow a
lot of crypto requests to be queued in the framework, so that the
consumers do not have to deal with the crypto requests they fail to
enqueue?
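
As a rough illustration of the per-driver queue idea, here is a userland
sketch in which each driver owns its FIFO and the "more" hint is derived
only from that driver's own backlog. All names are hypothetical
(driver_queue, dq_enqueue, dq_drain, record_process), HINT_MORE is only a
stand-in for CRYPTO_HINT_MORE with an illustrative value, and the
per-queue lock and worker thread a kernel version would need are left out.

```c
#include <assert.h>
#include <stddef.h>

#define HINT_MORE 0x1	/* stand-in for CRYPTO_HINT_MORE (illustrative value) */

struct request {
	int id;
	struct request *next;
};

/* Hypothetical per-driver queue; one instance per backend driver. */
struct driver_queue {
	struct request *head;
	struct request *tail;
	/* Hypothetical driver entry point. */
	int (*process)(void *softc, struct request *req, int hint);
	void *softc;
};

/* Append a request to this driver's own FIFO. */
static void
dq_enqueue(struct driver_queue *q, struct request *req)
{
	req->next = NULL;
	if (q->tail != NULL)
		q->tail->next = req;
	else
		q->head = req;
	q->tail = req;
}

/*
 * Hand every queued request to the driver. The hint is set only while
 * this driver still has requests pending behind the current one.
 * Returns the number of requests dispatched.
 */
static int
dq_drain(struct driver_queue *q)
{
	struct request *req;
	int n = 0;

	while ((req = q->head) != NULL) {
		q->head = req->next;
		if (q->head == NULL)
			q->tail = NULL;
		(void)q->process(q->softc, req,
		    q->head != NULL ? HINT_MORE : 0);
		n++;
	}
	return (n);
}

/* Example driver that only records the ids and hints it receives. */
struct recorder {
	int ids[8];
	int hints[8];
	int count;
};

static int
record_process(void *softc, struct request *req, int hint)
{
	struct recorder *r = softc;

	if (r->count < 8) {
		r->ids[r->count] = req->id;
		r->hints[r->count] = hint;
		r->count++;
	}
	return (0);
}
```

Since each queue is drained independently, no central dispatcher has to
look across the requests of all drivers to compute the hint.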


