Date:      Mon, 11 Mar 2019 23:16:10 +0000 (UTC)
From:      John Baldwin <jhb@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org
Subject:   svn commit: r345040 - in stable/11: share/man/man4 sys/conf sys/dev/cxgbe sys/dev/cxgbe/crypto sys/dev/cxgbe/firmware sys/modules/cxgbe sys/modules/cxgbe/ccr sys/powerpc/conf
Message-ID:  <201903112316.x2BNGA4b092507@repo.freebsd.org>

Author: jhb
Date: Mon Mar 11 23:16:10 2019
New Revision: 345040
URL: https://svnweb.freebsd.org/changeset/base/345040

Log:
  MFC 318429,318967,319721,319723,323600,323724,328353-328361,330042,343056:
  Add a driver for the Chelsio T6 crypto accelerator engine.
  
  Note that with the set of commits in this batch, no additional tunables
  are needed to use the driver once it is loaded.
  
  318429:
  Add a driver for the Chelsio T6 crypto accelerator engine.
  
  The ccr(4) driver supports use of the crypto accelerator engine on
  Chelsio T6 NICs in "lookaside" mode via the opencrypto framework.
  
  Currently, the driver supports AES-CBC, AES-CTR, AES-GCM, and AES-XTS
  cipher algorithms as well as the SHA1-HMAC, SHA2-256-HMAC, SHA2-384-HMAC,
  and SHA2-512-HMAC authentication algorithms.  The driver also supports
  chaining one of AES-CBC, AES-CTR, or AES-XTS with an authentication
  algorithm for encrypt-then-authenticate operations.
  
  Note that this driver is still under active development and testing and
  may not yet be ready for production use.  It does pass the tests in
  tests/sys/opencrypto with the exception that the AES-GCM implementation
  in the driver does not yet support requests with a zero byte payload.
  
  To use this driver currently, the "uwire" configuration must be used
  along with explicitly enabling support for lookaside crypto capabilities
  in the cxgbe(4) driver.  These can be done by setting the following
  tunables before loading the cxgbe(4) driver:
  
      hw.cxgbe.config_file=uwire
      hw.cxgbe.cryptocaps_allowed=-1
  
  318967:
  Fail large requests with EFBIG.
  
  The adapter firmware in general does not accept PDUs larger than 64k - 1
  bytes in size.  Sending crypto requests larger than this size results in
  hangs or incorrect output, so reject them with EFBIG.  For requests
  chaining an AES cipher with an HMAC, the firmware appears to require
  slightly smaller requests (roughly 512 bytes below that limit).
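  
  As a rough illustration of these size checks, the standalone C sketch
  below mirrors the limits used in this change.  The helper names
  (check_plain_request, check_chained_request, roundup16) are hypothetical
  and are not the driver's functions; the 512 - 16 bytes of spare room and
  the rounded-up hash size follow the comment added to ccr_authenc().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define	MAX_REQUEST_SIZE	65535	/* 64k - 1, as in the driver */

/* Round x up to the next multiple of 16. */
size_t
roundup16(size_t x)
{
	return ((x + 15) & ~(size_t)15);
}

/* Plain cipher or HMAC request: reject anything above 64k - 1 bytes. */
int
check_plain_request(size_t len)
{
	return (len > MAX_REQUEST_SIZE ? EFBIG : 0);
}

/*
 * Cipher chained with an HMAC: leave 512 - 16 bytes of spare room plus
 * the hash size rounded up to 16 bytes.
 */
int
check_chained_request(size_t input_len, size_t hash_size)
{
	assert(hash_size <= 64);	/* largest digest here is SHA2-512 */
	if (input_len + roundup16(hash_size) + (512 - 16) > MAX_REQUEST_SIZE)
		return (EFBIG);
	return (0);
}
```

  With a 20-byte SHA1 digest (rounded up to 32), the largest chained input
  accepted by this sketch is 65535 - 496 - 32 = 65007 bytes.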
  
  319721:
  Add explicit handling for requests with an empty payload.
  
  - For HMAC requests, construct a special input buffer to request an empty
    hash result.
  - For plain cipher requests and requests that chain an AES cipher with an
    HMAC, fail with EINVAL if there is no cipher payload.  If needed in
    the future, chained requests that only contain AAD could be serviced as
    HMAC-only requests.
  - For GCM requests, the hardware does not support generating the tag for
    an AAD-only request.  Instead, complete these requests synchronously
    in software on the assumption that such requests are rare.
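  
  The "special input buffer" for an empty HMAC payload can be sketched in
  standalone C as below.  It follows the shape of the code added to
  ccr_hmac() in this change (a 0x80 byte followed by zeros, with the bit
  length of one block stored big-endian in the last 8 bytes).  The function
  name is hypothetical, and reading this as the final padding block of the
  inner hash is an interpretation rather than something the log states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: fill 'dst' (blocksize bytes) with a padding block
 * for a hash whose only input so far was one full block.
 */
void
build_empty_payload_block(uint8_t *dst, size_t blocksize)
{
	uint64_t bits;
	size_t i;

	assert(blocksize >= 16);
	memset(dst, 0, blocksize);
	dst[0] = 0x80;				/* "1" bit after the message */
	bits = (uint64_t)blocksize << 3;	/* one block, counted in bits */
	/* Store the bit count big-endian in the last 8 bytes. */
	for (i = 0; i < 8; i++)
		dst[blocksize - 1 - i] = (uint8_t)(bits >> (8 * i));
}
```

  For a 64-byte block (SHA1, SHA2-256) the trailing length is 512 bits; for
  a 128-byte block (SHA2-384, SHA2-512) it is 1024 bits.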
  
  319723:
  Fix the software fallback for GCM to validate the existing tag for decrypts.
  
  323600:
  Fix some incorrect sysctl pointers for some error stats.
  
  The bad_session, sglist_error, and process_error sysctl nodes were
  returning the value of the pad_error node instead of the appropriate
  error counters.
  
  323724:
  Enable support for lookaside crypto operations by default.
  
  This permits ccr(4) to be used with the default firmware configuration
  file.
  
  328353:
  Always store the IV in the immediate portion of a work request.
  
  Combined authentication-encryption and GCM requests already stored the
  IV in the immediate explicitly.  This extends this behavior to block
  cipher requests to work around a firmware bug.  While here, simplify
  the AEAD and GCM handlers to not include always-true conditions.
  
  328354:
  Always set the IV location to IV_NOP.
  
  The firmware ignores this field in the FW_CRYPTO_LOOKASIDE_WR work
  request.
  
  328355:
  Reject requests with AAD and IV larger than 511 bytes.
  
  The T6 crypto engine's control messages only support a total AAD
  length (including the prefixed IV) of 511 bytes.  Reject requests with
  large AAD rather than returning incorrect results.
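  
  A minimal sketch of this check in standalone C follows.  The AAD length
  computation mirrors the logic moved earlier in ccr_authenc() (data covered
  by the authentication descriptor that precedes the cipher payload).  The
  function and parameter names are hypothetical stand-ins for the crd_skip
  and crd_len fields of the two descriptors.

```c
#include <assert.h>
#include <errno.h>

#define	MAX_AAD_LEN	511	/* AAD plus prefixed IV, as in the driver */

/* Length of the authenticated-only data that precedes the cipher payload. */
unsigned int
aad_length(unsigned int auth_skip, unsigned int auth_len,
    unsigned int enc_skip)
{
	if (auth_skip >= enc_skip)
		return (0);
	if (auth_skip + auth_len > enc_skip)
		return (enc_skip - auth_skip);
	return (auth_len);
}

/* Reject requests whose AAD plus IV exceeds the engine's limit. */
int
check_aad_iv(unsigned int aad_len, unsigned int iv_len)
{
	assert(iv_len <= MAX_AAD_LEN);
	return (aad_len + iv_len > MAX_AAD_LEN ? EINVAL : 0);
}
```

  With a 16-byte IV, this leaves room for at most 495 bytes of AAD.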
  
  328356:
  Don't discard AAD and IV output data for AEAD requests.
  
  The T6 can hang when processing certain AEAD requests if the request
  sets a flag asking the crypto engine to discard the input IV and AAD
  rather than copying them into the output buffer.  The existing driver
  always discards the IV and AAD as we do not need them.  As a workaround,
  allocate a single "dummy" buffer when the ccr driver attaches and
  change all AEAD requests to write the IV and AAD to this scratch
  buffer.  The contents of the scratch buffer are never used (similar to
  "bogus_page"), and it is ok for multiple in-flight requests to share
  this dummy buffer.
  
  328357:
  Fail crypto requests when the resulting work request is too large.
  
  Most crypto requests will not trigger this condition, but a request
  with a highly-fragmented data buffer (and a resulting "large" S/G
  list) could trigger it.
  
  328358:
  Clamp DSGL entries to a length of 2KB.
  
  This works around an issue in the T6 that can result in DMA engine
  stalls if an error occurs while processing a DSGL entry with a length
  larger than 2KB.
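  
  The effect of the clamp on the number of DSGL entries can be sketched in
  standalone C as below.  DSGL_SGE_MAXLEN matches the new value in this
  change; the two helper names are hypothetical and this is not the driver's
  own S/G list code.

```c
#include <assert.h>
#include <stddef.h>

#define	DSGL_SGE_MAXLEN	2048	/* per-entry limit after this change */

/* Number of DSGL entries needed to describe one contiguous segment. */
size_t
dsgl_entries_for_segment(size_t seg_len)
{
	return ((seg_len + DSGL_SGE_MAXLEN - 1) / DSGL_SGE_MAXLEN);
}

/* Length of the next entry to emit when 'remaining' bytes are left. */
size_t
dsgl_next_entry_len(size_t remaining)
{
	assert(remaining > 0);
	return (remaining > DSGL_SGE_MAXLEN ? DSGL_SGE_MAXLEN : remaining);
}
```

  A single contiguous 4 KB segment that previously fit in one entry now
  takes two entries under this scheme.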
  
  328359:
  Expand the software fallback for GCM to cover more cases.
  
  - Extend ccr_gcm_soft() to handle requests with a non-empty payload.
    While here, switch to allocating the GMAC context instead of placing
    it on the stack since it is over 1KB in size.
  - Allow ccr_gcm() to return a special error value (EMSGSIZE) which
    triggers a fallback to ccr_gcm_soft().  Move the existing empty
    payload check into ccr_gcm() and change a few other cases
    (e.g. large AAD) to fall back to software via EMSGSIZE as well.
  - Add a new 'sw_fallback' stat to count the number of requests
    processed via the software fallback.
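  
  The fallback pattern described in the list above can be sketched in
  standalone C as follows.  All names here (gcm_stats, hw_gcm, soft_gcm,
  process_gcm) are hypothetical stand-ins, and the hardware and software
  paths are reduced to stubs; only the use of EMSGSIZE as an internal
  "retry in software" signal and the sw_fallback counter come from the log.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define	AES_BLOCK_LEN	16
#define	MAX_AAD_LEN	511

struct gcm_stats {
	uint64_t sw_fallback;	/* requests handled by the software path */
};

/* Stub for the hardware path: EMSGSIZE means "cannot do this in hardware". */
int
hw_gcm(size_t payload_len, size_t aad_len)
{
	if (payload_len == 0)
		return (EMSGSIZE);
	if (aad_len + AES_BLOCK_LEN > MAX_AAD_LEN)
		return (EMSGSIZE);
	return (0);
}

/* Stub for the software path; assumed to always succeed in this sketch. */
int
soft_gcm(size_t payload_len, size_t aad_len)
{
	(void)payload_len;
	(void)aad_len;
	return (0);
}

/* Try hardware first and fall back to software only on EMSGSIZE. */
int
process_gcm(struct gcm_stats *st, size_t payload_len, size_t aad_len)
{
	int error;

	assert(st != NULL);
	error = hw_gcm(payload_len, aad_len);
	if (error == EMSGSIZE) {
		st->sw_fallback++;
		error = soft_gcm(payload_len, aad_len);
	}
	return (error);
}
```

  Keeping EMSGSIZE internal means callers see either success or a genuine
  error, never the fallback signal itself.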
  
  328360:
  Don't read or generate an IV until all error checking is complete.
  
  In particular, this avoids edge cases where a generated IV might be
  written into the output buffer even though the request is failed with
  an error.
  
  328361:
  Store IV in output buffer in GCM software fallback when requested.
  
  Properly honor the lack of the CRD_F_IV_PRESENT flag in the GCM
  software fallback case for encryption requests.
  
  330042:
  Don't overflow the ipad[] array when clearing the remainder.
  
  After the auth key is copied into the ipad[] array, any remaining bytes
  are cleared to zero (in case the key is shorter than one block size).
  The full block size was used as the length to zero rather than the
  size of the remaining ipad[].  In practice this overflow was harmless as
  it could only clear bytes in the following opad[] array which is
  initialized with a copy of ipad[] in the next statement.
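  
  The corrected clearing step can be sketched in standalone C as below.
  This is the textbook HMAC pad setup (key zero-padded to one block, then
  XORed with 0x36 and 0x5c), written with a hypothetical function name; it
  is not the driver's code, which goes on to hash the pads into partial
  digests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: derive the HMAC inner and outer pads from a key
 * that is no longer than one hash block.
 */
void
hmac_init_pads(const uint8_t *key, size_t klen, size_t blocksize,
    uint8_t *ipad, uint8_t *opad)
{
	size_t i;

	assert(klen <= blocksize);
	memcpy(ipad, key, klen);
	/* Zero only the remainder of ipad[], not a full block past the key. */
	memset(ipad + klen, 0, blocksize - klen);
	memcpy(opad, ipad, blocksize);
	for (i = 0; i < blocksize; i++) {
		ipad[i] ^= 0x36;
		opad[i] ^= 0x5c;
	}
}
```

  Passing blocksize instead of blocksize - klen to memset() would write klen
  bytes past the end of ipad[], which is the overflow this change removes.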
  
  343056:
  Reject new sessions if the necessary queues aren't initialized.
  
  ccr reuses the control queue and first rx queue from the first port on
  each adapter.  The driver cannot send requests until those queues are
  initialized.  Refuse to create sessions for now if the queues aren't
  ready.  This is a workaround until cxgbe allocates one or more
  dedicated queues for ccr.
  
  Relnotes:	yes
  Sponsored by:	Chelsio Communications

Added:
  stable/11/share/man/man4/ccr.4
     - copied unchanged from r318429, head/share/man/man4/ccr.4
  stable/11/sys/dev/cxgbe/crypto/
     - copied from r318429, head/sys/dev/cxgbe/crypto/
  stable/11/sys/modules/cxgbe/ccr/
     - copied from r318429, head/sys/modules/cxgbe/ccr/
Modified:
  stable/11/share/man/man4/Makefile
  stable/11/share/man/man4/cxgbe.4
  stable/11/sys/conf/NOTES
  stable/11/sys/conf/files
  stable/11/sys/dev/cxgbe/adapter.h
  stable/11/sys/dev/cxgbe/crypto/t4_crypto.c
  stable/11/sys/dev/cxgbe/firmware/t6fw_cfg.txt
  stable/11/sys/dev/cxgbe/t4_main.c
  stable/11/sys/modules/cxgbe/Makefile
  stable/11/sys/powerpc/conf/NOTES
Directory Properties:
  stable/11/   (props changed)

Modified: stable/11/share/man/man4/Makefile
==============================================================================
--- stable/11/share/man/man4/Makefile	Mon Mar 11 22:48:51 2019	(r345039)
+++ stable/11/share/man/man4/Makefile	Mon Mar 11 23:16:10 2019	(r345040)
@@ -101,6 +101,7 @@ MAN=	aac.4 \
 	cc_newreno.4 \
 	cc_vegas.4 \
 	${_ccd.4} \
+	ccr.4 \
 	cd.4 \
 	cdce.4 \
 	cfi.4 \

Copied: stable/11/share/man/man4/ccr.4 (from r318429, head/share/man/man4/ccr.4)
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ stable/11/share/man/man4/ccr.4	Mon Mar 11 23:16:10 2019	(r345040, copy of r318429, head/share/man/man4/ccr.4)
@@ -0,0 +1,110 @@
+.\" Copyright (c) 2017, Chelsio Inc
+.\" All rights reserved.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\"    notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\"    notice, this list of conditions and the following disclaimer in the
+.\"    documentation and/or other materials provided with the distribution.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+.\" SUCH DAMAGE.
+.\"
+.\" $FreeBSD$
+.\"
+.Dd May 16, 2017
+.Dt CCR 4
+.Os
+.Sh NAME
+.Nm ccr
+.Nd "Chelsio T6 crypto accelerator driver"
+.Sh SYNOPSIS
+To compile this driver into the kernel,
+place the following lines in your
+kernel configuration file:
+.Bd -ragged -offset indent
+.Cd "device ccr"
+.Ed
+.Pp
+To load the driver as a
+module at boot time, place the following line in
+.Xr loader.conf 5 :
+.Bd -literal -offset indent
+ccr_load="YES"
+.Ed
+.Sh DESCRIPTION
+The
+.Nm
+driver provides support for the crypto accelerator engine included on
+PCI Express Ethernet adapters based on the Chelsio Terminator 6 ASIC (T6).
+The driver accelerates AES-CBC, AES-CTR, AES-GCM, AES-XTS, SHA1-HMAC,
+SHA2-256-HMAC, SHA2-384-HMAC, and SHA2-512-HMAC operations for
+.Xr crypto 4
+and
+.Xr ipsec 4 .
+The driver also supports chaining one of AES-CBC, AES-CTR, or AES-XTS with
+SHA1-HMAC, SHA2-256-HMAC, SHA2-384-HMAC, or SHA2-512-HMAC for
+encrypt-then-authenticate operations.
+For further hardware information and questions related to hardware
+requirements, see
+.Pa http://www.chelsio.com/ .
+.Pp
+The
+.Nm
+driver attaches as a child of an existing Chelsio NIC device and thus
+requires that the
+.Xr cxgbe 4
+driver be active.
+.Sh HARDWARE
+The
+.Nm
+driver supports the crypto accelerator engine included on adapters
+based on the T6 ASIC:
+.Pp
+.Bl -bullet -compact
+.It
+Chelsio T6225-CR
+.It
+Chelsio T6225-SO-CR
+.It
+Chelsio T62100-LP-CR
+.It
+Chelsio T62100-SO-CR
+.It
+Chelsio T62100-CR
+.El
+.Sh SUPPORT
+For general information and support,
+go to the Chelsio support website at:
+.Pa http://www.chelsio.com/ .
+.Pp
+If an issue is identified with this driver with a supported adapter,
+email all the specific information related to the issue to
+.Aq Mt support@chelsio.com .
+.Sh SEE ALSO
+.Xr crypto 4 ,
+.Xr cxgbe 4 ,
+.Xr ipsec 4
+.Sh HISTORY
+The
+.Nm
+device driver first appeared in
+.Fx 12.0 .
+.Sh AUTHORS
+.An -nosplit
+The
+.Nm
+driver was written by
+.An John Baldwin Aq Mt jhb@FreeBSD.org .

Modified: stable/11/share/man/man4/cxgbe.4
==============================================================================
--- stable/11/share/man/man4/cxgbe.4	Mon Mar 11 22:48:51 2019	(r345039)
+++ stable/11/share/man/man4/cxgbe.4	Mon Mar 11 23:16:10 2019	(r345040)
@@ -363,6 +363,7 @@ email all the specific information related to the issu
 .Sh SEE ALSO
 .Xr altq 4 ,
 .Xr arp 4 ,
+.Xr ccr 4 ,
 .Xr cxgb 4 ,
 .Xr cxgbev 4 ,
 .Xr netintro 4 ,

Modified: stable/11/sys/conf/NOTES
==============================================================================
--- stable/11/sys/conf/NOTES	Mon Mar 11 22:48:51 2019	(r345039)
+++ stable/11/sys/conf/NOTES	Mon Mar 11 23:16:10 2019	(r345040)
@@ -2884,6 +2884,8 @@ device		cryptodev	# /dev/crypto for access to h/w
 
 device		rndtest		# FIPS 140-2 entropy tester
 
+device		ccr		# Chelsio T6
+
 device		hifn		# Hifn 7951, 7781, etc.
 options 	HIFN_DEBUG	# enable debugging support: hw.hifn.debug
 options 	HIFN_RNDTEST	# enable rndtest support

Modified: stable/11/sys/conf/files
==============================================================================
--- stable/11/sys/conf/files	Mon Mar 11 22:48:51 2019	(r345039)
+++ stable/11/sys/conf/files	Mon Mar 11 23:16:10 2019	(r345040)
@@ -1456,6 +1456,8 @@ t6fw.fw			optional cxgbe					\
 	compile-with	"${NORMAL_FW}"					\
 	no-obj no-implicit-rule						\
 	clean		"t6fw.fw"
+dev/cxgbe/crypto/t4_crypto.c	optional ccr \
+	compile-with "${NORMAL_C} -I$S/dev/cxgbe"
 dev/cy/cy.c			optional cy
 dev/cy/cy_isa.c			optional cy isa
 dev/cy/cy_pci.c			optional cy pci

Modified: stable/11/sys/dev/cxgbe/adapter.h
==============================================================================
--- stable/11/sys/dev/cxgbe/adapter.h	Mon Mar 11 22:48:51 2019	(r345039)
+++ stable/11/sys/dev/cxgbe/adapter.h	Mon Mar 11 23:16:10 2019	(r345040)
@@ -802,6 +802,7 @@ struct adapter {
 	struct iw_tunables iwt;
 	void *iwarp_softc;	/* (struct c4iw_dev *) */
 	void *iscsi_ulp_softc;	/* (struct cxgbei_data *) */
+	void *ccr_softc;	/* (struct ccr_softc *) */
 	struct l2t_data *l2t;	/* L2 table */
 	struct tid_info tids;
 

Modified: stable/11/sys/dev/cxgbe/crypto/t4_crypto.c
==============================================================================
--- head/sys/dev/cxgbe/crypto/t4_crypto.c	Wed May 17 22:13:07 2017	(r318429)
+++ stable/11/sys/dev/cxgbe/crypto/t4_crypto.c	Mon Mar 11 23:16:10 2019	(r345040)
@@ -111,12 +111,26 @@ __FBSDID("$FreeBSD$");
  */
 
 /*
- * The documentation for CPL_RX_PHYS_DSGL claims a maximum of 32
- * SG entries.
+ * The crypto engine supports a maximum AAD size of 511 bytes.
  */
+#define	MAX_AAD_LEN		511
+
+/*
+ * The documentation for CPL_RX_PHYS_DSGL claims a maximum of 32 SG
+ * entries.  While the CPL includes a 16-bit length field, the T6 can
+ * sometimes hang if an error occurs while processing a request with a
+ * single DSGL entry larger than 2k.
+ */
 #define	MAX_RX_PHYS_DSGL_SGE	32
-#define	DSGL_SGE_MAXLEN		65535
+#define	DSGL_SGE_MAXLEN		2048
 
+/*
+ * The adapter only supports requests with a total input or output
+ * length of 64k-1 or smaller.  Longer requests either result in hung
+ * requests or incorrect results.
+ */
+#define	MAX_REQUEST_SIZE	65535
+
 static MALLOC_DEFINE(M_CCR, "ccr", "Chelsio T6 crypto");
 
 struct ccr_session_hmac {
@@ -178,6 +192,13 @@ struct ccr_softc {
 	struct sglist *sg_ulptx;
 	struct sglist *sg_dsgl;
 
+	/*
+	 * Pre-allocate a dummy output buffer for the IV and AAD for
+	 * AEAD requests.
+	 */
+	char *iv_aad_buf;
+	struct sglist *sg_iv_aad;
+
 	/* Statistics. */
 	uint64_t stats_blkcipher_encrypt;
 	uint64_t stats_blkcipher_decrypt;
@@ -193,6 +214,7 @@ struct ccr_softc {
 	uint64_t stats_bad_session;
 	uint64_t stats_sglist_error;
 	uint64_t stats_process_error;
+	uint64_t stats_sw_fallback;
 };
 
 /*
@@ -358,7 +380,7 @@ ccr_use_imm_data(u_int transhdr_len, u_int input_len)
 static void
 ccr_populate_wreq(struct ccr_softc *sc, struct chcr_wr *crwr, u_int kctx_len,
     u_int wr_len, uint32_t sid, u_int imm_len, u_int sgl_len, u_int hash_size,
-    u_int iv_loc, struct cryptop *crp)
+    struct cryptop *crp)
 {
 	u_int cctx_size;
 
@@ -376,7 +398,7 @@ ccr_populate_wreq(struct ccr_softc *sc, struct chcr_wr
 	    V_FW_CRYPTO_LOOKASIDE_WR_RX_CHID(sc->tx_channel_id) |
 	    V_FW_CRYPTO_LOOKASIDE_WR_LCB(0) |
 	    V_FW_CRYPTO_LOOKASIDE_WR_PHASH(0) |
-	    V_FW_CRYPTO_LOOKASIDE_WR_IV(iv_loc) |
+	    V_FW_CRYPTO_LOOKASIDE_WR_IV(IV_NOP) |
 	    V_FW_CRYPTO_LOOKASIDE_WR_FQIDX(0) |
 	    V_FW_CRYPTO_LOOKASIDE_WR_TX_CH(0) |
 	    V_FW_CRYPTO_LOOKASIDE_WR_RX_Q_ID(sc->rxq->iq.abs_id));
@@ -412,6 +434,12 @@ ccr_hmac(struct ccr_softc *sc, uint32_t sid, struct cc
 	u_int imm_len, iopad_size;
 	int error, sgl_nsegs, sgl_len;
 
+	crd = crp->crp_desc;
+
+	/* Reject requests with too large of an input buffer. */
+	if (crd->crd_len > MAX_REQUEST_SIZE)
+		return (EFBIG);
+
 	axf = s->hmac.auth_hash;
 
 	/* PADs must be 128-bit aligned. */
@@ -425,8 +453,11 @@ ccr_hmac(struct ccr_softc *sc, uint32_t sid, struct cc
 	hash_size_in_response = axf->hashsize;
 	transhdr_len = HASH_TRANSHDR_SIZE(kctx_len);
 
-	crd = crp->crp_desc;
-	if (ccr_use_imm_data(transhdr_len, crd->crd_len)) {
+	if (crd->crd_len == 0) {
+		imm_len = axf->blocksize;
+		sgl_nsegs = 0;
+		sgl_len = 0;
+	} else if (ccr_use_imm_data(transhdr_len, crd->crd_len)) {
 		imm_len = crd->crd_len;
 		sgl_nsegs = 0;
 		sgl_len = 0;
@@ -442,6 +473,8 @@ ccr_hmac(struct ccr_softc *sc, uint32_t sid, struct cc
 	}
 
 	wr_len = roundup2(transhdr_len, 16) + roundup2(imm_len, 16) + sgl_len;
+	if (wr_len > SGE_MAX_WR_LEN)
+		return (EFBIG);
 	wr = alloc_wrqe(wr_len, sc->txq);
 	if (wr == NULL) {
 		sc->stats_wr_nomem++;
@@ -451,7 +484,7 @@ ccr_hmac(struct ccr_softc *sc, uint32_t sid, struct cc
 	memset(crwr, 0, wr_len);
 
 	ccr_populate_wreq(sc, crwr, kctx_len, wr_len, sid, imm_len, sgl_len,
-	    hash_size_in_response, IV_NOP, crp);
+	    hash_size_in_response, crp);
 
 	/* XXX: Hardcodes SGE loopback channel of 0. */
 	crwr->sec_cpl.op_ivinsrtofst = htobe32(
@@ -461,7 +494,8 @@ ccr_hmac(struct ccr_softc *sc, uint32_t sid, struct cc
 	    V_CPL_TX_SEC_PDU_CPLLEN(2) | V_CPL_TX_SEC_PDU_PLACEHOLDER(0) |
 	    V_CPL_TX_SEC_PDU_IVINSRTOFST(0));
 
-	crwr->sec_cpl.pldlen = htobe32(crd->crd_len);
+	crwr->sec_cpl.pldlen = htobe32(crd->crd_len == 0 ? axf->blocksize :
+	    crd->crd_len);
 
 	crwr->sec_cpl.cipherstop_lo_authinsert = htobe32(
 	    V_CPL_TX_SEC_PDU_AUTHSTART(1) | V_CPL_TX_SEC_PDU_AUTHSTOP(0));
@@ -474,7 +508,8 @@ ccr_hmac(struct ccr_softc *sc, uint32_t sid, struct cc
 	    V_SCMD_AUTH_MODE(s->hmac.auth_mode) |
 	    V_SCMD_HMAC_CTRL(CHCR_SCMD_HMAC_CTRL_NO_TRUNC));
 	crwr->sec_cpl.ivgen_hdrlen = htobe32(
-	    V_SCMD_LAST_FRAG(0) | V_SCMD_MORE_FRAGS(0) | V_SCMD_MAC_ONLY(1));
+	    V_SCMD_LAST_FRAG(0) |
+	    V_SCMD_MORE_FRAGS(crd->crd_len == 0 ? 1 : 0) | V_SCMD_MAC_ONLY(1));
 
 	memcpy(crwr->key_ctx.key, s->hmac.ipad, s->hmac.partial_digest_len);
 	memcpy(crwr->key_ctx.key + iopad_size, s->hmac.opad,
@@ -488,7 +523,11 @@ ccr_hmac(struct ccr_softc *sc, uint32_t sid, struct cc
 	    V_KEY_CONTEXT_MK_SIZE(s->hmac.mk_size) | V_KEY_CONTEXT_VALID(1));
 
 	dst = (char *)(crwr + 1) + kctx_len + DUMMY_BYTES;
-	if (imm_len != 0)
+	if (crd->crd_len == 0) {
+		dst[0] = 0x80;
+		*(uint64_t *)(dst + axf->blocksize - sizeof(uint64_t)) =
+		    htobe64(axf->blocksize << 3);
+	} else if (imm_len != 0)
 		crypto_copydata(crp->crp_flags, crp->crp_buf, crd->crd_skip,
 		    crd->crd_len, dst);
 	else
@@ -524,7 +563,7 @@ ccr_blkcipher(struct ccr_softc *sc, uint32_t sid, stru
 	struct wrqe *wr;
 	struct cryptodesc *crd;
 	char *dst;
-	u_int iv_loc, kctx_len, key_half, op_type, transhdr_len, wr_len;
+	u_int kctx_len, key_half, op_type, transhdr_len, wr_len;
 	u_int imm_len;
 	int dsgl_nsegs, dsgl_len;
 	int sgl_nsegs, sgl_len;
@@ -532,32 +571,21 @@ ccr_blkcipher(struct ccr_softc *sc, uint32_t sid, stru
 
 	crd = crp->crp_desc;
 
-	if (s->blkcipher.key_len == 0)
+	if (s->blkcipher.key_len == 0 || crd->crd_len == 0)
 		return (EINVAL);
 	if (crd->crd_alg == CRYPTO_AES_CBC &&
 	    (crd->crd_len % AES_BLOCK_LEN) != 0)
 		return (EINVAL);
 
-	iv_loc = IV_NOP;
-	if (crd->crd_flags & CRD_F_ENCRYPT) {
+	/* Reject requests with too large of an input buffer. */
+	if (crd->crd_len > MAX_REQUEST_SIZE)
+		return (EFBIG);
+
+	if (crd->crd_flags & CRD_F_ENCRYPT)
 		op_type = CHCR_ENCRYPT_OP;
-		if (crd->crd_flags & CRD_F_IV_EXPLICIT)
-			memcpy(iv, crd->crd_iv, s->blkcipher.iv_len);
-		else
-			arc4rand(iv, s->blkcipher.iv_len, 0);
-		iv_loc = IV_IMMEDIATE;
-		if ((crd->crd_flags & CRD_F_IV_PRESENT) == 0)
-			crypto_copyback(crp->crp_flags, crp->crp_buf,
-			    crd->crd_inject, s->blkcipher.iv_len, iv);
-	} else {
+	else
 		op_type = CHCR_DECRYPT_OP;
-		if (crd->crd_flags & CRD_F_IV_EXPLICIT) {
-			memcpy(iv, crd->crd_iv, s->blkcipher.iv_len);
-			iv_loc = IV_IMMEDIATE;
-		} else
-			iv_loc = IV_DSGL;
-	}
-
+	
 	sglist_reset(sc->sg_dsgl);
 	error = sglist_append_sglist(sc->sg_dsgl, sc->sg_crp, crd->crd_skip,
 	    crd->crd_len);
@@ -575,22 +603,11 @@ ccr_blkcipher(struct ccr_softc *sc, uint32_t sid, stru
 	if (ccr_use_imm_data(transhdr_len, crd->crd_len +
 	    s->blkcipher.iv_len)) {
 		imm_len = crd->crd_len;
-		if (iv_loc == IV_DSGL) {
-			crypto_copydata(crp->crp_flags, crp->crp_buf,
-			    crd->crd_inject, s->blkcipher.iv_len, iv);
-			iv_loc = IV_IMMEDIATE;
-		}
 		sgl_nsegs = 0;
 		sgl_len = 0;
 	} else {
 		imm_len = 0;
 		sglist_reset(sc->sg_ulptx);
-		if (iv_loc == IV_DSGL) {
-			error = sglist_append_sglist(sc->sg_ulptx, sc->sg_crp,
-			    crd->crd_inject, s->blkcipher.iv_len);
-			if (error)
-				return (error);
-		}
 		error = sglist_append_sglist(sc->sg_ulptx, sc->sg_crp,
 		    crd->crd_skip, crd->crd_len);
 		if (error)
@@ -599,9 +616,10 @@ ccr_blkcipher(struct ccr_softc *sc, uint32_t sid, stru
 		sgl_len = ccr_ulptx_sgl_len(sgl_nsegs);
 	}
 
-	wr_len = roundup2(transhdr_len, 16) + roundup2(imm_len, 16) + sgl_len;
-	if (iv_loc == IV_IMMEDIATE)
-		wr_len += s->blkcipher.iv_len;
+	wr_len = roundup2(transhdr_len, 16) + s->blkcipher.iv_len +
+	    roundup2(imm_len, 16) + sgl_len;
+	if (wr_len > SGE_MAX_WR_LEN)
+		return (EFBIG);
 	wr = alloc_wrqe(wr_len, sc->txq);
 	if (wr == NULL) {
 		sc->stats_wr_nomem++;
@@ -610,8 +628,29 @@ ccr_blkcipher(struct ccr_softc *sc, uint32_t sid, stru
 	crwr = wrtod(wr);
 	memset(crwr, 0, wr_len);
 
+	/*
+	 * Read the existing IV from the request or generate a random
+	 * one if none is provided.  Optionally copy the generated IV
+	 * into the output buffer if requested.
+	 */
+	if (op_type == CHCR_ENCRYPT_OP) {
+		if (crd->crd_flags & CRD_F_IV_EXPLICIT)
+			memcpy(iv, crd->crd_iv, s->blkcipher.iv_len);
+		else
+			arc4rand(iv, s->blkcipher.iv_len, 0);
+		if ((crd->crd_flags & CRD_F_IV_PRESENT) == 0)
+			crypto_copyback(crp->crp_flags, crp->crp_buf,
+			    crd->crd_inject, s->blkcipher.iv_len, iv);
+	} else {
+		if (crd->crd_flags & CRD_F_IV_EXPLICIT)
+			memcpy(iv, crd->crd_iv, s->blkcipher.iv_len);
+		else
+			crypto_copydata(crp->crp_flags, crp->crp_buf,
+			    crd->crd_inject, s->blkcipher.iv_len, iv);
+	}
+
 	ccr_populate_wreq(sc, crwr, kctx_len, wr_len, sid, imm_len, sgl_len, 0,
-	    iv_loc, crp);
+	    crp);
 
 	/* XXX: Hardcodes SGE loopback channel of 0. */
 	crwr->sec_cpl.op_ivinsrtofst = htobe32(
@@ -674,10 +713,8 @@ ccr_blkcipher(struct ccr_softc *sc, uint32_t sid, stru
 	dst = (char *)(crwr + 1) + kctx_len;
 	ccr_write_phys_dsgl(sc, dst, dsgl_nsegs);
 	dst += sizeof(struct cpl_rx_phys_dsgl) + dsgl_len;
-	if (iv_loc == IV_IMMEDIATE) {
-		memcpy(dst, iv, s->blkcipher.iv_len);
-		dst += s->blkcipher.iv_len;
-	}
+	memcpy(dst, iv, s->blkcipher.iv_len);
+	dst += s->blkcipher.iv_len;
 	if (imm_len != 0)
 		crypto_copydata(crp->crp_flags, crp->crp_buf, crd->crd_skip,
 		    crd->crd_len, dst);
@@ -729,7 +766,7 @@ ccr_authenc(struct ccr_softc *sc, uint32_t sid, struct
 	struct wrqe *wr;
 	struct auth_hash *axf;
 	char *dst;
-	u_int iv_loc, kctx_len, key_half, op_type, transhdr_len, wr_len;
+	u_int kctx_len, key_half, op_type, transhdr_len, wr_len;
 	u_int hash_size_in_response, imm_len, iopad_size;
 	u_int aad_start, aad_len, aad_stop;
 	u_int auth_start, auth_stop, auth_insert;
@@ -739,53 +776,65 @@ ccr_authenc(struct ccr_softc *sc, uint32_t sid, struct
 	int sgl_nsegs, sgl_len;
 	int error;
 
-	if (s->blkcipher.key_len == 0)
+	/*
+	 * If there is a need in the future, requests with an empty
+	 * payload could be supported as HMAC-only requests.
+	 */
+	if (s->blkcipher.key_len == 0 || crde->crd_len == 0)
 		return (EINVAL);
 	if (crde->crd_alg == CRYPTO_AES_CBC &&
 	    (crde->crd_len % AES_BLOCK_LEN) != 0)
 		return (EINVAL);
 
 	/*
-	 * AAD is only permitted before the cipher/plain text, not
-	 * after.
+	 * Compute the length of the AAD (data covered by the
+	 * authentication descriptor but not the encryption
+	 * descriptor).  To simplify the logic, AAD is only permitted
+	 * before the cipher/plain text, not after.  This is true of
+	 * all currently-generated requests.
 	 */
 	if (crda->crd_len + crda->crd_skip > crde->crd_len + crde->crd_skip)
 		return (EINVAL);
+	if (crda->crd_skip < crde->crd_skip) {
+		if (crda->crd_skip + crda->crd_len > crde->crd_skip)
+			aad_len = (crde->crd_skip - crda->crd_skip);
+		else
+			aad_len = crda->crd_len;
+	} else
+		aad_len = 0;
+	if (aad_len + s->blkcipher.iv_len > MAX_AAD_LEN)
+		return (EINVAL);
 
 	axf = s->hmac.auth_hash;
 	hash_size_in_response = s->hmac.hash_len;
-
-	/*
-	 * The IV is always stored at the start of the buffer even
-	 * though it may be duplicated in the payload.  The crypto
-	 * engine doesn't work properly if the IV offset points inside
-	 * of the AAD region, so a second copy is always required.
-	 */
-	iv_loc = IV_IMMEDIATE;
-	if (crde->crd_flags & CRD_F_ENCRYPT) {
+	if (crde->crd_flags & CRD_F_ENCRYPT)
 		op_type = CHCR_ENCRYPT_OP;
-		if (crde->crd_flags & CRD_F_IV_EXPLICIT)
-			memcpy(iv, crde->crd_iv, s->blkcipher.iv_len);
-		else
-			arc4rand(iv, s->blkcipher.iv_len, 0);
-		if ((crde->crd_flags & CRD_F_IV_PRESENT) == 0)
-			crypto_copyback(crp->crp_flags, crp->crp_buf,
-			    crde->crd_inject, s->blkcipher.iv_len, iv);
-	} else {
+	else
 		op_type = CHCR_DECRYPT_OP;
-		if (crde->crd_flags & CRD_F_IV_EXPLICIT)
-			memcpy(iv, crde->crd_iv, s->blkcipher.iv_len);
-		else
-			crypto_copydata(crp->crp_flags, crp->crp_buf,
-			    crde->crd_inject, s->blkcipher.iv_len, iv);
-	}
 
 	/*
 	 * The output buffer consists of the cipher text followed by
 	 * the hash when encrypting.  For decryption it only contains
 	 * the plain text.
+	 *
+	 * Due to a firmware bug, the output buffer must include a
+	 * dummy output buffer for the IV and AAD prior to the real
+	 * output buffer.
 	 */
+	if (op_type == CHCR_ENCRYPT_OP) {
+		if (s->blkcipher.iv_len + aad_len + crde->crd_len +
+		    hash_size_in_response > MAX_REQUEST_SIZE)
+			return (EFBIG);
+	} else {
+		if (s->blkcipher.iv_len + aad_len + crde->crd_len >
+		    MAX_REQUEST_SIZE)
+			return (EFBIG);
+	}
 	sglist_reset(sc->sg_dsgl);
+	error = sglist_append_sglist(sc->sg_dsgl, sc->sg_iv_aad, 0,
+	    s->blkcipher.iv_len + aad_len);
+	if (error)
+		return (error);
 	error = sglist_append_sglist(sc->sg_dsgl, sc->sg_crp, crde->crd_skip,
 	    crde->crd_len);
 	if (error)
@@ -815,15 +864,25 @@ ccr_authenc(struct ccr_softc *sc, uint32_t sid, struct
 	 * The input buffer consists of the IV, any AAD, and then the
 	 * cipher/plain text.  For decryption requests the hash is
 	 * appended after the cipher text.
+	 *
+	 * The IV is always stored at the start of the input buffer
+	 * even though it may be duplicated in the payload.  The
+	 * crypto engine doesn't work properly if the IV offset points
+	 * inside of the AAD region, so a second copy is always
+	 * required.
 	 */
-	if (crda->crd_skip < crde->crd_skip) {
-		if (crda->crd_skip + crda->crd_len > crde->crd_skip)
-			aad_len = (crde->crd_skip - crda->crd_skip);
-		else
-			aad_len = crda->crd_len;
-	} else
-		aad_len = 0;
 	input_len = aad_len + crde->crd_len;
+
+	/*
+	 * The firmware hangs if sent a request which is a
+	 * bit smaller than MAX_REQUEST_SIZE.  In particular, the
+	 * firmware appears to require 512 - 16 bytes of spare room
+	 * along with the size of the hash even if the hash isn't
+	 * included in the input buffer.
+	 */
+	if (input_len + roundup2(axf->hashsize, 16) + (512 - 16) >
+	    MAX_REQUEST_SIZE)
+		return (EFBIG);
 	if (op_type == CHCR_DECRYPT_OP)
 		input_len += hash_size_in_response;
 	if (ccr_use_imm_data(transhdr_len, s->blkcipher.iv_len + input_len)) {
@@ -887,9 +946,10 @@ ccr_authenc(struct ccr_softc *sc, uint32_t sid, struct
 	else
 		auth_insert = 0;
 
-	wr_len = roundup2(transhdr_len, 16) + roundup2(imm_len, 16) + sgl_len;
-	if (iv_loc == IV_IMMEDIATE)
-		wr_len += s->blkcipher.iv_len;
+	wr_len = roundup2(transhdr_len, 16) + s->blkcipher.iv_len +
+	    roundup2(imm_len, 16) + sgl_len;
+	if (wr_len > SGE_MAX_WR_LEN)
+		return (EFBIG);
 	wr = alloc_wrqe(wr_len, sc->txq);
 	if (wr == NULL) {
 		sc->stats_wr_nomem++;
@@ -898,9 +958,29 @@ ccr_authenc(struct ccr_softc *sc, uint32_t sid, struct
 	crwr = wrtod(wr);
 	memset(crwr, 0, wr_len);
 
+	/*
+	 * Read the existing IV from the request or generate a random
+	 * one if none is provided.  Optionally copy the generated IV
+	 * into the output buffer if requested.
+	 */
+	if (op_type == CHCR_ENCRYPT_OP) {
+		if (crde->crd_flags & CRD_F_IV_EXPLICIT)
+			memcpy(iv, crde->crd_iv, s->blkcipher.iv_len);
+		else
+			arc4rand(iv, s->blkcipher.iv_len, 0);
+		if ((crde->crd_flags & CRD_F_IV_PRESENT) == 0)
+			crypto_copyback(crp->crp_flags, crp->crp_buf,
+			    crde->crd_inject, s->blkcipher.iv_len, iv);
+	} else {
+		if (crde->crd_flags & CRD_F_IV_EXPLICIT)
+			memcpy(iv, crde->crd_iv, s->blkcipher.iv_len);
+		else
+			crypto_copydata(crp->crp_flags, crp->crp_buf,
+			    crde->crd_inject, s->blkcipher.iv_len, iv);
+	}
+
 	ccr_populate_wreq(sc, crwr, kctx_len, wr_len, sid, imm_len, sgl_len,
-	    op_type == CHCR_DECRYPT_OP ? hash_size_in_response : 0, iv_loc,
-	    crp);
+	    op_type == CHCR_DECRYPT_OP ? hash_size_in_response : 0, crp);
 
 	/* XXX: Hardcodes SGE loopback channel of 0. */
 	crwr->sec_cpl.op_ivinsrtofst = htobe32(
@@ -938,7 +1018,7 @@ ccr_authenc(struct ccr_softc *sc, uint32_t sid, struct
 	crwr->sec_cpl.ivgen_hdrlen = htobe32(
 	    V_SCMD_IV_GEN_CTRL(0) |
 	    V_SCMD_MORE_FRAGS(0) | V_SCMD_LAST_FRAG(0) | V_SCMD_MAC_ONLY(0) |
-	    V_SCMD_AADIVDROP(1) | V_SCMD_HDR_LEN(dsgl_len));
+	    V_SCMD_AADIVDROP(0) | V_SCMD_HDR_LEN(dsgl_len));
 
 	crwr->key_ctx.ctx_hdr = s->blkcipher.key_ctx_hdr;
 	switch (crde->crd_alg) {
@@ -974,10 +1054,8 @@ ccr_authenc(struct ccr_softc *sc, uint32_t sid, struct
 	dst = (char *)(crwr + 1) + kctx_len;
 	ccr_write_phys_dsgl(sc, dst, dsgl_nsegs);
 	dst += sizeof(struct cpl_rx_phys_dsgl) + dsgl_len;
-	if (iv_loc == IV_IMMEDIATE) {
-		memcpy(dst, iv, s->blkcipher.iv_len);
-		dst += s->blkcipher.iv_len;
-	}
+	memcpy(dst, iv, s->blkcipher.iv_len);
+	dst += s->blkcipher.iv_len;
 	if (imm_len != 0) {
 		if (aad_len != 0) {
 			crypto_copydata(crp->crp_flags, crp->crp_buf,
@@ -1037,7 +1115,7 @@ ccr_gcm(struct ccr_softc *sc, uint32_t sid, struct ccr
 	struct chcr_wr *crwr;
 	struct wrqe *wr;
 	char *dst;
-	u_int iv_len, iv_loc, kctx_len, op_type, transhdr_len, wr_len;
+	u_int iv_len, kctx_len, op_type, transhdr_len, wr_len;
 	u_int hash_size_in_response, imm_len;
 	u_int aad_start, aad_stop, cipher_start, cipher_stop, auth_insert;
 	u_int hmac_ctrl, input_len;
@@ -1049,63 +1127,71 @@ ccr_gcm(struct ccr_softc *sc, uint32_t sid, struct ccr
 		return (EINVAL);
 
 	/*
+	 * The crypto engine doesn't handle GCM requests with an empty
+	 * payload, so handle those in software instead.
+	 */
+	if (crde->crd_len == 0)
+		return (EMSGSIZE);
+
+	/*
 	 * AAD is only permitted before the cipher/plain text, not
 	 * after.
 	 */
 	if (crda->crd_len + crda->crd_skip > crde->crd_len + crde->crd_skip)
-		return (EINVAL);
+		return (EMSGSIZE);
 
-	hash_size_in_response = s->gmac.hash_len;
+	if (crda->crd_len + AES_BLOCK_LEN > MAX_AAD_LEN)
+		return (EMSGSIZE);
 
-	/*
-	 * The IV is always stored at the start of the buffer even
-	 * though it may be duplicated in the payload.  The crypto
-	 * engine doesn't work properly if the IV offset points inside
-	 * of the AAD region, so a second copy is always required.
-	 *
-	 * The IV for GCM is further complicated in that IPSec
-	 * provides a full 16-byte IV (including the counter), whereas
-	 * the /dev/crypto interface sometimes provides a full 16-byte
-	 * IV (if no IV is provided in the ioctl) and sometimes a
-	 * 12-byte IV (if the IV was explicit).  For now the driver
-	 * always assumes a 12-byte IV and initializes the low 4 byte
-	 * counter to 1.
-	 */
-	iv_loc = IV_IMMEDIATE;
-	if (crde->crd_flags & CRD_F_ENCRYPT) {
+	hash_size_in_response = s->gmac.hash_len;
+	if (crde->crd_flags & CRD_F_ENCRYPT)
 		op_type = CHCR_ENCRYPT_OP;
-		if (crde->crd_flags & CRD_F_IV_EXPLICIT)
-			memcpy(iv, crde->crd_iv, s->blkcipher.iv_len);
-		else
-			arc4rand(iv, s->blkcipher.iv_len, 0);
-		if ((crde->crd_flags & CRD_F_IV_PRESENT) == 0)
-			crypto_copyback(crp->crp_flags, crp->crp_buf,
-			    crde->crd_inject, s->blkcipher.iv_len, iv);
-	} else {
+	else
 		op_type = CHCR_DECRYPT_OP;
-		if (crde->crd_flags & CRD_F_IV_EXPLICIT)
-			memcpy(iv, crde->crd_iv, s->blkcipher.iv_len);
-		else
-			crypto_copydata(crp->crp_flags, crp->crp_buf,
-			    crde->crd_inject, s->blkcipher.iv_len, iv);
-	}
 
 	/*
-	 * If the input IV is 12 bytes, append an explicit counter of
-	 * 1.
+	 * The IV handling for GCM in OCF is a bit more complicated in
+	 * that IPSec provides a full 16-byte IV (including the
+	 * counter), whereas the /dev/crypto interface sometimes
+	 * provides a full 16-byte IV (if no IV is provided in the
+	 * ioctl) and sometimes a 12-byte IV (if the IV was explicit).
+	 *
+	 * When provided a 12-byte IV, assume the IV is really 16 bytes
+	 * with a counter in the last 4 bytes initialized to 1.
+	 *
+	 * While iv_len is checked below, the value is currently
+	 * always set to 12 when creating a GCM session in this driver
+	 * due to limitations in OCF (there is no way to know what the
+	 * IV length of a given request will be).  This means that the
+	 * driver always assumes a 12-byte IV for now.
 	 */
-	if (s->blkcipher.iv_len == 12) {
-		*(uint32_t *)&iv[12] = htobe32(1);
+	if (s->blkcipher.iv_len == 12)
 		iv_len = AES_BLOCK_LEN;
-	} else
+	else
 		iv_len = s->blkcipher.iv_len;
 
 	/*
 	 * The output buffer consists of the cipher text followed by
 	 * the tag when encrypting.  For decryption it only contains
 	 * the plain text.
+	 *
+	 * Due to a firmware bug, the output buffer must include a
+	 * dummy output buffer for the IV and AAD prior to the real
+	 * output buffer.
 	 */
+	if (op_type == CHCR_ENCRYPT_OP) {
+		if (iv_len + crda->crd_len + crde->crd_len +
+		    hash_size_in_response > MAX_REQUEST_SIZE)
+			return (EFBIG);
+	} else {
+		if (iv_len + crda->crd_len + crde->crd_len > MAX_REQUEST_SIZE)
+			return (EFBIG);
+	}
 	sglist_reset(sc->sg_dsgl);
+	error = sglist_append_sglist(sc->sg_dsgl, sc->sg_iv_aad, 0, iv_len +
+	    crda->crd_len);
+	if (error)
+		return (error);
 	error = sglist_append_sglist(sc->sg_dsgl, sc->sg_crp, crde->crd_skip,
 	    crde->crd_len);
 	if (error)
@@ -1132,10 +1218,18 @@ ccr_gcm(struct ccr_softc *sc, uint32_t sid, struct ccr
 	 * The input buffer consists of the IV, any AAD, and then the
 	 * cipher/plain text.  For decryption requests the hash is
 	 * appended after the cipher text.
+	 *
+	 * The IV is always stored at the start of the input buffer
+	 * even though it may be duplicated in the payload.  The
+	 * crypto engine doesn't work properly if the IV offset points
+	 * inside of the AAD region, so a second copy is always
+	 * required.
 	 */
 	input_len = crda->crd_len + crde->crd_len;
 	if (op_type == CHCR_DECRYPT_OP)
 		input_len += hash_size_in_response;
+	if (input_len > MAX_REQUEST_SIZE)
+		return (EFBIG);
 	if (ccr_use_imm_data(transhdr_len, iv_len + input_len)) {
 		imm_len = input_len;
 		sgl_nsegs = 0;
@@ -1180,9 +1274,10 @@ ccr_gcm(struct ccr_softc *sc, uint32_t sid, struct ccr
 	else
 		auth_insert = 0;
 
-	wr_len = roundup2(transhdr_len, 16) + roundup2(imm_len, 16) + sgl_len;
-	if (iv_loc == IV_IMMEDIATE)
-		wr_len += iv_len;
+	wr_len = roundup2(transhdr_len, 16) + iv_len + roundup2(imm_len, 16) +
+	    sgl_len;
+	if (wr_len > SGE_MAX_WR_LEN)
+		return (EFBIG);
 	wr = alloc_wrqe(wr_len, sc->txq);
 	if (wr == NULL) {
 		sc->stats_wr_nomem++;
@@ -1191,8 +1286,34 @@ ccr_gcm(struct ccr_softc *sc, uint32_t sid, struct ccr
 	crwr = wrtod(wr);
 	memset(crwr, 0, wr_len);
 
+	/*
+	 * Read the existing IV from the request or generate a random
+	 * one if none is provided.  Optionally copy the generated IV
+	 * into the output buffer if requested.
+	 *
+	 * If the input IV is 12 bytes, append an explicit 4-byte
+	 * counter of 1.
+	 */
+	if (op_type == CHCR_ENCRYPT_OP) {
+		if (crde->crd_flags & CRD_F_IV_EXPLICIT)
+			memcpy(iv, crde->crd_iv, s->blkcipher.iv_len);
+		else
+			arc4rand(iv, s->blkcipher.iv_len, 0);
+		if ((crde->crd_flags & CRD_F_IV_PRESENT) == 0)
+			crypto_copyback(crp->crp_flags, crp->crp_buf,
+			    crde->crd_inject, s->blkcipher.iv_len, iv);
+	} else {
+		if (crde->crd_flags & CRD_F_IV_EXPLICIT)
+			memcpy(iv, crde->crd_iv, s->blkcipher.iv_len);
+		else
+			crypto_copydata(crp->crp_flags, crp->crp_buf,
+			    crde->crd_inject, s->blkcipher.iv_len, iv);
+	}
+	if (s->blkcipher.iv_len == 12)
+		*(uint32_t *)&iv[12] = htobe32(1);
+
 	ccr_populate_wreq(sc, crwr, kctx_len, wr_len, sid, imm_len, sgl_len,
-	    0, iv_loc, crp);
+	    0, crp);
 
 	/* XXX: Hardcodes SGE loopback channel of 0. */
 	crwr->sec_cpl.op_ivinsrtofst = htobe32(
@@ -1240,7 +1361,7 @@ ccr_gcm(struct ccr_softc *sc, uint32_t sid, struct ccr
 	crwr->sec_cpl.ivgen_hdrlen = htobe32(
 	    V_SCMD_IV_GEN_CTRL(0) |
 	    V_SCMD_MORE_FRAGS(0) | V_SCMD_LAST_FRAG(0) | V_SCMD_MAC_ONLY(0) |
-	    V_SCMD_AADIVDROP(1) | V_SCMD_HDR_LEN(dsgl_len));
+	    V_SCMD_AADIVDROP(0) | V_SCMD_HDR_LEN(dsgl_len));
 
 	crwr->key_ctx.ctx_hdr = s->blkcipher.key_ctx_hdr;
 	memcpy(crwr->key_ctx.key, s->blkcipher.enckey, s->blkcipher.key_len);
@@ -1250,10 +1371,8 @@ ccr_gcm(struct ccr_softc *sc, uint32_t sid, struct ccr
 	dst = (char *)(crwr + 1) + kctx_len;
 	ccr_write_phys_dsgl(sc, dst, dsgl_nsegs);
 	dst += sizeof(struct cpl_rx_phys_dsgl) + dsgl_len;
-	if (iv_loc == IV_IMMEDIATE) {
-		memcpy(dst, iv, iv_len);
-		dst += iv_len;
-	}
+	memcpy(dst, iv, iv_len);
+	dst += iv_len;
 	if (imm_len != 0) {
 		if (crda->crd_len != 0) {
 			crypto_copydata(crp->crp_flags, crp->crp_buf,
@@ -1289,7 +1408,153 @@ ccr_gcm_done(struct ccr_softc *sc, struct ccr_session 
 	return (error);
 }
 
+/*
+ * Handle a GCM request that is not supported by the crypto engine by
+ * performing the operation in software.  Derived from swcr_authenc().
+ */
 static void
+ccr_gcm_soft(struct ccr_session *s, struct cryptop *crp,
+    struct cryptodesc *crda, struct cryptodesc *crde)
+{
+	struct auth_hash *axf;
+	struct enc_xform *exf;
+	void *auth_ctx;
+	uint8_t *kschedule;
+	char block[GMAC_BLOCK_LEN];
+	char digest[GMAC_DIGEST_LEN];
+	char iv[AES_BLOCK_LEN];
+	int error, i, len;
+
+	auth_ctx = NULL;
+	kschedule = NULL;
+
+	/* Initialize the MAC. */
+	switch (s->blkcipher.key_len) {
+	case 16:
+		axf = &auth_hash_nist_gmac_aes_128;
+		break;
+	case 24:
+		axf = &auth_hash_nist_gmac_aes_192;
+		break;
+	case 32:
+		axf = &auth_hash_nist_gmac_aes_256;
+		break;
+	default:
+		error = EINVAL;
+		goto out;
+	}
+	auth_ctx = malloc(axf->ctxsize, M_CCR, M_NOWAIT);
+	if (auth_ctx == NULL) {
+		error = ENOMEM;
+		goto out;
+	}
+	axf->Init(auth_ctx);
+	axf->Setkey(auth_ctx, s->blkcipher.enckey, s->blkcipher.key_len);
+
+	/* Initialize the cipher. */
+	exf = &enc_xform_aes_nist_gcm;
+	error = exf->setkey(&kschedule, s->blkcipher.enckey,
+	    s->blkcipher.key_len);
+	if (error)
+		goto out;
+
+	/*
+	 * This assumes a 12-byte IV from the crp.  See longer comment
+	 * above in ccr_gcm() for more details.
+	 */
+	if (crde->crd_flags & CRD_F_ENCRYPT) {
+		if (crde->crd_flags & CRD_F_IV_EXPLICIT)
+			memcpy(iv, crde->crd_iv, 12);
+		else
+			arc4rand(iv, 12, 0);
+		if ((crde->crd_flags & CRD_F_IV_PRESENT) == 0)
+			crypto_copyback(crp->crp_flags, crp->crp_buf,
+			    crde->crd_inject, 12, iv);
+	} else {
+		if (crde->crd_flags & CRD_F_IV_EXPLICIT)
+			memcpy(iv, crde->crd_iv, 12);
+		else
+			crypto_copydata(crp->crp_flags, crp->crp_buf,
+			    crde->crd_inject, 12, iv);
+	}
+	*(uint32_t *)&iv[12] = htobe32(1);
+
+	axf->Reinit(auth_ctx, iv, sizeof(iv));
+
+	/* MAC the AAD. */
+	for (i = 0; i < crda->crd_len; i += sizeof(block)) {
+		len = imin(crda->crd_len - i, sizeof(block));
+		crypto_copydata(crp->crp_flags, crp->crp_buf, crda->crd_skip +
+		    i, len, block);
+		bzero(block + len, sizeof(block) - len);
+		axf->Update(auth_ctx, block, sizeof(block));
+	}
+
+	exf->reinit(kschedule, iv);
+
+	/* Do encryption with MAC */
+	for (i = 0; i < crde->crd_len; i += sizeof(block)) {
+		len = imin(crde->crd_len - i, sizeof(block));
+		crypto_copydata(crp->crp_flags, crp->crp_buf, crde->crd_skip +
+		    i, len, block);
+		bzero(block + len, sizeof(block) - len);
+		if (crde->crd_flags & CRD_F_ENCRYPT) {
+			exf->encrypt(kschedule, block);
+			axf->Update(auth_ctx, block, len);
+			crypto_copyback(crp->crp_flags, crp->crp_buf,
+			    crde->crd_skip + i, len, block);
+		} else {
+			axf->Update(auth_ctx, block, len);
+		}
+	}
+
+	/* Length block. */
+	bzero(block, sizeof(block));
+	((uint32_t *)block)[1] = htobe32(crda->crd_len * 8);
+	((uint32_t *)block)[3] = htobe32(crde->crd_len * 8);
+	axf->Update(auth_ctx, block, sizeof(block));
+
+	/* Finalize MAC. */
+	axf->Final(digest, auth_ctx);
+
+	/* Inject or validate tag. */
+	if (crde->crd_flags & CRD_F_ENCRYPT) {
+		crypto_copyback(crp->crp_flags, crp->crp_buf, crda->crd_inject,
+		    sizeof(digest), digest);
+		error = 0;
+	} else {
+		char digest2[GMAC_DIGEST_LEN];
+
+		crypto_copydata(crp->crp_flags, crp->crp_buf, crda->crd_inject,
+		    sizeof(digest2), digest2);
+		if (timingsafe_bcmp(digest, digest2, sizeof(digest)) == 0) {
+			error = 0;
+
+			/* Tag matches, decrypt data. */
+			for (i = 0; i < crde->crd_len; i += sizeof(block)) {
+				len = imin(crde->crd_len - i, sizeof(block));
+				crypto_copydata(crp->crp_flags, crp->crp_buf,
+				    crde->crd_skip + i, len, block);
+				bzero(block + len, sizeof(block) - len);
+				exf->decrypt(kschedule, block);
+				crypto_copyback(crp->crp_flags, crp->crp_buf,
+				    crde->crd_skip + i, len, block);
+			}
+		} else
+			error = EBADMSG;
+	}
+
+	exf->zerokey(&kschedule);
+out:
+	if (auth_ctx != NULL) {

*** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
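
For readers following the GCM changes above, here is a minimal standalone sketch of the two block layouts the diff relies on: the 16-byte counter block formed by appending a big-endian counter of 1 to a 12-byte IV, and the final length block fed to the MAC. It is not code from the driver. The helper names (put_be32, put_be64, gcm_build_j0, gcm_build_len_block) and the two length macros are hypothetical and used only for illustration. The sketch uses byte-wise stores rather than the pointer casts seen in the diff, and writes full 64-bit bit lengths, whereas ccr_gcm_soft() only fills in the low 32 bits of each length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GCM_IV_LEN	12	/* hypothetical name: IV length assumed by the driver */
#define GCM_BLOCK_LEN	16	/* hypothetical name: AES/GMAC block size */

/* Store a 32-bit value in big-endian order without assuming alignment. */
static void
put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/* Store a 64-bit value in big-endian order without assuming alignment. */
static void
put_be64(uint8_t *p, uint64_t v)
{
	put_be32(p, (uint32_t)(v >> 32));
	put_be32(p + 4, (uint32_t)v);
}

/* Build the 16-byte counter block: the 12-byte IV followed by a counter of 1. */
static void
gcm_build_j0(uint8_t j0[GCM_BLOCK_LEN], const uint8_t iv[GCM_IV_LEN])
{
	assert(j0 != NULL && iv != NULL);
	memcpy(j0, iv, GCM_IV_LEN);
	put_be32(j0 + GCM_IV_LEN, 1);
}

/*
 * Build the final length block: the AAD length in bits followed by the
 * payload length in bits, each as a 64-bit big-endian integer.
 */
static void
gcm_build_len_block(uint8_t block[GCM_BLOCK_LEN], size_t aad_len,
    size_t payload_len)
{
	assert(block != NULL);
	put_be64(block, (uint64_t)aad_len * 8);
	put_be64(block + 8, (uint64_t)payload_len * 8);
}
```

With a 20-byte AAD and a 64-byte payload, the length block carries 160 in its first half and 512 in its second half, which for these small sizes matches what the htobe32() stores at word indices 1 and 3 in the diff would produce.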


