From owner-svn-src-projects@freebsd.org Thu Nov 5 00:50:26 2015
Message-Id: <201511050050.tA50oPd9048767@repo.freebsd.org>
From: Navdeep Parhar
Date: Thu, 5 Nov 2015 00:50:25 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-projects@freebsd.org
Subject: svn commit: r290376 - projects/cxl_iscsi/sys/dev/cxgbe/tom

Author: np
Date: Thu Nov 5 00:50:25 2015
New Revision: 290376
URL: https://svnweb.freebsd.org/changeset/base/290376

Log:
  cxgbe/tom: redo the TOM bits that support the iSCSI driver.

  - There is no reason to have a special case for iSCSI in t4_rcvd.
    Either there is data in the socket buffer (from when the connection
    was plain TOE, before being promoted to ulp_mode iSCSI) and sbused
    _should_ be taken into account, or sbused is 0 and doesn't affect
    the calculation of rx_credits.

  - write_tx_wr doesn't need special handling for iSCSI either. Its
    caller should specify the ulp_submode.

  - Replace t4_ulp_push_frames with t4_push_pdus that can deal with PDUs
    in an mbufq hanging off the toepcb. This eliminates the "backwards"
    calls from t4_tom's tx into the iSCSI driver.

  - The iSCSI driver installs a handler for RX_ISCSI_DDP already and the
    iSCSI handler for RX_DATA_DDP is identical to the one for
    RX_ISCSI_DDP. Take advantage of this to eliminate the last
    remaining "backwards" call from do_rx_data_ddp into the iSCSI
    driver.

  - Eliminate the CXGBE_ISCSI_MBUF_TAG abomination.
    - For tx, it makes no sense to allocate an mbuf tag just to stash 2
      bits worth of information. Use a spare byte from the mbuf header
      instead.
    - For rx, the per-connection ulpcb is a more natural place to keep
      information about the PDU currently being assembled.

Modified:
  projects/cxl_iscsi/sys/dev/cxgbe/tom/t4_cpl_io.c
  projects/cxl_iscsi/sys/dev/cxgbe/tom/t4_ddp.c
  projects/cxl_iscsi/sys/dev/cxgbe/tom/t4_tom.c
  projects/cxl_iscsi/sys/dev/cxgbe/tom/t4_tom.h

Modified: projects/cxl_iscsi/sys/dev/cxgbe/tom/t4_cpl_io.c
==============================================================================
--- projects/cxl_iscsi/sys/dev/cxgbe/tom/t4_cpl_io.c	Wed Nov 4 23:52:19 2015	(r290375)
+++ projects/cxl_iscsi/sys/dev/cxgbe/tom/t4_cpl_io.c	Thu Nov 5 00:50:25 2015	(r290376)
@@ -1,5 +1,5 @@
 /*-
- * Copyright (c) 2012 Chelsio Communications, Inc.
+ * Copyright (c) 2012, 2015 Chelsio Communications, Inc.
  * All rights reserved.
  * Written by: Navdeep Parhar
  *
@@ -71,33 +71,6 @@ VNET_DECLARE(int, tcp_autorcvbuf_inc);
 VNET_DECLARE(int, tcp_autorcvbuf_max);
 #define V_tcp_autorcvbuf_max VNET(tcp_autorcvbuf_max)
 
-/*
- * For ULP connections HW may add headers, e.g., for digests, that aren't part
- * of the messages sent by the host but that are part of the TCP payload and
- * therefore consume TCP sequence space. Tx connection parameters that
- * operate in TCP sequence space are affected by the HW additions and need to
- * compensate for them to accurately track TCP sequence numbers. This array
- * contains the compensating extra lengths for ULP packets. It is indexed by
- * a packet's ULP submode.
- */
-const unsigned int t4_ulp_extra_len[] = {0, 4, 4, 8};
-
-/*
- * Return the length of any HW additions that will be made to a Tx packet.
- * Such additions can happen for some types of ULP packets.
- */
-static inline unsigned int
-ulp_extra_len(struct mbuf *m, int *ulp_mode)
-{
-	struct m_tag *mtag;
-
-	if ((mtag = m_tag_find(m, CXGBE_ISCSI_MBUF_TAG, NULL)) == NULL)
-		return (0);
-	*ulp_mode = *((int *)(mtag + 1));
-
-	return (t4_ulp_extra_len[*ulp_mode & 3]);
-}
-
 void
 send_flowc_wr(struct toepcb *toep, struct flowc_tx_params *ftxp)
 {
@@ -383,13 +356,10 @@ t4_rcvd(struct toedev *tod, struct tcpcb
 	KASSERT(toep->sb_cc >= sbused(sb),
 	    ("%s: sb %p has more data (%d) than last time (%d).",
 	    __func__, sb, sbused(sb), toep->sb_cc));
-	if (toep->ulp_mode == ULP_MODE_ISCSI) {
-		toep->rx_credits += toep->sb_cc;
-		toep->sb_cc = 0;
-	} else {
-		toep->rx_credits += toep->sb_cc - sbused(sb);
-		toep->sb_cc = sbused(sb);
-	}
+
+	toep->rx_credits += toep->sb_cc - sbused(sb);
+	toep->sb_cc = sbused(sb);
+
 	if (toep->rx_credits > 0 &&
 	    (tp->rcv_wnd <= 32 * 1024 || toep->rx_credits >= 64 * 1024 ||
 	    (toep->rx_credits >= 16 * 1024 && tp->rcv_wnd <= 128 * 1024) ||
@@ -489,25 +459,16 @@ max_dsgl_nsegs(int tx_credits)
 
 static inline void
 write_tx_wr(void *dst, struct toepcb *toep, unsigned int immdlen,
-    unsigned int plen,
uint8_t credits, int shove, int ulp_mode, int txalign)
+    unsigned int plen, uint8_t credits, int shove, int ulp_submode, int txalign)
 {
 	struct fw_ofld_tx_data_wr *txwr = dst;
-	unsigned int wr_ulp_mode;
 
 	txwr->op_to_immdlen = htobe32(V_WR_OP(FW_OFLD_TX_DATA_WR) |
 	    V_FW_WR_IMMDLEN(immdlen));
 	txwr->flowid_len16 = htobe32(V_FW_WR_FLOWID(toep->tid) |
 	    V_FW_WR_LEN16(credits));
-
-	/* for iscsi, the mode & submode setting is per-packet */
-	if (toep->ulp_mode == ULP_MODE_ISCSI)
-		wr_ulp_mode = V_TX_ULP_MODE(ulp_mode >> 4) |
-		    V_TX_ULP_SUBMODE(ulp_mode & 3);
-	else
-		wr_ulp_mode = V_TX_ULP_MODE(toep->ulp_mode);
-
-	txwr->lsodisable_to_flags = htobe32(wr_ulp_mode | V_TX_URG(0) | /*XXX*/
-	    V_TX_SHOVE(shove));
+	txwr->lsodisable_to_flags = htobe32(V_TX_ULP_MODE(toep->ulp_mode) |
+	    V_TX_ULP_SUBMODE(ulp_submode) | V_TX_URG(0) | V_TX_SHOVE(shove));
 	txwr->plen = htobe32(plen);
 
 	if (txalign > 0) {
@@ -801,59 +762,67 @@ t4_push_frames(struct adapter *sc, struc
 		close_conn(sc, toep);
 }
 
-void (*cxgbei_fw4_ack)(struct toepcb *, int);
-struct mbuf *(*cxgbei_writeq_len)(struct toepcb *, int *);
-struct mbuf *(*cxgbei_writeq_next)(struct toepcb *);
+static inline void
+rqdrop_locked(struct mbufq *q, int plen)
+{
+	struct mbuf *m;
+
+	while (plen > 0) {
+		m = mbufq_dequeue(q);
+
+		/* Too many credits. */
+		MPASS(m != NULL);
+		M_ASSERTPKTHDR(m);
+
+		/* Partial credits. */
+		MPASS(plen >= m->m_pkthdr.len);
+
+		plen -= m->m_pkthdr.len;
+		m_freem(m);
+	}
+}
 
-/* Send ULP data over TOE using TX_DATA_WR.
We send whole mbuf at once */
 void
-t4_ulp_push_frames(struct adapter *sc, struct toepcb *toep, int drop)
+t4_push_pdus(struct adapter *sc, struct toepcb *toep, int drop)
 {
-	struct mbuf *sndptr, *m = NULL;
+	struct mbuf *sndptr, *m;
 	struct fw_ofld_tx_data_wr *txwr;
 	struct wrqe *wr;
-	unsigned int plen, nsegs, credits, max_imm, max_nsegs, max_nsegs_1mbuf;
+	u_int plen, nsegs, credits, max_imm, max_nsegs, max_nsegs_1mbuf;
+	u_int adjusted_plen, ulp_submode;
 	struct inpcb *inp = toep->inp;
-	struct tcpcb *tp;
-	struct socket *so;
-	struct sockbuf *sb;
-	int tx_credits, ulp_len = 0, ulp_mode = 0, qlen = 0;
-	int shove, compl;
-	struct ofld_tx_sdesc *txsd;
+	struct tcpcb *tp = intotcpcb(inp);
+	int tx_credits, shove;
+	struct ofld_tx_sdesc *txsd = &toep->txsd[toep->txsd_pidx];
+	struct mbufq *pduq = &toep->ulp_pduq;
+	static const u_int ulp_extra_len[] = {0, 4, 4, 8};
 
 	INP_WLOCK_ASSERT(inp);
-	if (toep->flags & TPF_ABORT_SHUTDOWN)
-		return;
-
-	tp = intotcpcb(inp);
-	so = inp->inp_socket;
-	sb = &so->so_snd;
-	txsd = &toep->txsd[toep->txsd_pidx];
-
 	KASSERT(toep->flags & TPF_FLOWC_WR_SENT,
 	    ("%s: flowc_wr not sent for tid %u.", __func__, toep->tid));
+	KASSERT(toep->ulp_mode == ULP_MODE_ISCSI,
+	    ("%s: ulp_mode %u for toep %p", __func__, toep->ulp_mode, toep));
 
 	/*
 	 * This function doesn't resume by itself. Someone else must clear the
 	 * flag and call this function.
 	 */
-	if (__predict_false(toep->flags & TPF_TX_SUSPENDED))
+	if (__predict_false(toep->flags & TPF_TX_SUSPENDED)) {
+		KASSERT(drop == 0,
+		    ("%s: drop (%d) != 0 but tx is suspended", __func__, drop));
 		return;
+	}
 
-	sndptr = cxgbei_writeq_len(toep, &qlen);
-	if (!qlen)
-		return;
+	if (drop)
+		rqdrop_locked(&toep->ulp_pdu_reclaimq, drop);
+
+	while ((sndptr = mbufq_first(pduq)) != NULL) {
+		M_ASSERTPKTHDR(sndptr);
 
-	do {
 		tx_credits = min(toep->tx_credits, MAX_OFLD_TX_CREDITS);
 		max_imm = max_imm_payload(tx_credits);
 		max_nsegs = max_dsgl_nsegs(tx_credits);
 
-		if (drop) {
-			cxgbei_fw4_ack(toep, drop);
-			drop = 0;
-		}
-
 		plen = 0;
 		nsegs = 0;
 		max_nsegs_1mbuf = 0; /* max # of SGL segments in any one mbuf */
@@ -863,7 +832,10 @@ t4_ulp_push_frames(struct adapter *sc, s
 			nsegs += n;
 			plen += m->m_len;
 
-			/* This mbuf sent us _over_ the nsegs limit, return */
+			/*
+			 * This mbuf would send us _over_ the nsegs limit.
+			 * Suspend tx because the PDU can't be sent out.
+			 */
 			if (plen > max_imm && nsegs > max_nsegs) {
 				toep->flags |= TPF_TX_SUSPENDED;
 				return;
@@ -871,30 +843,35 @@ t4_ulp_push_frames(struct adapter *sc, s
 
 			if (max_nsegs_1mbuf < n)
 				max_nsegs_1mbuf = n;
-
-			/* This mbuf put us right at the max_nsegs limit */
-			if (plen > max_imm && nsegs == max_nsegs) {
-				toep->flags |= TPF_TX_SUSPENDED;
-				return;
-			}
-		}
-
-		shove = m == NULL && !(tp->t_flags & TF_MORETOCOME);
-		/* nothing to send */
-		if (plen == 0) {
-			KASSERT(m == NULL,
-			    ("%s: nothing to send, but m != NULL", __func__));
-			break;
 		}
 
 		if (__predict_false(toep->flags & TPF_FIN_SENT))
 			panic("%s: excess tx.", __func__);
 
-		ulp_len = plen + ulp_extra_len(sndptr, &ulp_mode);
+		/*
+		 * We have a PDU to send. All of it goes out in one WR so 'm'
+		 * is NULL. A PDU's length is always a multiple of 4.
+		 */
+		MPASS(m == NULL);
+		MPASS((plen & 3) == 0);
+		MPASS(sndptr->m_pkthdr.len == plen);
+
+		shove = !(tp->t_flags & TF_MORETOCOME);
+		ulp_submode = mbuf_ulp_submode(sndptr);
+		MPASS(ulp_submode < nitems(ulp_extra_len));
+
+		/*
+		 * plen doesn't include header and data digests, which are
+		 * generated and inserted in the right places by the TOE, but
+		 * they do occupy TCP sequence space and need to be accounted
+		 * for.
+		 */
+		adjusted_plen = plen + ulp_extra_len[ulp_submode];
 		if (plen <= max_imm) {
 
 			/* Immediate data tx */
-			wr = alloc_wrqe(roundup(sizeof(*txwr) + plen, 16),
+
+			wr = alloc_wrqe(roundup2(sizeof(*txwr) + plen, 16),
 			    toep->ofld_txq);
 			if (wr == NULL) {
 				/* XXX: how will we recover from this? */
@@ -903,16 +880,17 @@ t4_ulp_push_frames(struct adapter *sc, s
 			}
 			txwr = wrtod(wr);
 			credits = howmany(wr->wr_len, 16);
-			write_tx_wr(txwr, toep, plen, ulp_len, credits, shove,
-			    ulp_mode, 0);
+			write_tx_wr(txwr, toep, plen, adjusted_plen, credits,
+			    shove, ulp_submode, sc->tt.tx_align);
 			m_copydata(sndptr, 0, plen, (void *)(txwr + 1));
+			nsegs = 0;
 		} else {
 			int wr_len;
 
 			/* DSGL tx */
 			wr_len = sizeof(*txwr) + sizeof(struct ulptx_sgl) +
 			    ((3 * (nsegs - 1)) / 2 + ((nsegs - 1) & 1)) * 8;
-			wr = alloc_wrqe(roundup(wr_len, 16), toep->ofld_txq);
+			wr = alloc_wrqe(roundup2(wr_len, 16), toep->ofld_txq);
 			if (wr == NULL) {
 				/* XXX: how will we recover from this?
*/
 				toep->flags |= TPF_TX_SUSPENDED;
@@ -920,8 +898,8 @@ t4_ulp_push_frames(struct adapter *sc, s
 			}
 			txwr = wrtod(wr);
 			credits = howmany(wr_len, 16);
-			write_tx_wr(txwr, toep, 0, ulp_len, credits, shove,
-			    ulp_mode, 0);
+			write_tx_wr(txwr, toep, 0, adjusted_plen, credits,
+			    shove, ulp_submode, sc->tt.tx_align);
 			write_tx_sgl(txwr + 1, sndptr, m, nsegs,
 			    max_nsegs_1mbuf);
 			if (wr_len & 0xf) {
@@ -934,28 +912,26 @@ t4_ulp_push_frames(struct adapter *sc, s
 		KASSERT(toep->tx_credits >= credits,
 		    ("%s: not enough credits", __func__));
 
+		m = mbufq_dequeue(pduq);
+		MPASS(m == sndptr);
+		mbufq_enqueue(&toep->ulp_pdu_reclaimq, m);
+
 		toep->tx_credits -= credits;
 		toep->tx_nocompl += credits;
 		toep->plen_nocompl += plen;
 		if (toep->tx_credits <= toep->tx_total * 3 / 8 &&
-		    toep->tx_nocompl >= toep->tx_total / 4)
-			compl = 1;
-
-		if (compl) {
+		    toep->tx_nocompl >= toep->tx_total / 4) {
 			txwr->op_to_immdlen |= htobe32(F_FW_WR_COMPL);
 			toep->tx_nocompl = 0;
 			toep->plen_nocompl = 0;
 		}
 
-		tp->snd_nxt += ulp_len;
-		tp->snd_max += ulp_len;
-		/* goto next mbuf */
-		sndptr = m = cxgbei_writeq_next(toep);
+		tp->snd_nxt += adjusted_plen;
+		tp->snd_max += adjusted_plen;
 
 		toep->flags |= TPF_TX_DATA_SENT;
-		if (toep->tx_credits < MIN_OFLD_TX_CREDITS) {
+		if (toep->tx_credits < MIN_OFLD_TX_CREDITS)
 			toep->flags |= TPF_TX_SUSPENDED;
-		}
 
 		KASSERT(toep->txsd_avail > 0, ("%s: no txsd", __func__));
 		txsd->plen = plen;
@@ -968,10 +944,10 @@ t4_ulp_push_frames(struct adapter *sc, s
 		toep->txsd_avail--;
 
 		t4_l2t_send(sc, wr, toep->l2te);
-	} while (m != NULL);
+	}
 
-	/* Send a FIN if requested, but only if there's no more data to send */
-	if (m == NULL && toep->flags & TPF_SEND_FIN)
+	/* Send a FIN if requested, but only if there are no more PDUs to send */
+	if (mbufq_first(pduq) == NULL && toep->flags & TPF_SEND_FIN)
 		close_conn(sc, toep);
 }
 
@@ -990,7 +966,7 @@ t4_tod_output(struct toedev *tod, struct
 	KASSERT(toep != NULL, ("%s: toep is NULL", __func__));
 
 	if (toep->ulp_mode == ULP_MODE_ISCSI)
-		t4_ulp_push_frames(sc,
toep, 0);
+		t4_push_pdus(sc, toep, 0);
 	else
 		t4_push_frames(sc, toep, 0);
 
@@ -1014,7 +990,7 @@ t4_send_fin(struct toedev *tod, struct t
 	toep->flags |= TPF_SEND_FIN;
 	if (tp->t_state >= TCPS_ESTABLISHED) {
 		if (toep->ulp_mode == ULP_MODE_ISCSI)
-			t4_ulp_push_frames(sc, toep, 0);
+			t4_push_pdus(sc, toep, 0);
 		else
 			t4_push_frames(sc, toep, 0);
 	}
@@ -1653,7 +1629,7 @@ do_fw4_ack(struct sge_iq *iq, const stru
 		    toep->tx_credits >= toep->tx_total / 4) {
 			toep->flags &= ~TPF_TX_SUSPENDED;
 			if (toep->ulp_mode == ULP_MODE_ISCSI)
-				t4_ulp_push_frames(sc, toep, plen);
+				t4_push_pdus(sc, toep, plen);
 			else
 				t4_push_frames(sc, toep, plen);
 		} else if (plen > 0) {

Modified: projects/cxl_iscsi/sys/dev/cxgbe/tom/t4_ddp.c
==============================================================================
--- projects/cxl_iscsi/sys/dev/cxgbe/tom/t4_ddp.c	Wed Nov 4 23:52:19 2015	(r290375)
+++ projects/cxl_iscsi/sys/dev/cxgbe/tom/t4_ddp.c	Thu Nov 5 00:50:25 2015	(r290376)
@@ -505,8 +505,6 @@ handle_ddp_close(struct toepcb *toep, st
 	 F_DDP_INVALID_TAG | F_DDP_COLOR_ERR | F_DDP_TID_MISMATCH |\
 	 F_DDP_INVALID_PPOD | F_DDP_HDRCRC_ERR | F_DDP_DATACRC_ERR)
 
-void (*cxgbei_rx_data_ddp)(struct toepcb *, const struct cpl_rx_data_ddp *);
-
 static int
 do_rx_data_ddp(struct sge_iq *iq, const struct rss_header *rss, struct mbuf *m)
 {
@@ -528,9 +526,9 @@ do_rx_data_ddp(struct sge_iq *iq, const
 	}
 
 	if (toep->ulp_mode == ULP_MODE_ISCSI) {
-		cxgbei_rx_data_ddp(toep, cpl);
+		sc->cpl_handler[CPL_RX_ISCSI_DDP](iq, rss, m);
 		return (0);
-	}
+	}
 
 	handle_ddp_data(toep, cpl->u.ddp_report, cpl->seq, be16toh(cpl->len));
 
Modified: projects/cxl_iscsi/sys/dev/cxgbe/tom/t4_tom.c
==============================================================================
--- projects/cxl_iscsi/sys/dev/cxgbe/tom/t4_tom.c	Wed Nov 4 23:52:19 2015	(r290375)
+++ projects/cxl_iscsi/sys/dev/cxgbe/tom/t4_tom.c	Thu Nov 5 00:50:25 2015	(r290376)
@@ -37,6 +37,7 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -157,6
+158,8 @@ alloc_toepcb(struct port_info *pi, int t
 	toep->ofld_txq = &sc->sge.ofld_txq[txqid];
 	toep->ofld_rxq = &sc->sge.ofld_rxq[rxqid];
 	toep->ctrlq = &sc->sge.ctrlq[pi->port_id];
+	mbufq_init(&toep->ulp_pduq, INT_MAX);
+	mbufq_init(&toep->ulp_pdu_reclaimq, INT_MAX);
 	toep->txsd_total = txsd_total;
 	toep->txsd_avail = txsd_total;
 	toep->txsd_pidx = 0;
@@ -272,6 +275,14 @@ release_offload_resources(struct toepcb
 	CTR5(KTR_CXGBE, "%s: toep %p (tid %d, l2te %p, ce %p)",
 	    __func__, toep, tid, toep->l2te, toep->ce);
 
+	/*
+	 * These queues should have been emptied at approximately the same time
+	 * that a normal connection's socket's so_snd would have been purged or
+	 * drained. Do _not_ clean up here.
+	 */
+	MPASS(mbufq_len(&toep->ulp_pduq) == 0);
+	MPASS(mbufq_len(&toep->ulp_pdu_reclaimq) == 0);
+
 	if (toep->ulp_mode == ULP_MODE_TCPDDP)
 		release_ddp_resources(toep);
 
Modified: projects/cxl_iscsi/sys/dev/cxgbe/tom/t4_tom.h
==============================================================================
--- projects/cxl_iscsi/sys/dev/cxgbe/tom/t4_tom.h	Wed Nov 4 23:52:19 2015	(r290375)
+++ projects/cxl_iscsi/sys/dev/cxgbe/tom/t4_tom.h	Thu Nov 5 00:50:25 2015	(r290376)
@@ -1,5 +1,5 @@
 /*-
- * Copyright (c) 2012 Chelsio Communications, Inc.
+ * Copyright (c) 2012, 2015 Chelsio Communications, Inc.
  * All rights reserved.
  * Written by: Navdeep Parhar
  *
@@ -116,6 +116,8 @@ struct toepcb {
 
 	u_int ulp_mode;	/* ULP mode */
 	void *ulpcb;
+	struct mbufq ulp_pduq;	/* PDUs waiting to be sent out.
*/
+	struct mbufq ulp_pdu_reclaimq;
 
 	u_int ddp_flags;
 	struct ddp_buffer *db[2];
@@ -221,6 +223,22 @@ td_adapter(struct tom_data *td)
 	return (td->tod.tod_softc);
 }
 
+static inline void
+set_mbuf_ulp_submode(struct mbuf *m, uint8_t ulp_submode)
+{
+
+	M_ASSERTPKTHDR(m);
+	m->m_pkthdr.PH_per.eight[0] = ulp_submode;
+}
+
+static inline uint8_t
+mbuf_ulp_submode(struct mbuf *m)
+{
+
+	M_ASSERTPKTHDR(m);
+	return (m->m_pkthdr.PH_per.eight[0]);
+}
+
 /* t4_tom.c */
 struct toepcb *alloc_toepcb(struct port_info *, int, int, int);
 void free_toepcb(struct toepcb *);
@@ -276,6 +294,7 @@ int t4_send_rst(struct toedev *, struct
 void t4_set_tcb_field(struct adapter *, struct toepcb *, int, uint16_t,
     uint64_t, uint64_t);
 void t4_push_frames(struct adapter *sc, struct toepcb *toep, int drop);
+void t4_push_pdus(struct adapter *sc, struct toepcb *toep, int drop);
 
 /* t4_ddp.c */
 void t4_init_ddp(struct adapter *, struct tom_data *);
@@ -288,7 +307,4 @@ void handle_ddp_close(struct toepcb *, s
     uint32_t);
 void insert_ddp_data(struct toepcb *, uint32_t);
 
-/* ULP related */
-#define CXGBE_ISCSI_MBUF_TAG 50
-void t4_ulp_push_frames(struct adapter *sc, struct toepcb *toep, int);
 #endif
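
For readers who want to see the tx bookkeeping outside the kernel, here is a rough userland sketch of the idea behind t4_push_pdus and rqdrop_locked in this commit. It is not the driver code. struct pdu, struct pduq and every function name below are hypothetical stand-ins for mbufs and mbufq, and PDUs are assumed to be malloc'ed. Only the {0, 4, 4, 8} table and the two-queue scheme come from the diff: a PDU's payload length is what gets credited back on completion, while the digest-adjusted length is what advances TCP sequence space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a packet-header mbuf that carries one PDU. */
struct pdu {
	struct pdu *next;
	unsigned int len;	/* payload length handed to the TOE */
	unsigned char submode;	/* 2-bit ULP submode stashed with the PDU */
};

/* Hypothetical stand-in for struct mbufq: a singly linked FIFO. */
struct pduq {
	struct pdu *head, *tail;
};

/* Extra bytes the hardware inserts for digests, indexed by submode. */
static const unsigned int ulp_extra_len[] = {0, 4, 4, 8};

static void
pduq_enqueue(struct pduq *q, struct pdu *p)
{
	p->next = NULL;
	if (q->tail != NULL)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
}

static struct pdu *
pduq_dequeue(struct pduq *q)
{
	struct pdu *p = q->head;

	if (p != NULL) {
		q->head = p->next;
		if (q->head == NULL)
			q->tail = NULL;
		p->next = NULL;
	}
	return (p);
}

/* TCP sequence space a PDU consumes once the digests are added. */
static unsigned int
adjusted_plen(const struct pdu *p)
{
	return (p->len + ulp_extra_len[p->submode & 3]);
}

/*
 * "Send" every queued PDU: each one moves from the send queue to the
 * reclaim queue, where it waits for a completion.  Returns the total
 * sequence space consumed.
 */
static unsigned int
push_pdus(struct pduq *sendq, struct pduq *reclaimq)
{
	struct pdu *p;
	unsigned int seq = 0;

	while ((p = pduq_dequeue(sendq)) != NULL) {
		seq += adjusted_plen(p);
		pduq_enqueue(reclaimq, p);
	}
	return (seq);
}

/* Free completed PDUs.  Credits must cover whole PDUs, never a part. */
static void
reclaim(struct pduq *reclaimq, unsigned int plen)
{
	struct pdu *p;

	while (plen > 0) {
		p = pduq_dequeue(reclaimq);
		assert(p != NULL);	/* too many credits */
		assert(plen >= p->len);	/* partial credits */
		plen -= p->len;
		free(p);
	}
}
```

Note that reclaim() is driven by payload length, not the adjusted length, mirroring how the diff records plen in the tx descriptor but advances snd_nxt by adjusted_plen.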