From: Sean Bruno
Date: Fri, 15 Jan 2016 01:26:33 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org,
    svn-src-stable@freebsd.org, svn-src-stable-10@freebsd.org
Subject: svn commit: r294061 - stable/10/sys/dev/ixgbe
Message-Id: <201601150126.u0F1QXRF022901@repo.freebsd.org>

Author: sbruno
Date: Fri Jan 15 01:26:32 2016
New Revision: 294061
URL: https://svnweb.freebsd.org/changeset/base/294061

Log:
  Multiple MFC for ixgbe -- v 3.1.0

  r283883 -- update to 3.1.0
  r283893 -- update SRIOV
  API changes related to future possible MFC of SRIOV work
  r285590 -- Fix ixgbe(4) SRIOV VF initialization bugs
  r285591 -- Remove version check for FLOWID
  r285592 -- Update netmap support for ixgbe SRIOV VFs, needs
             ixgbe_netmap.h merge
  r286238 -- Fixup MTU zeroing if INET/INET6 are undefined.

  Submitted by: kevin bowling (kevin.bowling@kev009.com)
  Reviewed by: smh
  Relnotes: Yes
  Sponsored by: Limelight Networks
  Differential Revision: https://reviews.freebsd.org/D4273

Modified:
  stable/10/sys/dev/ixgbe/if_ix.c
  stable/10/sys/dev/ixgbe/if_ixv.c
  stable/10/sys/dev/ixgbe/ix_txrx.c
  stable/10/sys/dev/ixgbe/ixgbe.h
  stable/10/sys/dev/ixgbe/ixgbe_mbx.h
  stable/10/sys/dev/ixgbe/ixgbe_vf.c
Directory Properties:
  stable/10/ (props changed)

Modified: stable/10/sys/dev/ixgbe/if_ix.c
==============================================================================
--- stable/10/sys/dev/ixgbe/if_ix.c Fri Jan 15 01:22:36 2016 (r294060)
+++ stable/10/sys/dev/ixgbe/if_ix.c Fri Jan 15 01:26:32 2016 (r294061)
@@ -40,6 +40,11 @@
 
 #include "ixgbe.h"
 
+#ifdef RSS
+#include
+#include
+#endif
+
 /*********************************************************************
  * Set this to one to display debug statistics
  *********************************************************************/
@@ -48,7 +53,7 @@ int ixgbe_display_debug_stat
 /*********************************************************************
  * Driver version
  *********************************************************************/
-char ixgbe_driver_version[] = "2.8.3";
+char ixgbe_driver_version[] = "3.1.0";
 
 /*********************************************************************
  * PCI Device ID Table
@@ -132,6 +137,7 @@ static int ixgbe_setup_msix(struct adapt
 static void ixgbe_free_pci_resources(struct adapter *);
 static void ixgbe_local_timer(void *);
 static int ixgbe_setup_interface(device_t, struct adapter *);
+static void ixgbe_config_gpie(struct adapter *);
 static void ixgbe_config_dmac(struct adapter *);
 static void
ixgbe_config_delay_values(struct adapter *); static void ixgbe_config_link(struct adapter *); @@ -200,6 +206,18 @@ static void ixgbe_handle_phy(void *, int static void ixgbe_reinit_fdir(void *, int); #endif +#ifdef PCI_IOV +static void ixgbe_ping_all_vfs(struct adapter *); +static void ixgbe_handle_mbx(void *, int); +static int ixgbe_init_iov(device_t, u16, const nvlist_t *); +static void ixgbe_uninit_iov(device_t); +static int ixgbe_add_vf(device_t, u16, const nvlist_t *); +static void ixgbe_initialize_iov(struct adapter *); +static void ixgbe_recalculate_max_frame(struct adapter *); +static void ixgbe_init_vf(struct adapter *, struct ixgbe_vf *); +#endif /* PCI_IOV */ + + /********************************************************************* * FreeBSD Device Interface Entry Points *********************************************************************/ @@ -212,6 +230,11 @@ static device_method_t ix_methods[] = { DEVMETHOD(device_shutdown, ixgbe_shutdown), DEVMETHOD(device_suspend, ixgbe_suspend), DEVMETHOD(device_resume, ixgbe_resume), +#ifdef PCI_IOV + DEVMETHOD(pci_iov_init, ixgbe_init_iov), + DEVMETHOD(pci_iov_uninit, ixgbe_uninit_iov), + DEVMETHOD(pci_iov_add_vf, ixgbe_add_vf), +#endif /* PCI_IOV */ DEVMETHOD_END }; @@ -224,6 +247,9 @@ DRIVER_MODULE(ix, pci, ix_driver, ix_dev MODULE_DEPEND(ix, pci, 1, 1, 1); MODULE_DEPEND(ix, ether, 1, 1, 1); +#ifdef DEV_NETMAP +MODULE_DEPEND(ix, netmap, 1, 1, 1); +#endif /* DEV_NETMAP */ /* ** TUNEABLE PARAMETERS: @@ -291,8 +317,7 @@ SYSCTL_INT(_hw_ix, OID_AUTO, enable_msix static int ixgbe_num_queues = 0; TUNABLE_INT("hw.ix.num_queues", &ixgbe_num_queues); SYSCTL_INT(_hw_ix, OID_AUTO, num_queues, CTLFLAG_RDTUN, &ixgbe_num_queues, 0, - "Number of queues to configure up to a maximum of 8; " - "0 indicates autoconfigure"); + "Number of queues to configure, 0 indicates autoconfigure"); /* ** Number of TX descriptors per ring, @@ -344,6 +369,8 @@ static int fdir_pballoc = 1; #include #endif /* DEV_NETMAP */ +static 
MALLOC_DEFINE(M_IXGBE, "ix", "ix driver allocations"); + /********************************************************************* * Device identification routine * @@ -447,6 +474,15 @@ ixgbe_attach(device_t dev) "max number of tx packets to process", &adapter->tx_process_limit, ixgbe_tx_process_limit); + /* Sysctls for limiting the amount of work done in the taskqueues */ + ixgbe_set_sysctl_value(adapter, "rx_processing_limit", + "max number of rx packets to process", + &adapter->rx_process_limit, ixgbe_rx_process_limit); + + ixgbe_set_sysctl_value(adapter, "tx_processing_limit", + "max number of tx packets to process", + &adapter->tx_process_limit, ixgbe_tx_process_limit); + /* Do descriptor calc and sanity checks */ if (((ixgbe_txd * sizeof(union ixgbe_adv_tx_desc)) % DBA_ALIGN) != 0 || ixgbe_txd < MIN_TXD || ixgbe_txd > MAX_TXD) { @@ -484,7 +520,7 @@ ixgbe_attach(device_t dev) } /* Allocate multicast array memory. */ - adapter->mta = malloc(sizeof(u8) * IXGBE_ETH_LENGTH_OF_ADDRESS * + adapter->mta = malloc(sizeof(*adapter->mta) * MAX_NUM_MULTICAST_ADDRESSES, M_DEVBUF, M_NOWAIT); if (adapter->mta == NULL) { device_printf(dev, "Can not allocate multicast setup array\n"); @@ -566,9 +602,32 @@ ixgbe_attach(device_t dev) /* Check PCIE slot type/speed/width */ ixgbe_get_slot_info(hw); + /* Set an initial default flow control value */ adapter->fc = ixgbe_fc_full; +#ifdef PCI_IOV + if ((hw->mac.type != ixgbe_mac_82598EB) && (adapter->msix > 1)) { + nvlist_t *pf_schema, *vf_schema; + + hw->mbx.ops.init_params(hw); + pf_schema = pci_iov_schema_alloc_node(); + vf_schema = pci_iov_schema_alloc_node(); + pci_iov_schema_add_unicast_mac(vf_schema, "mac-addr", 0, NULL); + pci_iov_schema_add_bool(vf_schema, "mac-anti-spoof", + IOV_SCHEMA_HASDEFAULT, TRUE); + pci_iov_schema_add_bool(vf_schema, "allow-set-mac", + IOV_SCHEMA_HASDEFAULT, FALSE); + pci_iov_schema_add_bool(vf_schema, "allow-promisc", + IOV_SCHEMA_HASDEFAULT, FALSE); + error = pci_iov_attach(dev, pf_schema, vf_schema); + 
if (error != 0) { + device_printf(dev, + "Error %d setting up SR-IOV\n", error); + } + } +#endif /* PCI_IOV */ + /* Check for certain supported features */ ixgbe_check_wol_support(adapter); ixgbe_check_eee_support(adapter); @@ -625,6 +684,13 @@ ixgbe_detach(device_t dev) return (EBUSY); } +#ifdef PCI_IOV + if (pci_iov_detach(dev) != 0) { + device_printf(dev, "SR-IOV in use; detach first.\n"); + return (EBUSY); + } +#endif /* PCI_IOV */ + /* Stop the adapter */ IXGBE_CORE_LOCK(adapter); ixgbe_setup_low_power_mode(adapter); @@ -645,6 +711,9 @@ ixgbe_detach(device_t dev) taskqueue_drain(adapter->tq, &adapter->link_task); taskqueue_drain(adapter->tq, &adapter->mod_task); taskqueue_drain(adapter->tq, &adapter->msf_task); +#ifdef PCI_IOV + taskqueue_drain(adapter->tq, &adapter->mbx_task); +#endif taskqueue_drain(adapter->tq, &adapter->phy_task); #ifdef IXGBE_FDIR taskqueue_drain(adapter->tq, &adapter->fdir_task); @@ -821,6 +890,9 @@ ixgbe_ioctl(struct ifnet * ifp, u_long c adapter->max_frame_size = ifp->if_mtu + IXGBE_MTU_HDR; ixgbe_init_locked(adapter); +#ifdef PCI_IOV + ixgbe_recalculate_max_frame(adapter); +#endif IXGBE_CORE_UNLOCK(adapter); } break; @@ -936,22 +1008,36 @@ ixgbe_init_locked(struct adapter *adapte struct ifnet *ifp = adapter->ifp; device_t dev = adapter->dev; struct ixgbe_hw *hw = &adapter->hw; - u32 k, txdctl, mhadd, gpie; + struct tx_ring *txr; + struct rx_ring *rxr; + u32 txdctl, mhadd; u32 rxdctl, rxctrl; +#ifdef PCI_IOV + enum ixgbe_iov_mode mode; +#endif mtx_assert(&adapter->core_mtx, MA_OWNED); INIT_DEBUGOUT("ixgbe_init_locked: begin"); + hw->adapter_stopped = FALSE; ixgbe_stop_adapter(hw); callout_stop(&adapter->timer); +#ifdef PCI_IOV + mode = ixgbe_get_iov_mode(adapter); + adapter->pool = ixgbe_max_vfs(mode); + /* Queue indices may change with IOV mode */ + for (int i = 0; i < adapter->num_queues; i++) { + adapter->rx_rings[i].me = ixgbe_pf_que_index(mode, i); + adapter->tx_rings[i].me = ixgbe_pf_que_index(mode, i); + } +#endif /* reprogram 
the RAR[0] in case user changed it. */ - ixgbe_set_rar(hw, 0, adapter->hw.mac.addr, 0, IXGBE_RAH_AV); + ixgbe_set_rar(hw, 0, hw->mac.addr, adapter->pool, IXGBE_RAH_AV); /* Get the latest mac address, User can use a LAA */ - bcopy(IF_LLADDR(adapter->ifp), hw->mac.addr, - IXGBE_ETH_LENGTH_OF_ADDRESS); - ixgbe_set_rar(hw, 0, hw->mac.addr, 0, 1); + bcopy(IF_LLADDR(ifp), hw->mac.addr, IXGBE_ETH_LENGTH_OF_ADDRESS); + ixgbe_set_rar(hw, 0, hw->mac.addr, adapter->pool, 1); hw->addr_ctrl.rar_used_count = 1; /* Set the various hardware offload abilities */ @@ -974,6 +1060,9 @@ ixgbe_init_locked(struct adapter *adapte } ixgbe_init_hw(hw); +#ifdef PCI_IOV + ixgbe_initialize_iov(adapter); +#endif ixgbe_initialize_transmit_units(adapter); /* Setup Multicast table */ @@ -983,14 +1072,10 @@ ixgbe_init_locked(struct adapter *adapte ** Determine the correct mbuf pool ** for doing jumbo frames */ - if (adapter->max_frame_size <= 2048) + if (adapter->max_frame_size <= MCLBYTES) adapter->rx_mbuf_sz = MCLBYTES; - else if (adapter->max_frame_size <= 4096) - adapter->rx_mbuf_sz = MJUMPAGESIZE; - else if (adapter->max_frame_size <= 9216) - adapter->rx_mbuf_sz = MJUM9BYTES; else - adapter->rx_mbuf_sz = MJUM16BYTES; + adapter->rx_mbuf_sz = MJUMPAGESIZE; /* Prepare receive descriptors and buffers */ if (ixgbe_setup_receive_structures(adapter)) { @@ -1002,31 +1087,8 @@ ixgbe_init_locked(struct adapter *adapte /* Configure RX settings */ ixgbe_initialize_receive_units(adapter); - gpie = IXGBE_READ_REG(&adapter->hw, IXGBE_GPIE); - - /* Enable Fan Failure Interrupt */ - gpie |= IXGBE_SDP1_GPIEN_BY_MAC(hw); - - /* Add for Module detection */ - if (hw->mac.type == ixgbe_mac_82599EB) - gpie |= IXGBE_SDP2_GPIEN; - - /* - * Thermal Failure Detection (X540) - * Link Detection (X552) - */ - if (hw->mac.type == ixgbe_mac_X540 || - hw->device_id == IXGBE_DEV_ID_X550EM_X_SFP || - hw->device_id == IXGBE_DEV_ID_X550EM_X_10G_T) - gpie |= IXGBE_SDP0_GPIEN_X540; - - if (adapter->msix > 1) { - /* Enable Enhanced 
MSIX mode */ - gpie |= IXGBE_GPIE_MSIX_MODE; - gpie |= IXGBE_GPIE_EIAME | IXGBE_GPIE_PBA_SUPPORT | - IXGBE_GPIE_OCD; - } - IXGBE_WRITE_REG(hw, IXGBE_GPIE, gpie); + /* Enable SDP & MSIX interrupts based on adapter */ + ixgbe_config_gpie(adapter); /* Set MTU size */ if (ifp->if_mtu > ETHERMTU) { @@ -1039,7 +1101,8 @@ ixgbe_init_locked(struct adapter *adapte /* Now enable all the queues */ for (int i = 0; i < adapter->num_queues; i++) { - txdctl = IXGBE_READ_REG(hw, IXGBE_TXDCTL(i)); + txr = &adapter->tx_rings[i]; + txdctl = IXGBE_READ_REG(hw, IXGBE_TXDCTL(txr->me)); txdctl |= IXGBE_TXDCTL_ENABLE; /* Set WTHRESH to 8, burst writeback */ txdctl |= (8 << 16); @@ -1051,11 +1114,12 @@ ixgbe_init_locked(struct adapter *adapte * Prefetching enables tx line rate even with 1 queue. */ txdctl |= (32 << 0) | (1 << 8); - IXGBE_WRITE_REG(hw, IXGBE_TXDCTL(i), txdctl); + IXGBE_WRITE_REG(hw, IXGBE_TXDCTL(txr->me), txdctl); } - for (int i = 0; i < adapter->num_queues; i++) { - rxdctl = IXGBE_READ_REG(hw, IXGBE_RXDCTL(i)); + for (int i = 0, j = 0; i < adapter->num_queues; i++) { + rxr = &adapter->rx_rings[i]; + rxdctl = IXGBE_READ_REG(hw, IXGBE_RXDCTL(rxr->me)); if (hw->mac.type == ixgbe_mac_82598EB) { /* ** PTHRESH = 21 @@ -1066,9 +1130,9 @@ ixgbe_init_locked(struct adapter *adapte rxdctl |= 0x080420; } rxdctl |= IXGBE_RXDCTL_ENABLE; - IXGBE_WRITE_REG(hw, IXGBE_RXDCTL(i), rxdctl); - for (k = 0; k < 10; k++) { - if (IXGBE_READ_REG(hw, IXGBE_RXDCTL(i)) & + IXGBE_WRITE_REG(hw, IXGBE_RXDCTL(rxr->me), rxdctl); + for (; j < 10; j++) { + if (IXGBE_READ_REG(hw, IXGBE_RXDCTL(rxr->me)) & IXGBE_RXDCTL_ENABLE) break; else @@ -1097,10 +1161,10 @@ ixgbe_init_locked(struct adapter *adapte struct netmap_kring *kring = &na->rx_rings[i]; int t = na->num_rx_desc - 1 - nm_kr_rxspace(kring); - IXGBE_WRITE_REG(hw, IXGBE_RDT(i), t); + IXGBE_WRITE_REG(hw, IXGBE_RDT(rxr->me), t); } else #endif /* DEV_NETMAP */ - IXGBE_WRITE_REG(hw, IXGBE_RDT(i), adapter->num_rx_desc - 1); + IXGBE_WRITE_REG(hw, 
IXGBE_RDT(rxr->me), adapter->num_rx_desc - 1); } /* Enable Receive engine */ @@ -1139,9 +1203,9 @@ ixgbe_init_locked(struct adapter *adapte #endif /* - ** Check on any SFP devices that - ** need to be kick-started - */ + * Check on any SFP devices that + * need to be kick-started + */ if (hw->phy.type == ixgbe_phy_none) { int err = hw->phy.ops.identify(hw); if (err == IXGBE_ERR_SFP_NOT_SUPPORTED) { @@ -1155,8 +1219,7 @@ ixgbe_init_locked(struct adapter *adapte IXGBE_WRITE_REG(hw, IXGBE_EITR(adapter->vector), IXGBE_LINK_ITR); /* Configure Energy Efficient Ethernet for supported devices */ - if (adapter->eee_support) - ixgbe_setup_eee(hw, adapter->eee_enabled); + ixgbe_setup_eee(hw, adapter->eee_enabled); /* Config/Enable Link */ ixgbe_config_link(adapter); @@ -1176,6 +1239,15 @@ ixgbe_init_locked(struct adapter *adapte /* And now turn on interrupts */ ixgbe_enable_intr(adapter); +#ifdef PCI_IOV + /* Enable the use of the MBX by the VF's */ + { + u32 reg = IXGBE_READ_REG(hw, IXGBE_CTRL_EXT); + reg |= IXGBE_CTRL_EXT_PFRSTD; + IXGBE_WRITE_REG(hw, IXGBE_CTRL_EXT, reg); + } +#endif + /* Now inform the stack we're ready */ ifp->if_drv_flags |= IFF_DRV_RUNNING; @@ -1194,6 +1266,51 @@ ixgbe_init(void *arg) } static void +ixgbe_config_gpie(struct adapter *adapter) +{ + struct ixgbe_hw *hw = &adapter->hw; + u32 gpie; + + gpie = IXGBE_READ_REG(hw, IXGBE_GPIE); + + /* Fan Failure Interrupt */ + if (hw->device_id == IXGBE_DEV_ID_82598AT) + gpie |= IXGBE_SDP1_GPIEN; + + /* + * Module detection (SDP2) + * Media ready (SDP1) + */ + if (hw->mac.type == ixgbe_mac_82599EB) { + gpie |= IXGBE_SDP2_GPIEN; + if (hw->device_id != IXGBE_DEV_ID_82599_QSFP_SF_QP) + gpie |= IXGBE_SDP1_GPIEN; + } + + /* + * Thermal Failure Detection (X540) + * Link Detection (X557) + */ + if (hw->mac.type == ixgbe_mac_X540 || + hw->device_id == IXGBE_DEV_ID_X550EM_X_SFP || + hw->device_id == IXGBE_DEV_ID_X550EM_X_10G_T) + gpie |= IXGBE_SDP0_GPIEN_X540; + + if (adapter->msix > 1) { + /* Enable Enhanced MSIX mode 
*/ + gpie |= IXGBE_GPIE_MSIX_MODE; + gpie |= IXGBE_GPIE_EIAME | IXGBE_GPIE_PBA_SUPPORT | + IXGBE_GPIE_OCD; + } + + IXGBE_WRITE_REG(hw, IXGBE_GPIE, gpie); + return; +} + +/* + * Requires adapter->max_frame_size to be set. + */ +static void ixgbe_config_delay_values(struct adapter *adapter) { struct ixgbe_hw *hw = &adapter->hw; @@ -1287,10 +1404,9 @@ ixgbe_handle_que(void *context, int pend struct adapter *adapter = que->adapter; struct tx_ring *txr = que->txr; struct ifnet *ifp = adapter->ifp; - bool more; if (ifp->if_drv_flags & IFF_DRV_RUNNING) { - more = ixgbe_rxeof(que); + ixgbe_rxeof(que); IXGBE_TX_LOCK(txr); ixgbe_txeof(txr); #ifndef IXGBE_LEGACY_TX @@ -1352,8 +1468,8 @@ ixgbe_legacy_irq(void *arg) IXGBE_TX_UNLOCK(txr); /* Check for fan failure */ - if ((hw->phy.media_type == ixgbe_media_type_copper) && - (reg_eicr & IXGBE_EICR_GPI_SDP1_BY_MAC(hw))) { + if ((hw->device_id == IXGBE_DEV_ID_82598AT) && + (reg_eicr & IXGBE_EICR_GPI_SDP1)) { device_printf(adapter->dev, "\nCRITICAL: FAN FAILURE!! 
" "REPLACE IMMEDIATELY!!\n"); IXGBE_WRITE_REG(hw, IXGBE_EIMS, IXGBE_EICR_GPI_SDP1_BY_MAC(hw)); @@ -1392,6 +1508,7 @@ ixgbe_msix_que(void *arg) bool more; u32 newitr = 0; + /* Protect against spurious interrupts */ if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0) return; @@ -1515,6 +1632,10 @@ ixgbe_msix_link(void *arg) device_printf(adapter->dev, "System shutdown required!\n"); IXGBE_WRITE_REG(hw, IXGBE_EICR, IXGBE_EICR_TS); } +#ifdef PCI_IOV + if (reg_eicr & IXGBE_EICR_MAILBOX) + taskqueue_enqueue(adapter->tq, &adapter->mbx_task); +#endif } /* Pluggable optics-related interrupt */ @@ -1580,7 +1701,7 @@ ixgbe_media_status(struct ifnet * ifp, s } ifmr->ifm_status |= IFM_ACTIVE; - layer = ixgbe_get_supported_physical_layer(hw); + layer = adapter->phy_layer; if (layer & IXGBE_PHYSICAL_LAYER_10GBASE_T || layer & IXGBE_PHYSICAL_LAYER_1000BASE_T || @@ -1813,18 +1934,17 @@ ixgbe_set_promisc(struct adapter *adapte static void ixgbe_set_multi(struct adapter *adapter) { - u32 fctrl; - u8 *mta; - u8 *update_ptr; - struct ifmultiaddr *ifma; - int mcnt = 0; - struct ifnet *ifp = adapter->ifp; + u32 fctrl; + u8 *update_ptr; + struct ifmultiaddr *ifma; + struct ixgbe_mc_addr *mta; + int mcnt = 0; + struct ifnet *ifp = adapter->ifp; IOCTL_DEBUGOUT("ixgbe_set_multi: begin"); mta = adapter->mta; - bzero(mta, sizeof(u8) * IXGBE_ETH_LENGTH_OF_ADDRESS * - MAX_NUM_MULTICAST_ADDRESSES); + bzero(mta, sizeof(*mta) * MAX_NUM_MULTICAST_ADDRESSES); #if __FreeBSD_version < 800000 IF_ADDR_LOCK(ifp); @@ -1837,8 +1957,8 @@ ixgbe_set_multi(struct adapter *adapter) if (mcnt == MAX_NUM_MULTICAST_ADDRESSES) break; bcopy(LLADDR((struct sockaddr_dl *) ifma->ifma_addr), - &mta[mcnt * IXGBE_ETH_LENGTH_OF_ADDRESS], - IXGBE_ETH_LENGTH_OF_ADDRESS); + mta[mcnt].addr, IXGBE_ETH_LENGTH_OF_ADDRESS); + mta[mcnt].vmdq = adapter->pool; mcnt++; } #if __FreeBSD_version < 800000 @@ -1861,7 +1981,7 @@ ixgbe_set_multi(struct adapter *adapter) IXGBE_WRITE_REG(&adapter->hw, IXGBE_FCTRL, fctrl); if (mcnt < 
MAX_NUM_MULTICAST_ADDRESSES) { - update_ptr = mta; + update_ptr = (u8 *)mta; ixgbe_update_mc_addr_list(&adapter->hw, update_ptr, mcnt, ixgbe_mc_array_itr, TRUE); } @@ -1877,13 +1997,13 @@ ixgbe_set_multi(struct adapter *adapter) static u8 * ixgbe_mc_array_itr(struct ixgbe_hw *hw, u8 **update_ptr, u32 *vmdq) { - u8 *addr = *update_ptr; - u8 *newptr; - *vmdq = 0; - - newptr = addr + IXGBE_ETH_LENGTH_OF_ADDRESS; - *update_ptr = newptr; - return addr; + struct ixgbe_mc_addr *mta; + + mta = (struct ixgbe_mc_addr *)*update_ptr; + *vmdq = mta->vmdq; + + *update_ptr = (u8*)(mta + 1);; + return (mta->addr); } @@ -1965,6 +2085,7 @@ watchdog: ixgbe_init_locked(adapter); } + /* ** Note: this routine updates the OS on the link state ** the real check of the hardware only happens with @@ -1988,6 +2109,9 @@ ixgbe_update_link_status(struct adapter /* Update DMA coalescing config */ ixgbe_config_dmac(adapter); if_link_state_change(ifp, LINK_STATE_UP); +#ifdef PCI_IOV + ixgbe_ping_all_vfs(adapter); +#endif } } else { /* Link down */ if (adapter->link_active == TRUE) { @@ -1995,6 +2119,9 @@ ixgbe_update_link_status(struct adapter device_printf(dev,"Link is Down\n"); if_link_state_change(ifp, LINK_STATE_DOWN); adapter->link_active = FALSE; +#ifdef PCI_IOV + ixgbe_ping_all_vfs(adapter); +#endif } } @@ -2094,7 +2221,7 @@ ixgbe_setup_optics(struct adapter *adapt struct ixgbe_hw *hw = &adapter->hw; int layer; - layer = ixgbe_get_supported_physical_layer(hw); + layer = adapter->phy_layer = ixgbe_get_supported_physical_layer(hw); if (layer & IXGBE_PHYSICAL_LAYER_10GBASE_T) { adapter->optics = IFM_10G_T; @@ -2223,6 +2350,31 @@ ixgbe_allocate_msix(struct adapter *adap struct tx_ring *txr = adapter->tx_rings; int error, rid, vector = 0; int cpu_id = 0; +#ifdef RSS + cpuset_t cpu_mask; +#endif + +#ifdef RSS + /* + * If we're doing RSS, the number of queues needs to + * match the number of RSS buckets that are configured. 
+ * + * + If there's more queues than RSS buckets, we'll end + * up with queues that get no traffic. + * + * + If there's more RSS buckets than queues, we'll end + * up having multiple RSS buckets map to the same queue, + * so there'll be some contention. + */ + if (adapter->num_queues != rss_getnumbuckets()) { + device_printf(dev, + "%s: number of queues (%d) != number of RSS buckets (%d)" + "; performance will be impacted.\n", + __func__, + adapter->num_queues, + rss_getnumbuckets()); + } +#endif for (int i = 0; i < adapter->num_queues; i++, vector++, que++, txr++) { rid = vector + 1; @@ -2247,6 +2399,14 @@ ixgbe_allocate_msix(struct adapter *adap #endif que->msix = vector; adapter->active_queues |= (u64)(1 << que->msix); +#ifdef RSS + /* + * The queue ID is used as the RSS layer bucket ID. + * We look up the queue ID -> RSS CPU ID and select + * that. + */ + cpu_id = rss_getcpu(i % rss_getnumbuckets()); +#else /* * Bind the msix vector, and thus the * rings to the corresponding cpu. @@ -2256,9 +2416,21 @@ ixgbe_allocate_msix(struct adapter *adap */ if (adapter->num_queues > 1) cpu_id = i; - +#endif if (adapter->num_queues > 1) bus_bind_intr(dev, que->res, cpu_id); +#ifdef IXGBE_DEBUG +#ifdef RSS + device_printf(dev, + "Bound RSS bucket %d to CPU %d\n", + i, cpu_id); +#else + device_printf(dev, + "Bound queue %d to cpu %d\n", + i, cpu_id); +#endif +#endif /* IXGBE_DEBUG */ + #ifndef IXGBE_LEGACY_TX TASK_INIT(&txr->txq_task, 0, ixgbe_deferred_mq_start, txr); @@ -2266,8 +2438,17 @@ ixgbe_allocate_msix(struct adapter *adap TASK_INIT(&que->que_task, 0, ixgbe_handle_que, que); que->tq = taskqueue_create_fast("ixgbe_que", M_NOWAIT, taskqueue_thread_enqueue, &que->tq); +#ifdef RSS + CPU_SETOF(cpu_id, &cpu_mask); + taskqueue_start_threads_cpuset(&que->tq, 1, PI_NET, + &cpu_mask, + "%s (bucket %d)", + device_get_nameunit(adapter->dev), + cpu_id); +#else taskqueue_start_threads(&que->tq, 1, PI_NET, "%s que", device_get_nameunit(adapter->dev)); +#endif } /* and Link */ @@ 
-2296,6 +2477,9 @@ ixgbe_allocate_msix(struct adapter *adap TASK_INIT(&adapter->link_task, 0, ixgbe_handle_link, adapter); TASK_INIT(&adapter->mod_task, 0, ixgbe_handle_mod, adapter); TASK_INIT(&adapter->msf_task, 0, ixgbe_handle_msf, adapter); +#ifdef PCI_IOV + TASK_INIT(&adapter->mbx_task, 0, ixgbe_handle_mbx, adapter); +#endif TASK_INIT(&adapter->phy_task, 0, ixgbe_handle_phy, adapter); #ifdef IXGBE_FDIR TASK_INIT(&adapter->fdir_task, 0, ixgbe_reinit_fdir, adapter); @@ -2343,11 +2527,14 @@ ixgbe_setup_msix(struct adapter *adapter /* Figure out a reasonable auto config value */ queues = (mp_ncpus > (msgs-1)) ? (msgs-1) : mp_ncpus; +#ifdef RSS + /* If we're doing RSS, clamp at the number of RSS buckets */ + if (queues > rss_getnumbuckets()) + queues = rss_getnumbuckets(); +#endif + if (ixgbe_num_queues != 0) queues = ixgbe_num_queues; - /* Set max queues to 8 when autoconfiguring */ - else if ((ixgbe_num_queues == 0) && (queues > 8)) - queues = 8; /* reflect correct sysctl value */ ixgbe_num_queues = queues; @@ -2511,15 +2698,20 @@ ixgbe_setup_interface(device_t dev, stru return (-1); } if_initname(ifp, device_get_name(dev), device_get_unit(dev)); - if_initbaudrate(ifp, IF_Gbps(10)); + ifp->if_baudrate = IF_Gbps(10); ifp->if_init = ixgbe_init; ifp->if_softc = adapter; ifp->if_flags = IFF_BROADCAST | IFF_SIMPLEX | IFF_MULTICAST; ifp->if_ioctl = ixgbe_ioctl; +#if __FreeBSD_version >= 1100036 + if_setgetcounterfn(ifp, ixgbe_get_counter); +#endif +#if __FreeBSD_version >= 1100045 /* TSO parameters */ ifp->if_hw_tsomax = 65518; ifp->if_hw_tsomaxsegcount = IXGBE_82599_SCATTER; ifp->if_hw_tsomaxsegsize = 2048; +#endif #ifndef IXGBE_LEGACY_TX ifp->if_transmit = ixgbe_mq_start; ifp->if_qflush = ixgbe_qflush; @@ -2581,7 +2773,7 @@ ixgbe_add_media_types(struct adapter *ad device_t dev = adapter->dev; int layer; - layer = ixgbe_get_supported_physical_layer(hw); + layer = adapter->phy_layer = ixgbe_get_supported_physical_layer(hw); /* Media types with matching FreeBSD media 
defines */ if (layer & IXGBE_PHYSICAL_LAYER_10GBASE_T) @@ -2692,40 +2884,41 @@ ixgbe_initialize_transmit_units(struct a for (int i = 0; i < adapter->num_queues; i++, txr++) { u64 tdba = txr->txdma.dma_paddr; u32 txctrl = 0; + int j = txr->me; - IXGBE_WRITE_REG(hw, IXGBE_TDBAL(i), + IXGBE_WRITE_REG(hw, IXGBE_TDBAL(j), (tdba & 0x00000000ffffffffULL)); - IXGBE_WRITE_REG(hw, IXGBE_TDBAH(i), (tdba >> 32)); - IXGBE_WRITE_REG(hw, IXGBE_TDLEN(i), + IXGBE_WRITE_REG(hw, IXGBE_TDBAH(j), (tdba >> 32)); + IXGBE_WRITE_REG(hw, IXGBE_TDLEN(j), adapter->num_tx_desc * sizeof(union ixgbe_adv_tx_desc)); /* Setup the HW Tx Head and Tail descriptor pointers */ - IXGBE_WRITE_REG(hw, IXGBE_TDH(i), 0); - IXGBE_WRITE_REG(hw, IXGBE_TDT(i), 0); + IXGBE_WRITE_REG(hw, IXGBE_TDH(j), 0); + IXGBE_WRITE_REG(hw, IXGBE_TDT(j), 0); /* Cache the tail address */ - txr->tail = IXGBE_TDT(txr->me); + txr->tail = IXGBE_TDT(j); /* Disable Head Writeback */ switch (hw->mac.type) { case ixgbe_mac_82598EB: - txctrl = IXGBE_READ_REG(hw, IXGBE_DCA_TXCTRL(i)); + txctrl = IXGBE_READ_REG(hw, IXGBE_DCA_TXCTRL(j)); break; case ixgbe_mac_82599EB: case ixgbe_mac_X540: default: - txctrl = IXGBE_READ_REG(hw, IXGBE_DCA_TXCTRL_82599(i)); + txctrl = IXGBE_READ_REG(hw, IXGBE_DCA_TXCTRL_82599(j)); break; } txctrl &= ~IXGBE_DCA_TXCTRL_DESC_WRO_EN; switch (hw->mac.type) { case ixgbe_mac_82598EB: - IXGBE_WRITE_REG(hw, IXGBE_DCA_TXCTRL(i), txctrl); + IXGBE_WRITE_REG(hw, IXGBE_DCA_TXCTRL(j), txctrl); break; case ixgbe_mac_82599EB: case ixgbe_mac_X540: default: - IXGBE_WRITE_REG(hw, IXGBE_DCA_TXCTRL_82599(i), txctrl); + IXGBE_WRITE_REG(hw, IXGBE_DCA_TXCTRL_82599(j), txctrl); break; } @@ -2733,6 +2926,9 @@ ixgbe_initialize_transmit_units(struct a if (hw->mac.type != ixgbe_mac_82598EB) { u32 dmatxctl, rttdcs; +#ifdef PCI_IOV + enum ixgbe_iov_mode mode = ixgbe_get_iov_mode(adapter); +#endif dmatxctl = IXGBE_READ_REG(hw, IXGBE_DMATXCTL); dmatxctl |= IXGBE_DMATXCTL_TE; IXGBE_WRITE_REG(hw, IXGBE_DMATXCTL, dmatxctl); @@ -2740,7 +2936,11 @@ 
ixgbe_initialize_transmit_units(struct a rttdcs = IXGBE_READ_REG(hw, IXGBE_RTTDCS); rttdcs |= IXGBE_RTTDCS_ARBDIS; IXGBE_WRITE_REG(hw, IXGBE_RTTDCS, rttdcs); +#ifdef PCI_IOV + IXGBE_WRITE_REG(hw, IXGBE_MTQC, ixgbe_get_mtqc(mode)); +#else IXGBE_WRITE_REG(hw, IXGBE_MTQC, IXGBE_MTQC_64Q_1PB); +#endif rttdcs &= ~IXGBE_RTTDCS_ARBDIS; IXGBE_WRITE_REG(hw, IXGBE_RTTDCS, rttdcs); } @@ -2752,17 +2952,22 @@ static void ixgbe_initialise_rss_mapping(struct adapter *adapter) { struct ixgbe_hw *hw = &adapter->hw; - uint32_t reta; - int i, j, queue_id, table_size; - int index_mult; - uint32_t rss_key[10]; - uint32_t mrqc; - - /* Setup RSS */ - reta = 0; + u32 reta = 0, mrqc, rss_key[10]; + int queue_id, table_size, index_mult; +#ifdef RSS + u32 rss_hash_config; +#endif +#ifdef PCI_IOV + enum ixgbe_iov_mode mode; +#endif +#ifdef RSS + /* Fetch the configured RSS key */ + rss_getkey((uint8_t *) &rss_key); +#else /* set up random bits */ arc4rand(&rss_key, sizeof(rss_key), 0); +#endif /* Set multiplier for RETA setup and table size based on MAC */ index_mult = 0x1; @@ -2780,9 +2985,19 @@ ixgbe_initialise_rss_mapping(struct adap } /* Set up the redirection table */ - for (i = 0, j = 0; i < table_size; i++, j++) { + for (int i = 0, j = 0; i < table_size; i++, j++) { if (j == adapter->num_queues) j = 0; +#ifdef RSS + /* + * Fetch the RSS bucket id for the given indirection entry. + * Cap it at the number of configured buckets (which is + * num_queues.) + */ + queue_id = rss_get_indirection_to_bucket(i); + queue_id = queue_id % adapter->num_queues; +#else queue_id = (j * index_mult); +#endif /* * The low 8 bits are for hash value (n+0); * The next 8 bits are for hash value (n+1), etc. 
@@ -2803,6 +3018,32 @@ ixgbe_initialise_rss_mapping(struct adap IXGBE_WRITE_REG(hw, IXGBE_RSSRK(i), rss_key[i]); /* Perform hash on these packet types */ +#ifdef RSS + mrqc = IXGBE_MRQC_RSSEN; + rss_hash_config = rss_gethashconfig(); + if (rss_hash_config & RSS_HASHTYPE_RSS_IPV4) + mrqc |= IXGBE_MRQC_RSS_FIELD_IPV4; + if (rss_hash_config & RSS_HASHTYPE_RSS_TCP_IPV4) + mrqc |= IXGBE_MRQC_RSS_FIELD_IPV4_TCP; + if (rss_hash_config & RSS_HASHTYPE_RSS_IPV6) + mrqc |= IXGBE_MRQC_RSS_FIELD_IPV6; + if (rss_hash_config & RSS_HASHTYPE_RSS_TCP_IPV6) + mrqc |= IXGBE_MRQC_RSS_FIELD_IPV6_TCP; + if (rss_hash_config & RSS_HASHTYPE_RSS_IPV6_EX) + mrqc |= IXGBE_MRQC_RSS_FIELD_IPV6_EX; + if (rss_hash_config & RSS_HASHTYPE_RSS_TCP_IPV6_EX) + mrqc |= IXGBE_MRQC_RSS_FIELD_IPV6_EX_TCP; + if (rss_hash_config & RSS_HASHTYPE_RSS_UDP_IPV4) + mrqc |= IXGBE_MRQC_RSS_FIELD_IPV4_UDP; + if (rss_hash_config & RSS_HASHTYPE_RSS_UDP_IPV4_EX) + device_printf(adapter->dev, + "%s: RSS_HASHTYPE_RSS_UDP_IPV4_EX defined, " + "but not supported\n", __func__); + if (rss_hash_config & RSS_HASHTYPE_RSS_UDP_IPV6) + mrqc |= IXGBE_MRQC_RSS_FIELD_IPV6_UDP; + if (rss_hash_config & RSS_HASHTYPE_RSS_UDP_IPV6_EX) + mrqc |= IXGBE_MRQC_RSS_FIELD_IPV6_EX_UDP; +#else /* * Disable UDP - IP fragments aren't currently being handled * and so we end up with a mix of 2-tuple and 4-tuple @@ -2811,18 +3052,16 @@ ixgbe_initialise_rss_mapping(struct adap mrqc = IXGBE_MRQC_RSSEN | IXGBE_MRQC_RSS_FIELD_IPV4 | IXGBE_MRQC_RSS_FIELD_IPV4_TCP -#if 0 - | IXGBE_MRQC_RSS_FIELD_IPV4_UDP -#endif | IXGBE_MRQC_RSS_FIELD_IPV6_EX_TCP | IXGBE_MRQC_RSS_FIELD_IPV6_EX | IXGBE_MRQC_RSS_FIELD_IPV6 | IXGBE_MRQC_RSS_FIELD_IPV6_TCP -#if 0 - | IXGBE_MRQC_RSS_FIELD_IPV6_UDP - | IXGBE_MRQC_RSS_FIELD_IPV6_EX_UDP -#endif ; +#endif /* RSS */ +#ifdef PCI_IOV + mode = ixgbe_get_iov_mode(adapter); + mrqc |= ixgbe_get_mrqc(mode); +#endif IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc); } @@ -2881,16 +3120,17 @@ ixgbe_initialize_receive_units(struct ad for (int i = 0; i < 
adapter->num_queues; i++, rxr++) { u64 rdba = rxr->rxdma.dma_paddr; + int j = rxr->me; /* Setup the Base and Length of the Rx Descriptor Ring */ - IXGBE_WRITE_REG(hw, IXGBE_RDBAL(i), + IXGBE_WRITE_REG(hw, IXGBE_RDBAL(j), (rdba & 0x00000000ffffffffULL)); - IXGBE_WRITE_REG(hw, IXGBE_RDBAH(i), (rdba >> 32)); - IXGBE_WRITE_REG(hw, IXGBE_RDLEN(i), + IXGBE_WRITE_REG(hw, IXGBE_RDBAH(j), (rdba >> 32)); + IXGBE_WRITE_REG(hw, IXGBE_RDLEN(j), adapter->num_rx_desc * sizeof(union ixgbe_adv_rx_desc)); /* Set up the SRRCTL register */ - srrctl = IXGBE_READ_REG(hw, IXGBE_SRRCTL(i)); + srrctl = IXGBE_READ_REG(hw, IXGBE_SRRCTL(j)); srrctl &= ~IXGBE_SRRCTL_BSIZEHDR_MASK; srrctl &= ~IXGBE_SRRCTL_BSIZEPKT_MASK; srrctl |= bufsz; @@ -3026,9 +3266,9 @@ ixgbe_setup_vlan_hw_support(struct adapt rxr = &adapter->rx_rings[i]; /* On 82599 the VLAN enable is per/queue in RXDCTL */ if (hw->mac.type != ixgbe_mac_82598EB) { - ctrl = IXGBE_READ_REG(hw, IXGBE_RXDCTL(i)); + ctrl = IXGBE_READ_REG(hw, IXGBE_RXDCTL(rxr->me)); ctrl |= IXGBE_RXDCTL_VME; - IXGBE_WRITE_REG(hw, IXGBE_RXDCTL(i), ctrl); + IXGBE_WRITE_REG(hw, IXGBE_RXDCTL(rxr->me), ctrl); } rxr->vtag_strip = TRUE; } @@ -3078,6 +3318,9 @@ ixgbe_enable_intr(struct adapter *adapte #ifdef IXGBE_FDIR mask |= IXGBE_EIMS_FLOW_DIR; #endif +#ifdef PCI_IOV + mask |= IXGBE_EIMS_MAILBOX; +#endif break; case ixgbe_mac_X540: /* Detect if Thermal Sensor is enabled */ @@ -3101,6 +3344,9 @@ ixgbe_enable_intr(struct adapter *adapte #ifdef IXGBE_FDIR mask |= IXGBE_EIMS_FLOW_DIR; #endif +#ifdef PCI_IOV + mask |= IXGBE_EIMS_MAILBOX; +#endif /* falls through */ default: break; @@ -3114,6 +3360,9 @@ ixgbe_enable_intr(struct adapter *adapte /* Don't autoclear Link */ mask &= ~IXGBE_EIMS_OTHER; mask &= ~IXGBE_EIMS_LSC; +#ifdef PCI_IOV + mask &= ~IXGBE_EIMS_MAILBOX; +#endif IXGBE_WRITE_REG(hw, IXGBE_EIAC, mask); } @@ -3312,8 +3561,8 @@ ixgbe_set_ivar(struct adapter *adapter, static void ixgbe_configure_ivars(struct adapter *adapter) { - struct ix_queue *que = 
adapter->queues; - u32 newitr; + struct ix_queue *que = adapter->queues; + u32 newitr; if (ixgbe_max_interrupt_rate > 0) newitr = (4000000 / ixgbe_max_interrupt_rate) & 0x0FF8; @@ -3327,10 +3576,12 @@ ixgbe_configure_ivars(struct adapter *ad } for (int i = 0; i < adapter->num_queues; i++, que++) { + struct rx_ring *rxr = &adapter->rx_rings[i]; + struct tx_ring *txr = &adapter->tx_rings[i]; /* First the RX queue entry */ - ixgbe_set_ivar(adapter, i, que->msix, 0); + ixgbe_set_ivar(adapter, rxr->me, que->msix, 0); /* ... and the TX */ - ixgbe_set_ivar(adapter, i, que->msix, 1); *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
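
Note on the multicast table change in the ixgbe_set_multi() and ixgbe_mc_array_itr() hunks above: the table goes from a flat byte array (6 bytes per entry) to an array of records that each carry a MAC address and a pool number, and the iterator now steps one record at a time. The following is a minimal user-space sketch of that pattern, not the driver code. The names struct mc_addr, mc_array_itr, mc_sum_pools and ETH_ADDR_LEN are hypothetical stand-ins, the hw argument is dropped, and the kernel's u8/u32 types are replaced with their <stdint.h> equivalents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_ADDR_LEN 6	/* stand-in for IXGBE_ETH_LENGTH_OF_ADDRESS */

/* Hypothetical mirror of struct ixgbe_mc_addr: one MAC plus its pool. */
struct mc_addr {
	uint8_t  addr[ETH_ADDR_LEN];
	uint32_t vmdq;
};

/*
 * Iterator in the style of ixgbe_mc_array_itr(): return the current
 * address, report its pool through *vmdq, and advance *update_ptr by
 * one whole record rather than by ETH_ADDR_LEN bytes.
 */
static uint8_t *
mc_array_itr(uint8_t **update_ptr, uint32_t *vmdq)
{
	struct mc_addr *mta;

	assert(update_ptr != NULL && *update_ptr != NULL && vmdq != NULL);
	mta = (struct mc_addr *)(void *)*update_ptr;
	*vmdq = mta->vmdq;
	*update_ptr = (uint8_t *)(mta + 1);
	return (mta->addr);
}

/* Demo helper: walk 'count' records and add up their pool numbers. */
static uint32_t
mc_sum_pools(struct mc_addr *table, size_t count)
{
	uint8_t *cursor = (uint8_t *)table;
	uint32_t vmdq, sum = 0;

	for (size_t i = 0; i < count; i++) {
		(void)mc_array_itr(&cursor, &vmdq);
		sum += vmdq;
	}
	return (sum);
}
```

Because the cursor advances by sizeof(struct mc_addr), any padding the compiler adds after the 6-byte address is skipped automatically. That is presumably why the diff also sizes the allocation and the bzero() with sizeof(*mta) instead of a hand-computed byte count.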