From: Kip Macy
Date: Fri, 7 Nov 2008 23:50:57 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-user@freebsd.org
Subject: svn commit: r184758 - in user/kmacy/HEAD_fast_multi_xmit/sys: amd64/conf conf dev/mxge i386/conf kern net netinet sys
Message-Id: <200811072350.mA7NovvQ095986@svn.freebsd.org>

Author: kmacy
Date: Fri Nov 7 23:50:57 2008
New Revision: 184758
URL: http://svn.freebsd.org/changeset/base/184758

Log:
  Import multi-tx, flowtable, and tcp rlocking changes

Added:
  user/kmacy/HEAD_fast_multi_xmit/sys/amd64/conf/PERFTEST
  user/kmacy/HEAD_fast_multi_xmit/sys/i386/conf/PERFTEST
  user/kmacy/HEAD_fast_multi_xmit/sys/net/flowtable.c (contents, props changed)
  user/kmacy/HEAD_fast_multi_xmit/sys/net/flowtable.h (contents, props changed)
Modified:
  user/kmacy/HEAD_fast_multi_xmit/sys/conf/files
  user/kmacy/HEAD_fast_multi_xmit/sys/dev/mxge/if_mxge.c
  user/kmacy/HEAD_fast_multi_xmit/sys/dev/mxge/if_mxge_var.h
  user/kmacy/HEAD_fast_multi_xmit/sys/kern/kern_mbuf.c
  user/kmacy/HEAD_fast_multi_xmit/sys/net/if.c
  user/kmacy/HEAD_fast_multi_xmit/sys/net/if_ethersubr.c
  user/kmacy/HEAD_fast_multi_xmit/sys/net/if_lagg.c
  user/kmacy/HEAD_fast_multi_xmit/sys/net/if_var.h
  user/kmacy/HEAD_fast_multi_xmit/sys/net/if_vlan.c
  user/kmacy/HEAD_fast_multi_xmit/sys/net/radix_mpath.c
  user/kmacy/HEAD_fast_multi_xmit/sys/net/route.c
  user/kmacy/HEAD_fast_multi_xmit/sys/net/route.h
  user/kmacy/HEAD_fast_multi_xmit/sys/netinet/if_ether.c
  user/kmacy/HEAD_fast_multi_xmit/sys/netinet/in_pcb.h
  user/kmacy/HEAD_fast_multi_xmit/sys/netinet/ip_input.c
  user/kmacy/HEAD_fast_multi_xmit/sys/netinet/ip_output.c
  user/kmacy/HEAD_fast_multi_xmit/sys/netinet/ip_var.h
  user/kmacy/HEAD_fast_multi_xmit/sys/netinet/tcp_input.c
  user/kmacy/HEAD_fast_multi_xmit/sys/sys/mbuf.h

Added: user/kmacy/HEAD_fast_multi_xmit/sys/amd64/conf/PERFTEST
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ user/kmacy/HEAD_fast_multi_xmit/sys/amd64/conf/PERFTEST Fri Nov 7 23:50:57 2008 (r184758)
@@ -0,0 +1,281 @@
+#
+# GENERIC -- Generic kernel configuration file for FreeBSD/amd64
+#
+# For more information on this file, please read the handbook section on
+# Kernel Configuration Files:
+#
+# http://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig-config.html
+#
+# The handbook is also available locally in /usr/share/doc/handbook
+# if you've installed the doc distribution, otherwise always see the
+# FreeBSD World Wide Web server (http://www.FreeBSD.org/) for the
+# latest information.
+#
+# An exhaustive list of options and more detailed explanations of the
+# device lines is also present in the ../../conf/NOTES and NOTES files.
+# If you are in doubt as to the purpose or necessity of a line, check first
+# in NOTES.
+#
+# $FreeBSD: user/kmacy/HEAD_multi_tx/sys/amd64/conf/GENERIC 183567 2008-10-03 10:31:31Z stas $
+
+cpu HAMMER
+ident GENERIC
+
+# To statically compile in device wiring instead of /boot/device.hints
+#hints "GENERIC.hints" # Default places to look for devices.
+
+makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols
+
+options SCHED_ULE # ULE scheduler
+options PREEMPTION # Enable kernel thread preemption
+options INET # InterNETworking
+options INET6 # IPv6 communications protocols
+options SCTP # Stream Control Transmission Protocol
+options FFS # Berkeley Fast Filesystem
+options SOFTUPDATES # Enable FFS soft updates support
+options UFS_ACL # Support for access control lists
+options UFS_DIRHASH # Improve performance on big directories
+options UFS_GJOURNAL # Enable gjournal-based UFS journaling
+options MD_ROOT # MD is a potential root device
+options NFSCLIENT # Network Filesystem Client
+options NFSSERVER # Network Filesystem Server
+options NFSLOCKD # Network Lock Manager
+options NFS_ROOT # NFS usable as /, requires NFSCLIENT
+options NTFS # NT File System
+options MSDOSFS # MSDOS Filesystem
+options CD9660 # ISO 9660 Filesystem
+options PROCFS # Process filesystem (requires PSEUDOFS)
+options PSEUDOFS # Pseudo-filesystem framework
+options GEOM_PART_GPT # GUID Partition Tables.
+options GEOM_LABEL # Provides labelization
+options COMPAT_43TTY # BSD 4.3 TTY compat [KEEP THIS!]
+options COMPAT_IA32 # Compatible with i386 binaries
+options COMPAT_FREEBSD4 # Compatible with FreeBSD4
+options COMPAT_FREEBSD5 # Compatible with FreeBSD5
+options COMPAT_FREEBSD6 # Compatible with FreeBSD6
+options COMPAT_FREEBSD7 # Compatible with FreeBSD7
+options SCSI_DELAY=5000 # Delay (in ms) before probing SCSI
+options KTRACE # ktrace(1) support
+options STACK # stack(9) support
+options SYSVSHM # SYSV-style shared memory
+options SYSVMSG # SYSV-style message queues
+options SYSVSEM # SYSV-style semaphores
+options _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
+options KBD_INSTALL_CDEV # install a CDEV entry in /dev
+options STOP_NMI # Stop CPUS using NMI instead of IPI
+options AUDIT # Security event auditing
+options HWPMC_HOOKS # Necessary kernel hooks for hwpmc(4)
+
+# Debugging for use in -current
+options KDB # Enable kernel debugger support.
+options DDB # Support DDB.
+options GDB # Support remote GDB.
+
+options LOCK_PROFILING
+
+# Make an SMP-capable kernel by default
+options SMP # Symmetric MultiProcessor Kernel
+
+# CPU frequency control
+device cpufreq
+
+# Bus support.
+device acpi
+device pci
+
+# Floppy drives
+device fdc
+
+# ATA and ATAPI devices
+device ata
+device atadisk # ATA disk drives
+device ataraid # ATA RAID drives
+device atapicd # ATAPI CDROM drives
+device atapifd # ATAPI floppy drives
+device atapist # ATAPI tape drives
+options ATA_STATIC_ID # Static device numbering
+
+# SCSI Controllers
+device ahc # AHA2940 and onboard AIC7xxx devices
+options AHC_REG_PRETTY_PRINT # Print register bitfields in debug
+ # output. Adds ~128k to driver.
+device ahd # AHA39320/29320 and onboard AIC79xx devices
+options AHD_REG_PRETTY_PRINT # Print register bitfields in debug
+ # output. Adds ~215k to driver.
+device amd # AMD 53C974 (Tekram DC-390(T))
+device hptiop # Highpoint RocketRaid 3xxx series
+device isp # Qlogic family
+#device ispfw # Firmware for QLogic HBAs- normally a module
+device mpt # LSI-Logic MPT-Fusion
+#device ncr # NCR/Symbios Logic
+device sym # NCR/Symbios Logic (newer chipsets + those of `ncr')
+device trm # Tekram DC395U/UW/F DC315U adapters
+
+device adv # Advansys SCSI adapters
+device adw # Advansys wide SCSI adapters
+device aic # Adaptec 15[012]x SCSI adapters, AIC-6[23]60.
+device bt # Buslogic/Mylex MultiMaster SCSI adapters
+
+# SCSI peripherals
+device scbus # SCSI bus (required for SCSI)
+device ch # SCSI media changers
+device da # Direct Access (disks)
+device sa # Sequential Access (tape etc)
+device cd # CD
+device pass # Passthrough device (direct SCSI access)
+device ses # SCSI Environmental Services (and SAF-TE)
+
+# RAID controllers interfaced to the SCSI subsystem
+device amr # AMI MegaRAID
+device arcmsr # Areca SATA II RAID
+device ciss # Compaq Smart RAID 5*
+device dpt # DPT Smartcache III, IV - See NOTES for options
+device hptmv # Highpoint RocketRAID 182x
+device hptrr # Highpoint RocketRAID 17xx, 22xx, 23xx, 25xx
+device iir # Intel Integrated RAID
+device ips # IBM (Adaptec) ServeRAID
+device mly # Mylex AcceleRAID/eXtremeRAID
+device twa # 3ware 9000 series PATA/SATA RAID
+
+# RAID controllers
+device aac # Adaptec FSA RAID
+device aacp # SCSI passthrough for aac (requires CAM)
+device ida # Compaq Smart RAID
+device mfi # LSI MegaRAID SAS
+device mlx # Mylex DAC960 family
+#XXX pointer/int warnings
+#device pst # Promise Supertrak SX6000
+device twe # 3ware ATA RAID
+
+# atkbdc0 controls both the keyboard and the PS/2 mouse
+device atkbdc # AT keyboard controller
+device atkbd # AT keyboard
+device psm # PS/2 mouse
+
+device kbdmux # keyboard multiplexer
+
+device vga # VGA video card driver
+
+device splash # Splash screen and screen saver support
+
+# syscons is the default console driver, resembling an SCO console
+device sc
+
+device agp # support several AGP chipsets
+
+# PCCARD (PCMCIA) support
+# PCMCIA and cardbus bridge support
+device cbb # cardbus (yenta) bridge
+device pccard # PC Card (16-bit) bus
+device cardbus # CardBus (32-bit) bus
+
+# Serial (COM) ports
+device uart # Generic UART driver
+
+# Parallel port
+device ppc
+device ppbus # Parallel port bus (required)
+device lpt # Printer
+device plip # TCP/IP over parallel
+device ppi # Parallel port interface device
+#device vpo # Requires scbus and da
+
+# If you've got a "dumb" serial or parallel PCI card that is
+# supported by the puc(4) glue driver, uncomment the following
+# line to enable it (connects to sio, uart and/or ppc drivers):
+#device puc
+
+# PCI Ethernet NICs.
+device de # DEC/Intel DC21x4x (``Tulip'')
+device em # Intel PRO/1000 Gigabit Ethernet Family
+device igb # Intel PRO/1000 PCIE Server Gigabit Family
+device le # AMD Am7900 LANCE and Am79C9xx PCnet
+device ti # Alteon Networks Tigon I/II gigabit Ethernet
+
+# PCI Ethernet NICs that use the common MII bus controller code.
+# NOTE: Be sure to keep the 'device miibus' line in order to use these NICs!
+device miibus # MII bus support
+device ae # Attansic/Atheros L2 FastEthernet
+device age # Attansic/Atheros L1 Gigabit Ethernet
+device bce # Broadcom BCM5706/BCM5708 Gigabit Ethernet
+device bfe # Broadcom BCM440x 10/100 Ethernet
+device bge # Broadcom BCM570xx Gigabit Ethernet
+device dc # DEC/Intel 21143 and various workalikes
+device et # Agere ET1310 10/100/Gigabit Ethernet
+device fxp # Intel EtherExpress PRO/100B (82557, 82558)
+device jme # JMicron JMC250 Gigabit/JMC260 Fast Ethernet
+device lge # Level 1 LXT1001 gigabit Ethernet
+device msk # Marvell/SysKonnect Yukon II Gigabit Ethernet
+device nfe # nVidia nForce MCP on-board Ethernet
+device nge # NatSemi DP83820 gigabit Ethernet
+#device nve # nVidia nForce MCP on-board Ethernet Networking
+device pcn # AMD Am79C97x PCI 10/100 (precedence over 'le')
+device re # RealTek 8139C+/8169/8169S/8110S
+device rl # RealTek 8129/8139
+device sf # Adaptec AIC-6915 (``Starfire'')
+device sis # Silicon Integrated Systems SiS 900/SiS 7016
+device sk # SysKonnect SK-984x & SK-982x gigabit Ethernet
+device ste # Sundance ST201 (D-Link DFE-550TX)
+device tl # Texas Instruments ThunderLAN
+device tx # SMC EtherPower II (83c170 ``EPIC'')
+device vge # VIA VT612x gigabit Ethernet
+device vr # VIA Rhine, Rhine II
+device wb # Winbond W89C840F
+device xl # 3Com 3c90x (``Boomerang'', ``Cyclone'')
+
+# Pseudo devices.
+device loop # Network loopback
+device random # Entropy device
+device ether # Ethernet support
+device tun # Packet tunnel.
+device pty # BSD-style compatibility pseudo ttys
+device md # Memory "disks"
+device gif # IPv6 and IPv4 tunneling
+device faith # IPv6-to-IPv4 relaying (translation)
+device firmware # firmware assist module
+
+# The `bpf' device enables the Berkeley Packet Filter.
+# Be aware of the administrative consequences of enabling this!
+# Note that 'bpf' is required for DHCP.
+device bpf # Berkeley packet filter
+
+# USB support
+device uhci # UHCI PCI->USB interface
+device ohci # OHCI PCI->USB interface
+device ehci # EHCI PCI->USB interface (USB 2.0)
+device usb # USB Bus (required)
+#device udbp # USB Double Bulk Pipe devices
+device ugen # Generic
+device uhid # "Human Interface Devices"
+device ukbd # Keyboard
+device ulpt # Printer
+device umass # Disks/Mass storage - Requires scbus and da
+device ums # Mouse
+device urio # Diamond Rio 500 MP3 player
+device uscanner # Scanners
+# USB Serial devices
+device ucom # Generic com ttys
+device uark # Technologies ARK3116 based serial adapters
+device ubsa # Belkin F5U103 and compatible serial adapters
+device uftdi # For FTDI usb serial adapters
+device uipaq # Some WinCE based devices
+device uplcom # Prolific PL-2303 serial adapters
+device uslcom # SI Labs CP2101/CP2102 serial adapters
+device uvisor # Visor and Palm devices
+device uvscom # USB serial support for DDI pocket's PHS
+# USB Ethernet, requires miibus
+device aue # ADMtek USB Ethernet
+device axe # ASIX Electronics USB Ethernet
+device cdce # Generic USB over Ethernet
+device cue # CATC USB Ethernet
+device kue # Kawasaki LSI USB Ethernet
+device rue # RealTek RTL8150 USB Ethernet
+device udav # Davicom DM9601E USB
+
+# FireWire support
+device firewire # FireWire bus code
+device sbp # SCSI over FireWire (Requires scbus and da)
+device fwe # Ethernet over FireWire (non-standard!)
+device fwip # IP over FireWire (RFC 2734,3146)
+device dcons # Dumb console driver
+device dcons_crom # Configuration ROM for dcons

Modified: user/kmacy/HEAD_fast_multi_xmit/sys/conf/files
==============================================================================
--- user/kmacy/HEAD_fast_multi_xmit/sys/conf/files Fri Nov 7 22:36:19 2008 (r184757)
+++ user/kmacy/HEAD_fast_multi_xmit/sys/conf/files Fri Nov 7 23:50:57 2008 (r184758)
@@ -2017,6 +2017,7 @@ net/if_stf.c optional stf
 net/if_tun.c optional tun
 net/if_tap.c optional tap
 net/if_vlan.c optional vlan
+net/flowtable.c optional inet
 net/mppcc.c optional netgraph_mppc_compression
 net/mppcd.c optional netgraph_mppc_compression
 net/netisr.c standard

Modified: user/kmacy/HEAD_fast_multi_xmit/sys/dev/mxge/if_mxge.c
==============================================================================
--- user/kmacy/HEAD_fast_multi_xmit/sys/dev/mxge/if_mxge.c Fri Nov 7 22:36:19 2008 (r184757)
+++ user/kmacy/HEAD_fast_multi_xmit/sys/dev/mxge/if_mxge.c Fri Nov 7 23:50:57 2008 (r184758)
@@ -1206,7 +1206,9 @@ mxge_reset(mxge_softc_t *sc, int interru
 	 * to setting up the interrupt queue DMA
 	 */
 	cmd.data0 = sc->num_slices;
-	cmd.data1 = MXGEFW_SLICE_INTR_MODE_ONE_PER_SLICE;
+	cmd.data1 = MXGEFW_SLICE_INTR_MODE_ONE_PER_SLICE |
+		MXGEFW_SLICE_ENABLE_MULTIPLE_TX_QUEUES;
+
 	status = mxge_send_cmd(sc, MXGEFW_CMD_ENABLE_RSS_QUEUES,
 			       &cmd);
 	if (status != 0) {
@@ -1266,6 +1268,9 @@ mxge_reset(mxge_softc_t *sc, int interru
 		ss->tx.req = 0;
 		ss->tx.done = 0;
 		ss->tx.pkt_done = 0;
+		ss->tx.queue_active = 0;
+		ss->tx.activate = 0;
+		ss->tx.deactivate = 0;
 		ss->tx.wake = 0;
 		ss->tx.defrag = 0;
 		ss->tx.stall = 0;
@@ -1611,10 +1616,6 @@ mxge_add_sysctls(mxge_softc_t *sc)
 			       0, "number of frames appended to lro merge"
 			       "queues");

-		/* only transmit from slice 0 for now */
-		if (slice > 0)
-			continue;
-
 		SYSCTL_ADD_INT(ctx, children, OID_AUTO,
 			       "tx_done",
 			       CTLFLAG_RD, &ss->tx.done,
@@ -1635,6 +1636,18 @@ mxge_add_sysctls(mxge_softc_t *sc)
 			       "tx_defrag",
 			       CTLFLAG_RD, &ss->tx.defrag,
 			       0, "tx_defrag");
+		SYSCTL_ADD_INT(ctx, children, OID_AUTO,
+			       "tx_queue_active",
+			       CTLFLAG_RD, &ss->tx.queue_active,
+			       0, "tx_queue_active");
+		SYSCTL_ADD_INT(ctx, children, OID_AUTO,
+			       "tx_activate",
+			       CTLFLAG_RD, &ss->tx.activate,
+			       0, "tx_activate");
+		SYSCTL_ADD_INT(ctx, children, OID_AUTO,
+			       "tx_deactivate",
+			       CTLFLAG_RD, &ss->tx.deactivate,
+			       0, "tx_deactivate");
 	}
 }

@@ -1857,12 +1870,19 @@ mxge_encap_tso(struct mxge_slice_state *

 	tx->info[((cnt - 1) + tx->req) & tx->mask].flag = 1;
 	mxge_submit_req(tx, tx->req_list, cnt);
+	if ((ss->sc->num_slices > 1) && tx->queue_active == 0) {
+		/* tell the NIC to start polling this slice */
+		*tx->send_go = 1;
+		tx->queue_active = 1;
+		tx->activate++;
+		mb();
+	}
 	return;

 drop:
 	bus_dmamap_unload(tx->dmat, tx->info[tx->req & tx->mask].map);
 	m_freem(m);
-	ss->sc->ifp->if_oerrors++;
+	ss->oerrors++;
 	if (!once) {
 		printf("tx->max_desc exceeded via TSO!\n");
 		printf("mss = %d, %ld, %d!\n", mss,
@@ -2059,11 +2079,18 @@ mxge_encap(struct mxge_slice_state *ss,
 #endif
 	tx->info[((cnt - 1) + tx->req) & tx->mask].flag = 1;
 	mxge_submit_req(tx, tx->req_list, cnt);
+	if ((ss->sc->num_slices > 1) && tx->queue_active == 0) {
+		/* tell the NIC to start polling this slice */
+		*tx->send_go = 1;
+		tx->queue_active = 1;
+		tx->activate++;
+		mb();
+	}
 	return;

 drop:
 	m_freem(m);
-	ifp->if_oerrors++;
+	ss->oerrors++;
 	return;
 }

@@ -2076,13 +2103,15 @@ mxge_start_locked(struct mxge_slice_stat
 	mxge_softc_t *sc;
 	struct mbuf *m;
 	struct ifnet *ifp;
+	struct ifaltq *ifq;
 	mxge_tx_ring_t *tx;

 	sc = ss->sc;
 	ifp = sc->ifp;
 	tx = &ss->tx;
+	ifq = &tx->ifq;
 	while ((tx->mask - (tx->req - tx->done)) > tx->max_desc) {
-		IFQ_DRV_DEQUEUE(&ifp->if_snd, m);
+		IFQ_DRV_DEQUEUE(ifq, m);
 		if (m == NULL) {
 			return;
 		}
@@ -2093,25 +2122,50 @@ mxge_start_locked(struct mxge_slice_stat
 		mxge_encap(ss, m);
 	}
 	/* ran out of transmit slots */
-	if ((sc->ifp->if_drv_flags & IFF_DRV_OACTIVE) == 0) {
-		sc->ifp->if_drv_flags |= IFF_DRV_OACTIVE;
+	if ((ss->if_drv_flags & IFF_DRV_OACTIVE) == 0) {
+		ss->if_drv_flags |= IFF_DRV_OACTIVE;
 		tx->stall++;
 	}
 }

 static void
-mxge_start(struct ifnet *ifp)
+mxge_start(struct mxge_slice_state *ss)
 {
-	mxge_softc_t *sc = ifp->if_softc;
-	struct mxge_slice_state *ss;
-
-	/* only use the first slice for now */
-	ss = &sc->ss[0];
 	mtx_lock(&ss->tx.mtx);
 	mxge_start_locked(ss);
 	mtx_unlock(&ss->tx.mtx);
 }

+static int
+mxge_transmit(struct ifnet *ifp, struct mbuf *m)
+{
+	struct ifaltq *ifq;
+	mxge_softc_t *sc = ifp->if_softc;
+	struct mxge_slice_state *ss;
+	int slice, error, len;
+	short mflags;
+
+	/*
+	 * XXX Andrew - this will only DTRT if num_slices is
+	 * a power of 2
+	 */
+	slice = m->m_pkthdr.flowid & (sc->num_slices - 1);
+/*	printf("%d & %d = %d\n", m->m_pkthdr.rss_hash, (sc->num_slices - 1), slice);*/
+	ss = &sc->ss[slice];
+	ifq = &ss->tx.ifq;
+	len = (m)->m_pkthdr.len;
+	mflags = (m)->m_flags;
+	IFQ_ENQUEUE(ifq, m, error);
+	if (error == 0) {
+		ss->obytes += len;
+		if (mflags & M_MCAST)
+			ss->omcasts++;
+		if ((ss->if_drv_flags & IFF_DRV_OACTIVE) == 0)
+			mxge_start(ss);
+	}
+	return (error);
+}
+
 /*
  * copy an array of mcp_kreq_ether_recv_t's to the mcp. Copy
  * at most 32 bytes at a time, so as to avoid involving the software
@@ -2349,6 +2403,7 @@ mxge_rx_done_big(struct mxge_slice_state
 	m->m_data += MXGEFW_PAD;

 	m->m_pkthdr.rcvif = ifp;
+	m->m_pkthdr.flowid = ss - sc->ss;
 	m->m_len = m->m_pkthdr.len = len;
 	ss->ipackets++;
 	eh = mtod(m, struct ether_header *);
@@ -2410,6 +2465,7 @@ mxge_rx_done_small(struct mxge_slice_sta

 	m->m_pkthdr.rcvif = ifp;
 	m->m_len = m->m_pkthdr.len = len;
+	m->m_pkthdr.flowid = ss - sc->ss;
 	ss->ipackets++;
 	eh = mtod(m, struct ether_header *);
 	if (eh->ether_type == htons(ETHERTYPE_VLAN)) {
@@ -2480,7 +2536,7 @@ mxge_tx_done(struct mxge_slice_state *ss
 		/* mbuf and DMA map only attached to the first
 		   segment per-mbuf */
 		if (m != NULL) {
-			ifp->if_opackets++;
+			ss->opackets++;
 			tx->info[idx].m = NULL;
 			map = tx->info[idx].map;
 			bus_dmamap_unload(tx->dmat, map);
@@ -2495,14 +2551,26 @@ mxge_tx_done(struct mxge_slice_state *ss
 	/* If we have space, clear IFF_OACTIVE to tell the stack that
 	   its OK to send packets */

-	if (ifp->if_drv_flags & IFF_DRV_OACTIVE &&
+	if (ss->if_drv_flags & IFF_DRV_OACTIVE &&
 	    tx->req - tx->done < (tx->mask + 1)/4) {
 		mtx_lock(&ss->tx.mtx);
-		ifp->if_drv_flags &= ~IFF_DRV_OACTIVE;
+		ss->if_drv_flags &= ~IFF_DRV_OACTIVE;
 		ss->tx.wake++;
 		mxge_start_locked(ss);
 		mtx_unlock(&ss->tx.mtx);
 	}
+	if ((ss->sc->num_slices > 1) && (tx->req == tx->done) &&
+	    mtx_trylock(&ss->tx.mtx)) {
+		/* let the NIC stop polling this queue, since there
+		 * are no more transmits pending */
+		if (tx->req == tx->done) {
+			*tx->send_stop = 1;
+			tx->queue_active = 0;
+			tx->deactivate++;
+			mb();
+		}
+		mtx_unlock(&ss->tx.mtx);
+	}
 }

 static struct mxge_media_type mxge_media_types[] =
@@ -2653,14 +2721,6 @@ mxge_intr(void *arg)
 	uint8_t valid;


-	/* an interrupt on a non-zero slice is implicitly valid
-	   since MSI-X irqs are not shared */
-	if (ss != sc->ss) {
-		mxge_clean_rx_done(ss);
-		*ss->irq_claim = be32toh(3);
-		return;
-	}
-
 	/* make sure the DMA has finished */
 	if (!stats->valid) {
 		return;
@@ -2683,7 +2743,8 @@ mxge_intr(void *arg)
 		send_done_count = be32toh(stats->send_done_count);
 		while ((send_done_count != tx->pkt_done) ||
 		       (rx_done->entry[rx_done->idx].length != 0)) {
-			mxge_tx_done(ss, (int)send_done_count);
+			if (send_done_count != tx->pkt_done)
+				mxge_tx_done(ss, (int)send_done_count);
 			mxge_clean_rx_done(ss);
 			send_done_count = be32toh(stats->send_done_count);
 		}
@@ -2691,7 +2752,8 @@ mxge_intr(void *arg)
 		mb();
 	} while (*((volatile uint8_t *) &stats->valid));

-	if (__predict_false(stats->stats_updated)) {
+	/* fw stats meaningful only on the first slice */
+	if (__predict_false((ss == sc->ss) && stats->stats_updated)) {
 		if (sc->link_state != stats->link_up) {
 			sc->link_state = stats->link_up;
 			if (sc->link_state) {
@@ -2981,10 +3043,6 @@ mxge_alloc_slice_rings(struct mxge_slice

 	/* now allocate TX resouces */

-	/* only use a single TX ring for now */
-	if (ss != ss->sc->ss)
-		return 0;
-
 	ss->tx.mask = tx_ring_entries - 1;
 	ss->tx.max_desc = MIN(MXGE_MAX_SEND_DESC, tx_ring_entries / 4);

@@ -3043,8 +3101,11 @@ mxge_alloc_slice_rings(struct mxge_slice
 			return err;;
 		}
 	}
+	IFQ_SET_MAXLEN(&ss->tx.ifq, tx_ring_entries - 1);
+	ss->tx.ifq.ifq_drv_maxlen = ss->tx.ifq.ifq_maxlen;
+	IFQ_SET_READY(&ss->tx.ifq);
+
 	return 0;
-
 }

 static int
@@ -3149,13 +3210,16 @@ mxge_slice_open(struct mxge_slice_state
 	/* get the lanai pointers to the send and receive rings */

 	err = 0;
-	/* We currently only send from the first slice */
-	if (slice == 0) {
-		cmd.data0 = slice;
-		err = mxge_send_cmd(sc, MXGEFW_CMD_GET_SEND_OFFSET, &cmd);
-		ss->tx.lanai =
-			(volatile mcp_kreq_ether_send_t *)(sc->sram + cmd.data0);
-	}
+
+	cmd.data0 = slice;
+	err = mxge_send_cmd(sc, MXGEFW_CMD_GET_SEND_OFFSET, &cmd);
+	ss->tx.lanai =
+		(volatile mcp_kreq_ether_send_t *)(sc->sram + cmd.data0);
+	ss->tx.send_go = (volatile uint32_t *)
+		(sc->sram + MXGEFW_ETH_SEND_GO + 64 * slice);
+	ss->tx.send_stop = (volatile uint32_t *)
+		(sc->sram + MXGEFW_ETH_SEND_STOP + 64 * slice);
+
 	cmd.data0 = slice;
 	err |= mxge_send_cmd(sc,
 			     MXGEFW_CMD_GET_SMALL_RX_OFFSET, &cmd);
@@ -3276,10 +3340,16 @@ mxge_open(mxge_softc_t *sc)
 	}

 	/* Now give him the pointer to the stats block */
-	cmd.data0 = MXGE_LOWPART_TO_U32(sc->ss->fw_stats_dma.bus_addr);
-	cmd.data1 = MXGE_HIGHPART_TO_U32(sc->ss->fw_stats_dma.bus_addr);
-	cmd.data2 = sizeof(struct mcp_irq_data);
-	err = mxge_send_cmd(sc, MXGEFW_CMD_SET_STATS_DMA_V2, &cmd);
+	for (slice = 0; slice < sc->num_slices; slice++) {
+		struct mxge_slice_state *ss = &sc->ss[slice];
+		cmd.data0 =
+			MXGE_LOWPART_TO_U32(ss->fw_stats_dma.bus_addr);
+		cmd.data1 =
+			MXGE_HIGHPART_TO_U32(ss->fw_stats_dma.bus_addr);
+		cmd.data2 = sizeof(struct mcp_irq_data);
+		cmd.data2 |= (slice << 16);
+		err |= mxge_send_cmd(sc, MXGEFW_CMD_SET_STATS_DMA_V2, &cmd);
+	}

 	if (err != 0) {
 		bus = sc->ss->fw_stats_dma.bus_addr;
@@ -3400,9 +3470,10 @@ mxge_read_reboot(mxge_softc_t *sc)
 }

 static int
-mxge_watchdog_reset(mxge_softc_t *sc)
+mxge_watchdog_reset(mxge_softc_t *sc, int slice)
 {
 	struct pci_devinfo *dinfo;
+	mxge_tx_ring_t *tx;
 	int err;
 	uint32_t reboot;
 	uint16_t cmd;
@@ -3449,11 +3520,14 @@ mxge_watchdog_reset(mxge_softc_t *sc)
 			err = mxge_open(sc);
 		}
 	} else {
-		device_printf(sc->dev, "NIC did not reboot, ring state:\n");
-		device_printf(sc->dev, "tx.req=%d tx.done=%d\n",
-			      sc->ss->tx.req, sc->ss->tx.done);
+		tx = &sc->ss[slice].tx;
+		device_printf(sc->dev, "NIC did not reboot, slice %d ring state:\n", slice);
+		device_printf(sc->dev, "tx.req=%d tx.done=%d, tx.queue_active=%d\n",
+			      tx->req, tx->done, tx->queue_active);
+		device_printf(sc->dev, "tx.activate=%d tx.deactivate=%d\n",
+			      tx->activate, tx->deactivate);
 		device_printf(sc->dev, "pkt_done=%d fw=%d\n",
-			      sc->ss->tx.pkt_done,
+			      tx->pkt_done,
 			      be32toh(sc->ss->fw_stats->send_done_count));
 		device_printf(sc->dev, "not resetting\n");
 	}
@@ -3463,26 +3537,29 @@ mxge_watchdog_reset(mxge_softc_t *sc)
 static int
 mxge_watchdog(mxge_softc_t *sc)
 {
-	mxge_tx_ring_t *tx = &sc->ss->tx;
+	mxge_tx_ring_t *tx;
 	uint32_t rx_pause = be32toh(sc->ss->fw_stats->dropped_pause);
-	int err = 0;
+	int i, err = 0;

 	/* see if we have outstanding transmits, which
 	   have been pending for more than mxge_ticks */
-	if (tx->req != tx->done &&
-	    tx->watchdog_req != tx->watchdog_done &&
-	    tx->done == tx->watchdog_done) {
-		/* check for pause blocking before resetting */
-		if (tx->watchdog_rx_pause == rx_pause)
-			err = mxge_watchdog_reset(sc);
-		else
-			device_printf(sc->dev, "Flow control blocking "
-				      "xmits, check link partner\n");
-	}
+	for (i = 0; (i < sc->num_slices) && (err == 0); i++) {
+		tx = &sc->ss[i].tx;
+		if (tx->req != tx->done &&
+		    tx->watchdog_req != tx->watchdog_done &&
+		    tx->done == tx->watchdog_done) {
+			/* check for pause blocking before resetting */
+			if (tx->watchdog_rx_pause == rx_pause)
+				err = mxge_watchdog_reset(sc, i);
+			else
+				device_printf(sc->dev, "Flow control blocking "
+					      "xmits, check link partner\n");
+		}

-	tx->watchdog_req = tx->req;
-	tx->watchdog_done = tx->done;
-	tx->watchdog_rx_pause = rx_pause;
+		tx->watchdog_req = tx->req;
+		tx->watchdog_done = tx->done;
+		tx->watchdog_rx_pause = rx_pause;
+	}

 	if (sc->need_media_probe)
 		mxge_media_probe(sc);
@@ -3494,15 +3571,27 @@ mxge_update_stats(mxge_softc_t *sc)
 {
 	struct mxge_slice_state *ss;
 	u_long ipackets = 0;
+	u_long opackets = 0;
+	u_long obytes = 0;
+	u_long omcasts = 0;
+	u_long oerrors = 0;
 	int slice;

-	for(slice = 0; slice < sc->num_slices; slice++) {
+	for (slice = 0; slice < sc->num_slices; slice++) {
 		ss = &sc->ss[slice];
 		ipackets += ss->ipackets;
+		opackets += ss->opackets;
+		obytes += ss->obytes;
+		omcasts += ss->omcasts;
+		oerrors += ss->oerrors;
 	}
 	sc->ifp->if_ipackets = ipackets;
-
+	sc->ifp->if_opackets = opackets;
+	sc->ifp->if_obytes = obytes;
+	sc->ifp->if_omcasts = omcasts;
+	sc->ifp->if_oerrors = oerrors;
 }
+
 static void
 mxge_tick(void *arg)
 {
@@ -3725,6 +3814,7 @@ mxge_free_slices(mxge_softc_t *sc)
 			mxge_dma_free(&ss->fw_stats_dma);
 			ss->fw_stats = NULL;
 			mtx_destroy(&ss->tx.mtx);
+			mtx_destroy(&ss->tx.ifq.ifq_mtx);
 		}
 		if (ss->rx_done.entry != NULL) {
 			mxge_dma_free(&ss->rx_done.dma);
@@ -3770,12 +3860,8 @@ mxge_alloc_slices(mxge_softc_t *sc)
 		bzero(ss->rx_done.entry, bytes);

 		/*
-		 * allocate the per-slice firmware stats; stats
-		 * (including tx) are used used only on the first
-		 * slice for now
+		 * allocate the per-slice firmware stats
 		 */
-		if (i > 0)
-			continue;

 		bytes = sizeof (*ss->fw_stats);
 		err = mxge_dma_alloc(sc, &ss->fw_stats_dma,
@@ -3786,6 +3872,9 @@ mxge_alloc_slices(mxge_softc_t *sc)
 		snprintf(ss->tx.mtx_name, sizeof(ss->tx.mtx_name),
 			 "%s:tx(%d)", device_get_nameunit(sc->dev), i);
 		mtx_init(&ss->tx.mtx, ss->tx.mtx_name, NULL, MTX_DEF);
+		snprintf(ss->tx.ifq_mtx_name, sizeof(ss->tx.mtx_name),
+			 "%s:ifp(%d)", device_get_nameunit(sc->dev), i);
+		mtx_init(&ss->tx.ifq.ifq_mtx, ss->tx.ifq_mtx_name, NULL, MTX_DEF);
 	}

 	return (0);
@@ -4247,7 +4336,7 @@ mxge_attach(device_t dev)
 	ifp->if_softc = sc;
 	ifp->if_flags = IFF_BROADCAST | IFF_SIMPLEX | IFF_MULTICAST;
 	ifp->if_ioctl = mxge_ioctl;
-	ifp->if_start = mxge_start;
+/*	ifp->if_start = mxge_start;*/
 	/* Initialise the ifmedia structure */
 	ifmedia_init(&sc->media, 0, mxge_media_change,
 		     mxge_media_status);
@@ -4257,6 +4346,7 @@ mxge_attach(device_t dev)
 	/* ether_ifattach sets mtu to 1500 */
 	if (ifp->if_capabilities & IFCAP_JUMBO_MTU)
 		ifp->if_mtu = 9000;
+	ifp->if_transmit = mxge_transmit;

 	mxge_add_sysctls(sc);
 	return 0;

Modified: user/kmacy/HEAD_fast_multi_xmit/sys/dev/mxge/if_mxge_var.h
==============================================================================
--- user/kmacy/HEAD_fast_multi_xmit/sys/dev/mxge/if_mxge_var.h Fri Nov 7 22:36:19 2008 (r184757)
+++ user/kmacy/HEAD_fast_multi_xmit/sys/dev/mxge/if_mxge_var.h Fri Nov 7 23:50:57 2008 (r184758)
@@ -126,6 +126,8 @@ typedef struct
 {
 	struct mtx mtx;
 	volatile mcp_kreq_ether_send_t *lanai;	/* lanai ptr for sendq */
+	volatile uint32_t *send_go;	/* doorbell for sendq */
+	volatile uint32_t *send_stop;	/* doorbell for sendq */
 	mcp_kreq_ether_send_t *req_list;	/* host shadow of sendq */
 	char *req_bytes;
 	bus_dma_segment_t *seg_list;
@@ -136,13 +138,18 @@ typedef struct
 	int done;	/* transmits completed */
 	int pkt_done;	/* packets completed */
 	int max_desc;	/* max descriptors per xmit */
+	int queue_active;	/* fw currently polling this queue*/
+	int activate;
+	int deactivate;
 	int stall;	/* #times hw queue exhausted */
 	int wake;	/* #times irq re-enabled xmit */
 	int watchdog_req;	/* cache of req */
 	int watchdog_done;	/* cache of done */
 	int watchdog_rx_pause;	/* cache of pause rq recvd */
 	int defrag;
+	struct ifaltq ifq;
 	char mtx_name[16];
+	char ifq_mtx_name[16];
 } mxge_tx_ring_t;

 struct lro_entry;
@@ -182,6 +189,11 @@ struct mxge_slice_state {
 	mcp_irq_data_t *fw_stats;
 	volatile uint32_t *irq_claim;
 	u_long ipackets;
+	u_long opackets;
+	u_long obytes;
+	u_long omcasts;
+	u_long oerrors;
+	int if_drv_flags;
 	struct lro_head lro_active;
 	struct lro_head lro_free;
 	int lro_queued;

Added: user/kmacy/HEAD_fast_multi_xmit/sys/i386/conf/PERFTEST
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ user/kmacy/HEAD_fast_multi_xmit/sys/i386/conf/PERFTEST Fri Nov 7 23:50:57 2008 (r184758)
@@ -0,0 +1,274 @@
+#
+# GENERIC -- Generic kernel configuration file for FreeBSD/i386
+#
+# For more information on this file, please read the handbook section on
+# Kernel Configuration Files:
+#
+# http://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig-config.html
+#
+# The handbook is also available locally in /usr/share/doc/handbook
+# if you've installed the doc distribution, otherwise always see the
+# FreeBSD World Wide Web server (http://www.FreeBSD.org/) for the
+# latest information.
+#
+# An exhaustive list of options and more detailed explanations of the
+# device lines is also present in the ../../conf/NOTES and NOTES files.
+# If you are in doubt as to the purpose or necessity of a line, check first
+# in NOTES.
+# +# $FreeBSD: user/kmacy/HEAD_fast_xmit/sys/i386/conf/GENERIC 183735 2008-10-09 21:25:01Z n_hibma $ + +cpu I686_CPU +ident GENERIC + +# To statically compile in device wiring instead of /boot/device.hints +#hints "GENERIC.hints" # Default places to look for devices. + +makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols + +options SCHED_ULE # ULE scheduler +options PREEMPTION # Enable kernel thread preemption +options INET # InterNETworking +options INET6 # IPv6 communications protocols +options SCTP # Stream Control Transmission Protocol +options FFS # Berkeley Fast Filesystem +options SOFTUPDATES # Enable FFS soft updates support +options UFS_ACL # Support for access control lists +options UFS_DIRHASH # Improve performance on big directories +options UFS_GJOURNAL # Enable gjournal-based UFS journaling +options MD_ROOT # MD is a potential root device +options NFSCLIENT # Network Filesystem Client +options NFSSERVER # Network Filesystem Server +options NFSLOCKD # Network Lock Manager +options NFS_ROOT # NFS usable as /, requires NFSCLIENT +options MSDOSFS # MSDOS Filesystem +options CD9660 # ISO 9660 Filesystem +options PROCFS # Process filesystem (requires PSEUDOFS) +options PSEUDOFS # Pseudo-filesystem framework +options GEOM_PART_GPT # GUID Partition Tables. +options GEOM_LABEL # Provides labelization +options COMPAT_43TTY # BSD 4.3 TTY compat [KEEP THIS!] 
+options 	COMPAT_FREEBSD4		# Compatible with FreeBSD4
+options 	COMPAT_FREEBSD5		# Compatible with FreeBSD5
+options 	COMPAT_FREEBSD6		# Compatible with FreeBSD6
+options 	COMPAT_FREEBSD7		# Compatible with FreeBSD7
+options 	SCSI_DELAY=5000		# Delay (in ms) before probing SCSI
+options 	KTRACE			# ktrace(1) support
+options 	STACK			# stack(9) support
+options 	SYSVSHM			# SYSV-style shared memory
+options 	SYSVMSG			# SYSV-style message queues
+options 	SYSVSEM			# SYSV-style semaphores
+options 	_KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
+options 	KBD_INSTALL_CDEV	# install a CDEV entry in /dev
+options 	STOP_NMI		# Stop CPUS using NMI instead of IPI
+options 	HWPMC_HOOKS		# Necessary kernel hooks for hwpmc(4)
+options 	AUDIT			# Security event auditing
+
+# Debugging for use in -current
+options 	KDB			# Enable kernel debugger support.
+options 	DDB			# Support DDB.
+options 	GDB			# Support remote GDB.
+#options 	INVARIANTS		# Enable calls of extra sanity checking
+#options 	INVARIANT_SUPPORT	# Extra sanity checks of internal structures, required by INVARIANTS
+#options 	WITNESS			# Enable checks to detect deadlocks and cycles
+#options 	WITNESS_SKIPSPIN	# Don't run witness on spinlocks for speed
+options 	RADIX_MPATH
+options 	LOCK_PROFILING
+
+
+# To make an SMP kernel, the next two lines are needed
+options 	SMP			# Symmetric MultiProcessor Kernel
+device		apic			# I/O APIC
+
+# CPU frequency control
+device		cpufreq
+
+# Bus support.
+device		acpi
+device		eisa
+device		pci
+
+# Floppy drives
+device		fdc
+
+# ATA and ATAPI devices
+device		ata
+device		atadisk		# ATA disk drives
+device		ataraid		# ATA RAID drives
+device		atapicd		# ATAPI CDROM drives
+device		atapifd		# ATAPI floppy drives
+device		atapist		# ATAPI tape drives
+options 	ATA_STATIC_ID	# Static device numbering
+
+# SCSI Controllers
+device		ahb		# EISA AHA1742 family
+device		ahc		# AHA2940 and onboard AIC7xxx devices
+options 	AHC_REG_PRETTY_PRINT	# Print register bitfields in debug
+					# output. Adds ~128k to driver.
+device		ahd		# AHA39320/29320 and onboard AIC79xx devices
+options 	AHD_REG_PRETTY_PRINT	# Print register bitfields in debug
+					# output. Adds ~215k to driver.
+device		amd		# AMD 53C974 (Tekram DC-390(T))
+device		hptiop		# Highpoint RocketRaid 3xxx series
+device		isp		# Qlogic family
+#device		ispfw		# Firmware for QLogic HBAs- normally a module
+device		mpt		# LSI-Logic MPT-Fusion
+#device		ncr		# NCR/Symbios Logic
+device		sym		# NCR/Symbios Logic (newer chipsets + those of `ncr')
+device		trm		# Tekram DC395U/UW/F DC315U adapters
+
+device		adv		# Advansys SCSI adapters
+device		adw		# Advansys wide SCSI adapters
+device		aha		# Adaptec 154x SCSI adapters
+device		aic		# Adaptec 15[012]x SCSI adapters, AIC-6[23]60.
+device		bt		# Buslogic/Mylex MultiMaster SCSI adapters
+
+device		ncv		# NCR 53C500
+device		nsp		# Workbit Ninja SCSI-3
+device		stg		# TMC 18C30/18C50
+
+# SCSI peripherals
+device		scbus		# SCSI bus (required for SCSI)
+device		ch		# SCSI media changers
+device		da		# Direct Access (disks)
+device		sa		# Sequential Access (tape etc)
+device		cd		# CD
+device		pass		# Passthrough device (direct SCSI access)
+device		ses		# SCSI Environmental Services (and SAF-TE)
+
+# RAID controllers interfaced to the SCSI subsystem
+device		amr		# AMI MegaRAID
+device		arcmsr		# Areca SATA II RAID
+device		asr		# DPT SmartRAID V, VI and Adaptec SCSI RAID
+device		ciss		# Compaq Smart RAID 5*
+device		dpt		# DPT Smartcache III, IV - See NOTES for options
+device		hptmv		# Highpoint RocketRAID 182x
+device		hptrr		# Highpoint RocketRAID 17xx, 22xx, 23xx, 25xx
+device		iir		# Intel Integrated RAID
+device		ips		# IBM (Adaptec) ServeRAID
+device		mly		# Mylex AcceleRAID/eXtremeRAID
+device		twa		# 3ware 9000 series PATA/SATA RAID
+
+# RAID controllers
+device		aac		# Adaptec FSA RAID
+device		aacp		# SCSI passthrough for aac (requires CAM)
+device		ida		# Compaq Smart RAID
+device		mfi		# LSI MegaRAID SAS
+device		mlx		# Mylex DAC960 family
+device		pst		# Promise Supertrak SX6000
+device		twe		# 3ware ATA RAID
+
+# atkbdc0 controls both the
keyboard and the PS/2 mouse
+device		atkbdc		# AT keyboard controller
+device		atkbd		# AT keyboard
+device		psm		# PS/2 mouse
+
+device		kbdmux		# keyboard multiplexer
+
+device		vga		# VGA video card driver
+
+device		splash		# Splash screen and screen saver support
+
+# syscons is the default console driver, resembling an SCO console
+device		sc
+
+device		agp		# support several AGP chipsets
+
+# Power management support (see NOTES for more options)
+#device		apm

*** DIFF OUTPUT TRUNCATED AT 1000 LINES ***