From owner-freebsd-questions@FreeBSD.ORG Sun Nov 16 23:58:41 2014
Message-ID: <54693A30.3040404@sentex.net>
Date: Sun, 16 Nov 2014 18:58:40 -0500
From: Mike Tancsa
Organization: Sentex Communications
To: Adrian Chadd, FF
Cc: "freebsd-questions@freebsd.org"
Subject: Re: em0 tx_dma_fail incrementing [SOLVED]

On 11/16/2014 12:28 PM, Adrian Chadd wrote:
> Hi!
>
> Good catch! Would you mind filing a bug so we remember and
> (hopefully!) fix it to be the default?
>
> https://bugs.freebsd.org/submit/

I wonder if this is the bug I was running into
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=193802

	---Mike

>
> Thanks!
>
>
> -adrian
>
>
> On 15 November 2014 08:31, FF wrote:
>> It looks like FreeBSD may be a victim of this bug:
>>
>> http://www.intel.com.au/content/dam/www/public/us/en/documents/specification-updates/82574-gbe-controller-spec-update.pdf
>>
>> 17. Tx Data Corruption When Using TCP Segmentation Offload
>>
>> Problem: When using TSO, a situation can occur where a PCIe MRd request
>> is repeated with the same address, resulting in data corruption. At the
>> end of the TCP packet, the Tx DMA hangs because the length doesn't match.
>> This can only occur when the following are true:
>>
>> • The first buffer of the packet is larger than
>>   [3 * (max_read_request - 4)].
>> • There is a 4 KB boundary within 64 bytes following the end of the
>>   header bytes in the buffer
>>
>> Implication: Possible data corruption since a TCP packet is transmitted
>> containing the wrong data but with the correct checksum.
>> Data transmission halts as the Tx DMA module enters a hang state.
>>
>> Workaround: The failure can be avoided by ensuring at least one of the
>> following:
>>
>> • The buffer containing the headers should not be larger than
>>   [3 * (max_read_request - 4)]. To meet this requirement even for the
>>   minimum value of 128 bytes for max_read_request, the buffer should not
>>   be larger than 372 bytes.
>> • The alignment of the buffer containing the headers should be such that
>>   there is no 4 KB boundary within 64 bytes following the end of the
>>   header bytes. Assuming standard Ethernet/IP/TCP headers of 54 bytes,
>>   this means that the buffer should not start 54-118 bytes before a 4 KB
>>   boundary. For example, 128-byte alignment for this buffer could be used
>>   to fulfill this condition.
>>
>> This problem has not been reported when using an Intel Linux* or Windows*
>> drivers.
>> Current analysis shows it is very unlikely for a situation to exist that
>> would cause the 82574 to be at risk for the errata when using the Intel
>> Linux or Windows drivers.
>>
>> Linux and other distros seem to have fixed it. This could be getting
>> exercised because FreeBSD recently changed the default buffer size above
>> 256 for this driver.
>>
>> Since I didn't want to reboot to try the lower buffer size, I turned off
>> TSO on all the machines that I'd checked that were actively incrementing
>> tx_dma_fail for em interfaces then re-enabled their membership into the
>> LACP.
>>
>> In brief testing, (few gigabits for a few minutes) tx_dma_fail has not
>> incremented and throughput has not been negatively impacted (before vs
>> after re-enable).
>>
>> This is so anyone else who is scratching their head about why em
>> performance is terrible can solve it.
>>
>> Best,
>>
>> FF
>>
>> On Thu, Nov 13, 2014 at 1:52 PM, FF wrote:
>>
>>> What knob do I need to turn to address this?
>>>
>>> This em0 is in an LACP bundle with an igb0 that isn't showing this problem.
>>>
>>> dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.3.8
>>> dev.em.0.%driver: em
>>> dev.em.0.%location: slot=25 function=0 handle=\_SB_.PCI0.GLAN
>>> dev.em.0.%pnpinfo: vendor=0x8086 device=0x153b subvendor=0x15d9 subdevice=0x153b class=0x020000
>>> dev.em.0.%parent: pci0
>>> dev.em.0.nvm: -1
>>> dev.em.0.debug: -1
>>> dev.em.0.fc: 3
>>> dev.em.0.rx_int_delay: 0
>>> dev.em.0.tx_int_delay: 66
>>> dev.em.0.rx_abs_int_delay: 66
>>> dev.em.0.tx_abs_int_delay: 66
>>> dev.em.0.itr: 488
>>> dev.em.0.rx_processing_limit: 100
>>> dev.em.0.eee_control: 1
>>> dev.em.0.link_irq: 0
>>> dev.em.0.mbuf_alloc_fail: 52
>>> dev.em.0.cluster_alloc_fail: 0
>>> dev.em.0.dropped: 0
>>> **
>>> dev.em.0.tx_dma_fail: 1834648
>>> dev.em.0.rx_overruns: 3109
>>> **
>>> dev.em.0.watchdog_timeouts: 0
>>> dev.em.0.device_control: 1209532992
>>> dev.em.0.rx_control: 67141634
>>> dev.em.0.fc_high_water: 23584
>>> dev.em.0.fc_low_water: 20552
>>> dev.em.0.queue0.txd_head: 577
>>> dev.em.0.queue0.txd_tail: 577
>>> dev.em.0.queue0.tx_irq: 0
>>> dev.em.0.queue0.no_desc_avail: 0
>>> dev.em.0.queue0.rxd_head: 967
>>> dev.em.0.queue0.rxd_tail: 966
>>> dev.em.0.queue0.rx_irq: 0
>>> dev.em.0.mac_stats.excess_coll: 0
>>> dev.em.0.mac_stats.single_coll: 0
>>> dev.em.0.mac_stats.multiple_coll: 0
>>> dev.em.0.mac_stats.late_coll: 0
>>> dev.em.0.mac_stats.collision_count: 0
>>> dev.em.0.mac_stats.symbol_errors: 0
>>> dev.em.0.mac_stats.sequence_errors: 0
>>> dev.em.0.mac_stats.defer_count: 0
>>> dev.em.0.mac_stats.missed_packets: 61094
>>> dev.em.0.mac_stats.recv_no_buff: 60008
>>> dev.em.0.mac_stats.recv_undersize: 0
>>> dev.em.0.mac_stats.recv_fragmented: 0
>>> dev.em.0.mac_stats.recv_oversize: 0
>>> dev.em.0.mac_stats.recv_jabber: 0
>>> dev.em.0.mac_stats.recv_errs: 0
>>> dev.em.0.mac_stats.crc_errs: 0
>>> dev.em.0.mac_stats.alignment_errs: 0
>>> dev.em.0.mac_stats.coll_ext_errs: 0
>>> dev.em.0.mac_stats.xon_recvd: 40226659
>>> dev.em.0.mac_stats.xon_txd: 2132
>>> dev.em.0.mac_stats.xoff_recvd: 40241216
>>> dev.em.0.mac_stats.xoff_txd: 2073563
>>> dev.em.0.mac_stats.total_pkts_recvd: 3219537541
>>> dev.em.0.mac_stats.good_pkts_recvd: 3139008594
>>> dev.em.0.mac_stats.bcast_pkts_recvd: 3953817
>>> dev.em.0.mac_stats.mcast_pkts_recvd: 607157
>>> dev.em.0.mac_stats.rx_frames_64: 0
>>> dev.em.0.mac_stats.rx_frames_65_127: 0
>>> dev.em.0.mac_stats.rx_frames_128_255: 0
>>> dev.em.0.mac_stats.rx_frames_256_511: 0
>>> dev.em.0.mac_stats.rx_frames_512_1023: 0
>>> dev.em.0.mac_stats.rx_frames_1024_1522: 0
>>> dev.em.0.mac_stats.good_octets_recvd: 3527296369841
>>> dev.em.0.mac_stats.good_octets_txd: 14348531993101
>>> dev.em.0.mac_stats.total_pkts_txd: 10735190291
>>> dev.em.0.mac_stats.good_pkts_txd: 10733114595
>>> dev.em.0.mac_stats.bcast_pkts_txd: 14
>>> dev.em.0.mac_stats.mcast_pkts_txd: 54334
>>> dev.em.0.mac_stats.tx_frames_64: 0
>>> dev.em.0.mac_stats.tx_frames_65_127: 0
>>> dev.em.0.mac_stats.tx_frames_128_255: 0
>>> dev.em.0.mac_stats.tx_frames_256_511: 0
>>> dev.em.0.mac_stats.tx_frames_512_1023: 0
>>> dev.em.0.mac_stats.tx_frames_1024_1522: 0
>>> dev.em.0.mac_stats.tso_txd: 902605586
>>> dev.em.0.mac_stats.tso_ctx_fail: 0
>>> dev.em.0.interrupts.asserts: 1392541431
>>> dev.em.0.interrupts.rx_pkt_timer: 0
>>> dev.em.0.interrupts.rx_abs_timer: 0
>>> dev.em.0.interrupts.tx_pkt_timer: 0
>>> dev.em.0.interrupts.tx_abs_timer: 0
>>> dev.em.0.interrupts.tx_queue_empty: 0
>>> dev.em.0.interrupts.tx_queue_min_thresh: 0
>>> dev.em.0.interrupts.rx_desc_min_thresh: 0
>>> dev.em.0.interrupts.rx_overrun: 0
>>> dev.em.0.wake: 0
>>>
>>> dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection version - 2.3.10
>>> dev.igb.0.%driver: igb
>>> dev.igb.0.%location: slot=0 function=0 handle=\_SB_.PCI0.RP04.PXSX
>>> dev.igb.0.%pnpinfo: vendor=0x8086 device=0x1533 subvendor=0x15d9 subdevice=0x1533 class=0x020000
>>> dev.igb.0.%parent: pci5
>>> dev.igb.0.nvm: -1
>>> dev.igb.0.enable_aim: 1
>>> dev.igb.0.fc: 3
>>> dev.igb.0.rx_processing_limit: 100
>>> dev.igb.0.dmac: 0
>>> dev.igb.0.eee_disabled: 0
>>> dev.igb.0.link_irq: 33
>>> dev.igb.0.dropped: 0
>>> dev.igb.0.tx_dma_fail: 0
>>> dev.igb.0.rx_overruns: 0
>>> dev.igb.0.watchdog_timeouts: 0
>>> dev.igb.0.device_control: 1209795137
>>> dev.igb.0.rx_control: 71335938
>>> dev.igb.0.interrupt_mask: 4
>>> dev.igb.0.extended_int_mask: 2147483679
>>> dev.igb.0.tx_buf_alloc: 0
>>> dev.igb.0.rx_buf_alloc: 0
>>> dev.igb.0.fc_high_water: 31328
>>> dev.igb.0.fc_low_water: 31312
>>> dev.igb.0.queue0.no_desc_avail: 0
>>> dev.igb.0.queue0.tx_packets: 62464141
>>> dev.igb.0.queue0.rx_packets: 73012939
>>> dev.igb.0.queue0.rx_bytes: 22529663814
>>> dev.igb.0.queue0.lro_queued: 0
>>> dev.igb.0.queue0.lro_flushed: 0
>>> dev.igb.0.queue1.no_desc_avail: 0
>>> dev.igb.0.queue1.tx_packets: 404298046
>>> dev.igb.0.queue1.rx_packets: 307675818
>>> dev.igb.0.queue1.rx_bytes: 185919902229
>>> dev.igb.0.queue1.lro_queued: 0
>>> dev.igb.0.queue1.lro_flushed: 0
>>> dev.igb.0.queue2.no_desc_avail: 0
>>> dev.igb.0.queue2.tx_packets: 3441053015
>>> dev.igb.0.queue2.rx_packets: 5511826751
>>> dev.igb.0.queue2.rx_bytes: 3054219311510
>>> dev.igb.0.queue2.lro_queued: 0
>>> dev.igb.0.queue2.lro_flushed: 0
>>> dev.igb.0.queue3.no_desc_avail: 0
>>> dev.igb.0.queue3.tx_packets: 1047838830
>>> dev.igb.0.queue3.rx_packets: 1987495318
>>> dev.igb.0.queue3.rx_bytes: 2696179247028
>>> dev.igb.0.queue3.lro_queued: 0
>>> dev.igb.0.queue3.lro_flushed: 0
>>> dev.igb.0.mac_stats.excess_coll: 0
>>> dev.igb.0.mac_stats.single_coll: 0
>>> dev.igb.0.mac_stats.multiple_coll: 0
>>> dev.igb.0.mac_stats.late_coll: 0
>>> dev.igb.0.mac_stats.collision_count: 0
>>> dev.igb.0.mac_stats.symbol_errors: 0
>>> dev.igb.0.mac_stats.sequence_errors: 0
>>> dev.igb.0.mac_stats.defer_count: 283811
>>> dev.igb.0.mac_stats.missed_packets: 9449
>>> dev.igb.0.mac_stats.recv_no_buff: 340
>>> dev.igb.0.mac_stats.recv_undersize: 0
>>> dev.igb.0.mac_stats.recv_fragmented: 0
>>> dev.igb.0.mac_stats.recv_oversize: 0
>>> dev.igb.0.mac_stats.recv_jabber: 0
>>> dev.igb.0.mac_stats.recv_errs: 0
>>> dev.igb.0.mac_stats.crc_errs: 0
>>> dev.igb.0.mac_stats.alignment_errs: 0
>>> dev.igb.0.mac_stats.coll_ext_errs: 0
>>> dev.igb.0.mac_stats.xon_recvd: 46255557
>>> dev.igb.0.mac_stats.xon_txd: 261
>>> dev.igb.0.mac_stats.xoff_recvd: 46255994
>>> dev.igb.0.mac_stats.xoff_txd: 7027
>>> dev.igb.0.mac_stats.total_pkts_recvd: 7975033582
>>> dev.igb.0.mac_stats.good_pkts_recvd: 7880001465
>>> dev.igb.0.mac_stats.bcast_pkts_recvd: 5783868
>>> dev.igb.0.mac_stats.mcast_pkts_recvd: 563315
>>> dev.igb.0.mac_stats.rx_frames_64: 28412906
>>> dev.igb.0.mac_stats.rx_frames_65_127: 3310187919
>>> dev.igb.0.mac_stats.rx_frames_128_255: 784920450
>>> dev.igb.0.mac_stats.rx_frames_256_511: 17225962
>>> dev.igb.0.mac_stats.rx_frames_512_1023: 73415350
>>> dev.igb.0.mac_stats.rx_frames_1024_1522: 3665838878
>>> dev.igb.0.mac_stats.good_octets_recvd: 5990356613544
>>> dev.igb.0.mac_stats.good_octets_txd: 46326753008181
>>> dev.igb.0.mac_stats.total_pkts_txd: 33016014138
>>> dev.igb.0.mac_stats.good_pkts_txd: 33016006850
>>> dev.igb.0.mac_stats.bcast_pkts_txd: 834
>>> dev.igb.0.mac_stats.mcast_pkts_txd: 54331
>>> dev.igb.0.mac_stats.tx_frames_64: 30741691
>>> dev.igb.0.mac_stats.tx_frames_65_127: 2174824217
>>> dev.igb.0.mac_stats.tx_frames_128_255: 139804927
>>> dev.igb.0.mac_stats.tx_frames_256_511: 59190261
>>> dev.igb.0.mac_stats.tx_frames_512_1023: 386886648
>>> dev.igb.0.mac_stats.tx_frames_1024_1522: 30224559106
>>> dev.igb.0.mac_stats.tso_txd: 2384636909
>>> dev.igb.0.mac_stats.tso_ctx_fail: 0
>>> dev.igb.0.interrupts.asserts: 4556119857
>>> dev.igb.0.interrupts.rx_pkt_timer: 7879778770
>>> dev.igb.0.interrupts.rx_abs_timer: 0
>>> dev.igb.0.interrupts.tx_pkt_timer: 0
>>> dev.igb.0.interrupts.tx_abs_timer: 0
>>> dev.igb.0.interrupts.tx_queue_empty: 33015268817
>>> dev.igb.0.interrupts.tx_queue_min_thresh: 7880001470
>>> dev.igb.0.interrupts.rx_desc_min_thresh: 0
>>> dev.igb.0.interrupts.rx_overrun: 0
>>> dev.igb.0.host.breaker_tx_pkt: 0
>>> dev.igb.0.host.host_tx_pkt_discard: 0
>>> dev.igb.0.host.rx_pkt: 222702
>>> dev.igb.0.host.breaker_rx_pkts: 0
>>> dev.igb.0.host.breaker_rx_pkt_drop: 0
>>> dev.igb.0.host.tx_good_pkt: 738033
>>> dev.igb.0.host.breaker_tx_pkt_drop: 0
>>> dev.igb.0.host.rx_good_bytes: 5990357073320
>>> dev.igb.0.host.tx_good_bytes: 46326753008181
>>> dev.igb.0.host.length_errors: 0
>>> dev.igb.0.host.serdes_violation_pkt: 0
>>> dev.igb.0.host.header_redir_missed: 0
>>> dev.igb.0.wake: 0
>>>
>>> hw.em.eee_setting: 1
>>> hw.em.rx_process_limit: 100
>>> hw.em.enable_msix: 1
>>> hw.em.sbp: 0
>>> hw.em.smart_pwr_down: 0
>>> hw.em.txd: 1024
>>> hw.em.rxd: 1024
>>> hw.em.rx_abs_int_delay: 66
>>> hw.em.tx_abs_int_delay: 66
>>> hw.em.rx_int_delay: 0
>>> hw.em.tx_int_delay: 66
>>>
>>> hw.igb.rx_process_limit: 100
>>> hw.igb.num_queues: 0
>>> hw.igb.header_split: 0
>>> hw.igb.buf_ring_size: 4096
>>> hw.igb.max_interrupt_rate: 8000
>>> hw.igb.enable_msix: 1
>>> hw.igb.enable_aim: 1
>>> hw.igb.txd: 1024
>>> hw.igb.rxd: 1024
>>>
>>> FreeBSD systemname.com 9.2-RELEASE-p10 FreeBSD 9.2-RELEASE-p10 #0
>>> r270148M: Mon Aug 18 23:14:36 EDT 2014 root@peta108:/usr/obj/usr/src/sys/CUSTOM10
>>> amd64
>>>
>>> em0: flags=8843 metric 0 mtu 1500
>>>         options=4019b
>>>         ether 00:25:90:f2:2d:24
>>>         inet6 fe80::225:90ff:fef2:2d24%em0 prefixlen 64 scopeid 0x2
>>>         nd6 options=29
>>>         media: Ethernet autoselect (1000baseT )
>>>         status: active
>>> igb0: flags=8843 metric 0 mtu 1500
>>>         options=401bb
>>>         ether 00:25:90:f2:2d:24
>>>         inet6 fe80::225:90ff:fef2:2d25%igb0 prefixlen 64 scopeid 0x4
>>>         nd6 options=29
>>>         media: Ethernet autoselect (1000baseT )
>>>         status: active
>>> lo0: flags=8049 metric 0 mtu 16384
>>>         options=600003
>>>         inet6 ::1 prefixlen 128
>>>         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x7
>>>         inet 127.0.0.1 netmask 0xff000000
>>>         nd6 options=21
>>> lagg0: flags=8843 metric 0 mtu 1500
>>>         options=4019b
>>>         ether 00:25:90:f2:2d:24
>>>         inet 192.168.0.108 netmask 0xffffff00 broadcast 192.168.0.255
>>>         inet6 fe80::225:90ff:fef2:2d24%lagg0 prefixlen 64 scopeid 0x8
>>>         nd6 options=29
>>>         media: Ethernet autoselect
>>>         status: active
>>>         laggproto lacp lagghash l2,l3,l4
>>>         laggport: igb0 flags=1c
>>>         laggport: em0 flags=1c
>>>
>>> Thanks in advance!
>>>
>>> --
>>> FF
>>>
>>
>> --
>> FF
>> _______________________________________________
>> freebsd-questions@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-questions
>> To unsubscribe, send any mail to "freebsd-questions-unsubscribe@freebsd.org"
>
>

-- 
-------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/
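
For anyone who wants to sanity-check the numbers in the quoted Intel erratum (the 372-byte limit and the "54-118 bytes before a 4 KB boundary" window), here is a rough sketch of the two conditions in Python. This is not taken from the em(4) driver or from Intel's code; the function names are made up for illustration, and the 54-byte header length is the erratum's own "standard Ethernet/IP/TCP headers" assumption.

```python
# Illustrative only: the names below are hypothetical, not driver or Intel APIs.
PAGE = 4096    # the 4 KB boundary named in the erratum
WINDOW = 64    # bytes after the headers in which a boundary is a problem


def max_safe_first_buffer(max_read_request: int) -> int:
    """Largest first-buffer size allowed by the first workaround bullet."""
    return 3 * (max_read_request - 4)


def boundary_follows_headers(buf_addr: int, header_len: int = 54) -> bool:
    """True if a 4 KB boundary lies within WINDOW bytes after the headers."""
    end_of_headers = buf_addr + header_len
    # Distance from the end of the headers up to the next 4 KB boundary
    # (0 when the headers end exactly on a boundary).
    distance = (-end_of_headers) % PAGE
    return distance <= WINDOW


def at_risk(buf_addr: int, buf_len: int,
            max_read_request: int = 128, header_len: int = 54) -> bool:
    """Both erratum conditions hold: oversized first buffer AND bad alignment."""
    too_large = buf_len > max_safe_first_buffer(max_read_request)
    return too_large and boundary_follows_headers(buf_addr, header_len)


if __name__ == "__main__":
    print(max_safe_first_buffer(128))            # limit for the minimum MRRS
    print(at_risk(PAGE - 100, 1500))             # starts 100 bytes before a boundary
    print(at_risk(PAGE - 128, 1500))             # 128-byte aligned start
```

With the minimum max_read_request of 128 the size limit works out to 3 * 124 = 372 bytes, which matches the erratum text, and a buffer that starts on any 128-byte-aligned address never lands in the 54-118 byte window, which is why the erratum suggests that alignment.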