From: Jack Vogel
Date: Fri, 3 Jun 2016 13:22:07 -0700
Subject: Re: Possible transmit/stats problem in igb driver.
To: Sreekanth Rupavatharam
Cc: hiren panchasara, "freebsd-net@freebsd.org", "sbruno@FreeBSD.org", "erj@FreeBSD.org"
List-Id: Networking and TCP/IP with FreeBSD

That's an interesting theory, you could add a check into the tx path
looking for a zero m_len and see, seems unlikely though :)

Jack

On Fri, Jun 3, 2016 at 1:15 PM, Sreekanth Rupavatharam wrote:

> Wondering if this can happen if somehow the mbuf->m_len is not correct (e.g., 0) and thus causing the dma to fail silently. The only way this is happening if the arp request is larger than 64 bytes and the arp response code is reusing the packet to send a 64 byte response.
>
> Thanks,
>
> -Sreekanth
>
>
> On 6/2/16, 2:41 PM, "hiren panchasara" wrote:
>
> >+ Sean, Eric
> >
> >On 06/02/16 at 09:11P, Sreekanth Rupavatharam wrote:
> >> Inline
> >>
> >> >Apart from stats, do you see anything else going wrong? i.e. do you
> >> >actually see less packets (arp replies??) than expected?
> >>
> >> [SR] The packets are not going out on the wire. The tool doesn't receive the packets. That's how I started noticing the issue.
> >>
> >> >Taking your example, tx_packets is something we count in the drivers and
> >> >total_pkts_txd is calculated in the card and we just read it off of it
> >> >to report (E1000_TPT).
> >>
> >> [SR] Correct. My main question would be under what circumstance would the packet handed off to hardware will *not* be transmitted?. Especially considering there are no transmit errors or pause frames received. There are no dma tx failures either. That's the baffling part. I tried another exercise where I used ping of various sizes going out, but that doesn't seem to trigger the problem.
> >>
> >>
> >> >To understand your setup better, ixia is the sender and your box with
> >> >igb(4) is the receiver and your are sending arp requests to it.
> >>
> >> Yes, correct.
> >>
> >> >Can you post following for working (size <= 64bytes) and non-working
> >> >(size > 64bytes) cases for before/after?
> >> >
> >> >sysctl dev.igb | grep tx_packets
> >> >sysctl dev.igb | grep total_pkts_txd
> >> >sysctl dev.igb | grep rx_packets
> >> >sysctl dev.igb | grep total_pkts_recvd
> >>
> >>
> >> Before(not working):
> >> dev.igb.1.queue0.tx_packets: 24907933
> >> dev.igb.1.queue0.rx_packets: 18086575
> >> dev.igb.1.mac_stats.total_pkts_recvd: 25057359
> >> dev.igb.1.mac_stats.total_pkts_txd: 16647169
> >>
> >> After(not working):
> >> dev.igb.1.queue0.tx_packets: 24913324
> >> dev.igb.1.queue0.rx_packets: 18091832
> >> dev.igb.1.mac_stats.total_pkts_recvd: 25062618
> >> dev.igb.1.mac_stats.total_pkts_txd: 16647545
> >> >netstat -sp arp
> >>
> >> The difference is 5391 for queue0.tx_packets but for mac_stats.total_pkts_txd is 376
> >> Everything else is matching up.
> >>
> >> Before (working)
> >> dev.igb.1.queue0.tx_packets: 25359165
> >> dev.igb.1.queue0.rx_packets: 18526094
> >> dev.igb.1.mac_stats.total_pkts_recvd: 25508763
> >> dev.igb.1.mac_stats.total_pkts_txd: 16831587
> >>
> >>
> >> After(working)
> >> dev.igb.1.queue0.tx_packets: 25364597
> >> dev.igb.1.queue0.rx_packets: 18531398
> >> dev.igb.1.mac_stats.total_pkts_recvd: 25514009
> >> dev.igb.1.mac_stats.total_pkts_txd: 16836833
> >>
> >>
> >> Another interesting stat is
> >> before_notworking:dev.igb.1.interrupts.tx_queue_empty: 16646890
> >> after_notworking:dev.igb.1.interrupts.tx_queue_empty: 16647266
> >>
> >> The difference here is exactly 376 which is the number of packets that the device actually claims to have transmitted. It's as though it didn't see the other packets en-queued in the ring descriptor.
> >>
> >
> >Very interesting. Do you tune defaults at all? What does sysctl hw.igb
> >say? Not sure if bumping up txd would help.
> >
> >Adding Sean and Eric to throw some light.
> >
> >>
> >> I can't do netstat just for arp as these are coming in a tunnel (Packets don't show up as arp on the interface). However, I did see the packet rate was about 500 packets/sec
> >>
> >
> >Cheers,
> >Hiren
> >
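
For reference, the zero m_len check suggested at the top of this message could look roughly like the sketch below. This is only an illustration under stated assumptions: `struct fake_mbuf`, `count_empty_segments()` and `chain_length()` are hypothetical userland stand-ins, not code from the igb driver. The real kernel `struct mbuf` does carry `m_next` and `m_len` fields, so a similar walk over the chain might be placed in the transmit path, but where exactly it belongs in the driver is not something this thread establishes.

```c
#include <stddef.h>

/*
 * Hypothetical userland stand-in for the kernel's struct mbuf.
 * Only the two fields needed for the check are modeled.
 */
struct fake_mbuf {
    struct fake_mbuf *m_next; /* next segment in the chain, or NULL */
    int               m_len;  /* bytes of data held by this segment */
};

/* Count the segments in a chain whose length is zero or negative. */
static int
count_empty_segments(const struct fake_mbuf *m)
{
    int empty = 0;

    for (; m != NULL; m = m->m_next) {
        if (m->m_len <= 0)
            empty++;
    }
    return (empty);
}

/* Sum the data length of every segment in the chain. */
static int
chain_length(const struct fake_mbuf *m)
{
    int total = 0;

    for (; m != NULL; m = m->m_next)
        total += m->m_len;
    return (total);
}
```

If a check like this were wired into a transmit routine, a nonzero result from `count_empty_segments()` on a packet about to be queued would support the m_len theory, while a count that stays at zero during the failing ARP test would point elsewhere.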