Date:      Thu, 9 Jun 2016 04:31:22 +0000
From:      Sreekanth Rupavatharam <rupavath@juniper.net>
To:        Jack Vogel <jfvogel@gmail.com>
Cc:        hiren panchasara <hiren@strugglingcoder.info>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, "sbruno@FreeBSD.org" <sbruno@freebsd.org>, "erj@FreeBSD.org" <erj@freebsd.org>
Subject:   Re: Possible transmit/stats problem in igb driver.
Message-ID:  <CB766DAC-764E-4E32-8FDD-36390D020EC3@juniper.net>
In-Reply-To: <CAFOYbcmi_q6DF1uDiyw%2B1D6r37V3%2BYtcb1Sfz%2Btf7ZJVsWMnzg@mail.gmail.com>
References:  <D7944476-98AD-4548-99E3-6E88648E2B06@juniper.net> <20160602202015.GG8994@strugglingcoder.info> <9A903EE5-3F2C-46C0-B563-1150F81E3507@juniper.net> <20160602214104.GJ8994@strugglingcoder.info> <F4049293-19AC-47F1-B95E-21749754CC3B@juniper.net>, <CAFOYbcmi_q6DF1uDiyw%2B1D6r37V3%2BYtcb1Sfz%2Btf7ZJVsWMnzg@mail.gmail.com>

Well, that wasn't the issue. However, there are some other details. The device is a DH8900CC (0x8086:0x43a) quad NIC with a SerDes interface. The issue happens when the device is used in passthrough mode inside a VM. The guest OS is running FreeBSD 10.1 and the host is Linux. There is no easy way to run this test in bare-metal mode. Another point I confirmed is that the descriptor is consumed by the hardware (I get igb_txeof calls for the packets). The issue does not happen in the previously unified em driver (before the igb driver was created).
Thanks,

-Sreekanth

On Jun 3, 2016, at 1:22 PM, Jack Vogel <jfvogel@gmail.com> wrote:

That's an interesting theory. You could add a check to the tx path looking for a zero m_len and see; it seems unlikely, though :)

Jack
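
A minimal sketch of the kind of check suggested above, written as user-space C so it can be compiled on its own. The `mock_mbuf` struct and the `count_bad_len_segments` helper are hypothetical names used only for illustration. The struct models just the `m_next` and `m_len` fields of the kernel's mbuf, and nothing here is taken from the igb driver source.

```c
#include <stddef.h>

/*
 * Hypothetical user-space stand-in for the kernel's struct mbuf.
 * Only the two fields this check needs are modelled.
 */
struct mock_mbuf {
    struct mock_mbuf *m_next; /* next segment in the chain, or NULL */
    int               m_len;  /* bytes of data in this segment */
};

/*
 * Walk an mbuf chain and count the segments whose m_len is zero or
 * negative. In a driver, a non-zero result could be logged or counted
 * before the chain is handed to DMA mapping.
 */
static int
count_bad_len_segments(const struct mock_mbuf *m)
{
    int bad = 0;

    for (; m != NULL; m = m->m_next) {
        if (m->m_len <= 0)
            bad++;
    }
    return bad;
}
```

In a real transmit path the same loop would walk the actual mbuf chain via `m_next` and bump a debug counter or print a message when the result is non-zero. Where exactly to hook it in is left open here.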



On Fri, Jun 3, 2016 at 1:15 PM, Sreekanth Rupavatharam <rupavath@juniper.net> wrote:
Wondering if this can happen if somehow the mbuf->m_len is not correct (e.g., 0), thus causing the DMA to fail silently. This only happens if the ARP request is larger than 64 bytes and the ARP response code is reusing the packet to send a 64-byte response.

Thanks,

-Sreekanth


On 6/2/16, 2:41 PM, "hiren panchasara" <hiren@strugglingcoder.info> wrote:

>+ Sean, Eric
>
>On 06/02/16 at 09:11P, Sreekanth Rupavatharam wrote:
>> Inline
>>
>> >Apart from stats, do you see anything else going wrong? i.e. do you
>> >actually see less packets (arp replies??) than expected?
>>
>> [SR] The packets are not going out on the wire. The tool doesn't receive the packets. That's how I started noticing the issue.
>>
>> >Taking your example, tx_packets is something we count in the drivers and
>> >total_pkts_txd is calculated in the card and we just read it off of it
>> >to report (E1000_TPT).
>>
>> [SR] Correct. My main question would be: under what circumstances would a packet handed off to the hardware *not* be transmitted? Especially considering there are no transmit errors or pause frames received. There are no DMA tx failures either. That's the baffling part. I tried another exercise where I sent out pings of various sizes, but that doesn't seem to trigger the problem.
>>
>>
>> >To understand your setup better, ixia is the sender and your box with
>> >igb(4) is the receiver and you are sending arp requests to it.
>>
>> Yes, correct.
>>
>> >Can you post following for working (size <= 64bytes) and non-working
>> >(size > 64bytes) cases for before/after?
>> >
>> >sysctl dev.igb | grep tx_packets
>> >sysctl dev.igb | grep total_pkts_txd
>> >sysctl dev.igb | grep rx_packets
>> >sysctl dev.igb | grep total_pkts_recvd
>>
>>
>> Before(not working):
>> dev.igb.1.queue0.tx_packets: 24907933
>> dev.igb.1.queue0.rx_packets: 18086575
>> dev.igb.1.mac_stats.total_pkts_recvd: 25057359
>> dev.igb.1.mac_stats.total_pkts_txd: 16647169
>>
>> After(not working):
>> dev.igb.1.queue0.tx_packets: 24913324
>> dev.igb.1.queue0.rx_packets: 18091832
>> dev.igb.1.mac_stats.total_pkts_recvd: 25062618
>> dev.igb.1.mac_stats.total_pkts_txd: 16647545
>> >netstat -sp arp
>>
>> The difference is 5391 for queue0.tx_packets, but for mac_stats.total_pkts_txd it is 376.
>> Everything else matches up.
>>
>> Before (working)
>> dev.igb.1.queue0.tx_packets: 25359165
>> dev.igb.1.queue0.rx_packets: 18526094
>> dev.igb.1.mac_stats.total_pkts_recvd: 25508763
>> dev.igb.1.mac_stats.total_pkts_txd: 16831587
>>
>>
>> After(working)
>> dev.igb.1.queue0.tx_packets: 25364597
>> dev.igb.1.queue0.rx_packets: 18531398
>> dev.igb.1.mac_stats.total_pkts_recvd: 25514009
>> dev.igb.1.mac_stats.total_pkts_txd: 16836833
>>
>>
>> Another interesting stat is
>> before_notworking:dev.igb.1.interrupts.tx_queue_empty: 16646890
>> after_notworking:dev.igb.1.interrupts.tx_queue_empty: 16647266
>>
>> The difference here is exactly 376, which is the number of packets that the device actually claims to have transmitted. It's as though it didn't see the other packets enqueued in the descriptor ring.
>>
>
>Very interesting. Do you tune defaults at all? What does sysctl hw.igb
>say? Not sure if bumping up txd would help.
>
>Adding Sean and Eric to throw some light.
>
>>
>> I can't run netstat just for arp, as these are coming in a tunnel (the packets don't show up as arp on the interface). However, I did see that the packet rate was about 500 packets/sec.
>>
>
>Cheers,
>Hiren
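
The before/after counters quoted above can be compared mechanically. The following is a small sketch, not part of any driver: the `igb_counters` struct and the `tx_shortfall` helper are hypothetical names, and the values are assumed to have been copied by hand from the `sysctl dev.igb` output.

```c
#include <stdint.h>

/* One snapshot of the two transmit counters quoted in this thread. */
struct igb_counters {
    uint64_t tx_packets;     /* dev.igb.1.queue0.tx_packets (counted by the driver) */
    uint64_t total_pkts_txd; /* dev.igb.1.mac_stats.total_pkts_txd (read from the card) */
};

/*
 * Number of packets the driver queued between two snapshots that the
 * MAC statistics do not report as transmitted over the same interval.
 */
static int64_t
tx_shortfall(const struct igb_counters *before,
    const struct igb_counters *after)
{
    uint64_t drv = after->tx_packets - before->tx_packets;
    uint64_t mac = after->total_pkts_txd - before->total_pkts_txd;

    return (int64_t)drv - (int64_t)mac;
}
```

With the snapshots quoted above, this gives 5391 - 376 = 5015 for the non-working case and 5432 - 5246 = 186 for the working case.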




