From: Sreekanth Rupavatharam
To: Jack Vogel
CC: hiren panchasara, "freebsd-net@freebsd.org", "sbruno@FreeBSD.org", "erj@FreeBSD.org"
Subject: Re: Possible transmit/stats problem in igb driver.
Date: Thu, 9 Jun 2016 04:31:22 +0000
List-Id: Networking and TCP/IP with FreeBSD

Well, that wasn't the issue. However, there are some other details. The device is a DH8900CC (0x8086:0x43a) quad NIC SerDes interface. The issue happens when the device is used in passthrough mode inside a VM. The guest OS is running FreeBSD 10.1 and the host is Linux. There is no easy way to run this test in bare metal mode.
Another point I confirmed is that the descriptor is consumed by the hardware (I get igb_txeof calls for the packets). The issue does not happen in the previously unified em driver (before the igb driver was created).

Thanks,
-Sreekanth

On Jun 3, 2016, at 1:22 PM, Jack Vogel wrote:

That's an interesting theory, you could add a check into the tx path looking for a zero m_len and see, seems unlikely though :)

Jack

On Fri, Jun 3, 2016 at 1:15 PM, Sreekanth Rupavatharam wrote:

Wondering if this can happen if somehow the mbuf->m_len is not correct (e.g., 0), thus causing the DMA to fail silently. The only way this is happening is if the arp request is larger than 64 bytes and the arp response code is reusing the packet to send a 64 byte response.

Thanks,
-Sreekanth

On 6/2/16, 2:41 PM, "hiren panchasara" wrote:

>+ Sean, Eric
>
>On 06/02/16 at 09:11P, Sreekanth Rupavatharam wrote:
>> Inline
>>
>> >Apart from stats, do you see anything else going wrong? i.e. do you
>> >actually see less packets (arp replies??) than expected?
>>
>> [SR] The packets are not going out on the wire. The tool doesn't receive the packets. That's how I started noticing the issue.
>>
>> >Taking your example, tx_packets is something we count in the drivers and
>> >total_pkts_txd is calculated in the card and we just read it off of it
>> >to report (E1000_TPT).
>>
>> [SR] Correct. My main question is: under what circumstances would a packet handed off to hardware *not* be transmitted? Especially considering there are no transmit errors or pause frames received. There are no DMA tx failures either. That's the baffling part. I tried another exercise where I used pings of various sizes going out, but that doesn't seem to trigger the problem.
>>
>>
>> >To understand your setup better, ixia is the sender and your box with
>> >igb(4) is the receiver and you are sending arp requests to it.
>>
>> Yes, correct.
>>
>> >Can you post following for working (size <= 64bytes) and non-working
>> >(size > 64bytes) cases for before/after?
>> >
>> >sysctl dev.igb | grep tx_packets
>> >sysctl dev.igb | grep total_pkts_txd
>> >sysctl dev.igb | grep rx_packets
>> >sysctl dev.igb | grep total_pkts_recvd
>>
>>
>> Before (not working):
>> dev.igb.1.queue0.tx_packets: 24907933
>> dev.igb.1.queue0.rx_packets: 18086575
>> dev.igb.1.mac_stats.total_pkts_recvd: 25057359
>> dev.igb.1.mac_stats.total_pkts_txd: 16647169
>>
>> After (not working):
>> dev.igb.1.queue0.tx_packets: 24913324
>> dev.igb.1.queue0.rx_packets: 18091832
>> dev.igb.1.mac_stats.total_pkts_recvd: 25062618
>> dev.igb.1.mac_stats.total_pkts_txd: 16647545
>> >netstat -sp arp
>>
>> The difference is 5391 for queue0.tx_packets, but only 376 for mac_stats.total_pkts_txd.
>> Everything else matches up.
>>
>> Before (working):
>> dev.igb.1.queue0.tx_packets: 25359165
>> dev.igb.1.queue0.rx_packets: 18526094
>> dev.igb.1.mac_stats.total_pkts_recvd: 25508763
>> dev.igb.1.mac_stats.total_pkts_txd: 16831587
>>
>>
>> After (working):
>> dev.igb.1.queue0.tx_packets: 25364597
>> dev.igb.1.queue0.rx_packets: 18531398
>> dev.igb.1.mac_stats.total_pkts_recvd: 25514009
>> dev.igb.1.mac_stats.total_pkts_txd: 16836833
>>
>>
>> Another interesting stat is
>> before_notworking:dev.igb.1.interrupts.tx_queue_empty: 16646890
>> after_notworking:dev.igb.1.interrupts.tx_queue_empty: 16647266
>>
>> The difference here is exactly 376, which is the number of packets that the device actually claims to have transmitted. It's as though it didn't see the other packets enqueued in the descriptor ring.
>>
>
>Very interesting. Do you tune defaults at all? What does sysctl hw.igb
>say? Not sure if bumping up txd would help.
>
>Adding Sean and Eric to throw some light.
>
>>
>> I can't do netstat just for arp as these are coming in a tunnel (packets don't show up as arp on the interface). However, I did see the packet rate was about 500 packets/sec
>>
>
>Cheers,
>Hiren
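
For anyone re-checking the numbers, the before/after arithmetic on the "not working" counters quoted above can be reproduced with a few lines of C. This is only a sketch: the struct and function names (igb_tx_sample, counter_delta, unaccounted_tx, print_tx_deltas) are made up for illustration and are not part of the igb driver. The values are copied from the sysctl output in this thread, and unaccounted_tx assumes all transmit traffic went through queue0.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical snapshot of the three TX-related counters quoted in the thread. */
struct igb_tx_sample {
    uint64_t tx_packets;     /* dev.igb.1.queue0.tx_packets: counted by the driver */
    uint64_t total_pkts_txd; /* dev.igb.1.mac_stats.total_pkts_txd: read from the card */
    uint64_t tx_queue_empty; /* dev.igb.1.interrupts.tx_queue_empty */
};

/* Values copied from the "not working" before/after output above. */
const struct igb_tx_sample nw_before = {24907933ULL, 16647169ULL, 16646890ULL};
const struct igb_tx_sample nw_after  = {24913324ULL, 16647545ULL, 16647266ULL};

/* Difference between two readings; assumes the counter did not wrap. */
uint64_t counter_delta(uint64_t before, uint64_t after)
{
    assert(after >= before);
    return after - before;
}

/*
 * Packets the driver counted on queue0 that the MAC statistics do not
 * account for. Assumption: queue0 carried all transmit traffic, so the
 * driver delta is never smaller than the hardware delta.
 */
uint64_t unaccounted_tx(const struct igb_tx_sample *b, const struct igb_tx_sample *a)
{
    return counter_delta(b->tx_packets, a->tx_packets) -
           counter_delta(b->total_pkts_txd, a->total_pkts_txd);
}

/* Print the per-counter deltas for one before/after pair. */
void print_tx_deltas(const struct igb_tx_sample *b, const struct igb_tx_sample *a)
{
    printf("tx_packets:     +%llu\n",
           (unsigned long long)counter_delta(b->tx_packets, a->tx_packets));
    printf("total_pkts_txd: +%llu\n",
           (unsigned long long)counter_delta(b->total_pkts_txd, a->total_pkts_txd));
    printf("tx_queue_empty: +%llu\n",
           (unsigned long long)counter_delta(b->tx_queue_empty, a->tx_queue_empty));
    printf("unaccounted:    %llu\n", (unsigned long long)unaccounted_tx(b, a));
}
```

Calling print_tx_deltas(&nw_before, &nw_after) from a main() reports deltas of 5391, 376 and 376, the same figures given in the thread, which leaves 5015 packets that the driver counted but the card did not report as transmitted.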