From: Barney Cordoba
Date: Sat, 11 May 2013 08:56:37 -0700 (PDT)
Subject: Re: High CPU interrupt load on intel I350T4 with igb on 8.3
To: Eugene Grosbein
Cc: freebsd-net@freebsd.org, Clément Hermann (nodens)
Message-ID: <1368287797.70288.YahooMailClassic@web121603.mail.ne1.yahoo.com>
In-Reply-To: <518CEE95.7020702@rdtc.ru>
List-Id: Networking and TCP/IP with FreeBSD

--- On Fri, 5/10/13, Eugene Grosbein wrote:

> From: Eugene Grosbein
> Subject: Re: High CPU interrupt load on intel I350T4 with igb on 8.3
> To: "Barney Cordoba"
> Cc: freebsd-net@freebsd.org, "Clément Hermann (nodens)"
> Date:
> Friday, May 10, 2013, 8:56 AM
>
> On 10.05.2013 05:16, Barney Cordoba wrote:
>
> >>>> Network device driver is not guilty here, that's just pf's
> >>>> contention running in igb's context.
> >>>
> >>> They're both at play. Single threadedness aggravates subsystems
> >>> that have too many lock points.
> >>>
> >>> It can also be "solved" with using 1 queue, because then you don't
> >>> have 4 queues going into a single thread.
> >>
> >> Again, the problem is within pf(4)'s global lock, not in the
> >> igb(4).
> >>
> >
> > Again, you're wrong. It's not the bottleneck's fault; it's the fault
> > of the multi-threaded code for only working properly when there are
> > no bottlenecks.
>
> In practice, the problem is easily solved without any change in the
> igb code. The same problem will occur for other NIC drivers too -
> if several NICs were combined within one lagg(4). So, driver is not
> guilty and solution would be same - eliminate bottleneck and you will
> be fine and capable to spread the load on several CPU cores.
>
> Therefore, I don't care of CS theory for this particular case.

Clearly you don't understand the problem. Your logic is that because
other drivers are also defective, it's not a driver problem?

The problem is caused by a multi-threaded driver that haphazardly
launches tasks and doesn't handle the case where the rest of the system
can't keep up with the load. It's no different than a driver that barfs
when mbuf clusters are exhausted. The answer isn't to increase memory
or mbufs, even though that may alleviate the problem. The answer is to
fix the driver, so that it doesn't crash the system for an event that
is wholly predictable.

igb has 1) too many locks and 2) exacerbates the problem by binding to
cpus, which causes it to not only have to wait for the lock to free,
but also for a specific cpu to become free. So it chugs along happily
until it encounters a bottleneck, at which point it quickly blows up
the entire system in a domino effect. It needs to manage locks more
efficiently, and also to detect when the backup is unmanageable.

Ever since FreeBSD 5 the answer has been "it's fixed in 7, or it's
fixed in 9, or it's fixed in 10". There will always be bottlenecks, and
no driver should blow up the system no matter what intermediate code
may present a problem. It's the driver's responsibility to behave and
to drop packets if necessary.

BC
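
The "drop packets if necessary" policy argued for above can be shown with a
toy model. This is only a sketch in Python, not igb(4) or FreeBSD code: the
BoundedRxQueue class, the simulate() helper, and the capacity and rate
numbers are all hypothetical, chosen just to show a producer that sheds load
when the consumer (standing in for a slow upper layer such as a firewall
behind a global lock) cannot keep up.

```python
from collections import deque


class BoundedRxQueue:
    """Toy receive queue that sheds load instead of letting a backlog grow."""

    def __init__(self, capacity):
        self.capacity = capacity  # hypothetical backlog limit
        self.packets = deque()
        self.dropped = 0          # packets discarded under overload

    def enqueue(self, packet):
        """Accept a packet if there is room; otherwise count a drop."""
        if len(self.packets) >= self.capacity:
            self.dropped += 1
            return False
        self.packets.append(packet)
        return True

    def drain(self, budget):
        """Hand at most `budget` packets to the (possibly slow) consumer."""
        n = min(budget, len(self.packets))
        return [self.packets.popleft() for _ in range(n)]


def simulate(arrivals_per_tick, service_per_tick, ticks, capacity):
    """Offer more than the consumer can take.

    Returns a tuple (delivered, dropped, backlog).
    """
    queue = BoundedRxQueue(capacity)
    delivered = 0
    seq = 0
    for _ in range(ticks):
        for _ in range(arrivals_per_tick):
            queue.enqueue(seq)
            seq += 1
        delivered += len(queue.drain(service_per_tick))
    return delivered, queue.dropped, len(queue.packets)
```

Calling simulate(10, 4, 100, 32) offers 1000 packets to a consumer that can
take only 4 per tick. The backlog never exceeds the 32-packet limit, and the
excess shows up in the drop counter rather than as unbounded queueing.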