Date: Sun, 3 Jan 2010 08:44:18 -0800 (PST)
From: Barney Cordoba
To: Michael Tüxen
Cc: freebsd-net@freebsd.org, jfvogel@gmail.com
Subject: Re: igb interrupt moderation
Message-ID: <591563.1459.qm@web63902.mail.re1.yahoo.com>

--- On Sun, 1/3/10, Michael Tüxen wrote:

> From: Michael Tüxen
> Subject: Re: igb interrupt moderation
> To: "Barney Cordoba"
> Cc: freebsd-net@freebsd.org, jfvogel@gmail.com
> Date: Sunday, January 3, 2010, 8:55 AM
> Hi Barney, Hi Jack,
>
> some comments and some more questions inside...
>
> Best regards
> Michael
>
> On Jan 2, 2010, at 8:42 PM, Barney Cordoba wrote:
>
> > Jack,
> >
> > I'm trying to get some clarification on differences I'm finding between
> > the 82575 and 82576 parts with respect to interrupt moderation. The spec
> > I have for the 82576 (82576_Datasheet_v2p1.pdf) indicates that the
> I'm only commenting 82576. You can get rev 2.41 from intels
> website...
> >
> > ITR algorithm is different than the one used (I don't have one of the
> > secret copies of the 82575 spec). The algorithm shown is
> >
> > interrupts/sec = 1/(2 * 10^-6 sec x interval) (page 295, Section 7.3.4)
> >
> > which is clearly wrong from practice. I have an 82576 (device id 10C9)
> If you look at section 8.8.12, you find other formulas...
> Jack: Which ones are correct?
> > if I use the 125d setting in the example get just under 32000 interrupts
> > per second. Clearly your code doesnt implement this, nor do you have
> > different settings for the 82575 and 82576 parts.
> > So I assume that the same formula for the em parts hold for the igb
> > parts, and that the datasheet is wrong?
> >
> > There does seem to be a slight difference. The setting that gets 1000
> > ints/second on the 82575 generates about 1020 on the 82576. Not a big
> > deal but I wonder why there's a difference? Is the reference clock for
> > these something that may not be fixed and could vary from board to
> > board? Note that both devices are on the same MB.
> >
> > Also, it seems that settings to EITR over 32767 wrap on the 82576 (for
> > example writing 32768 to EITR is the same as writing a 1). So the
> > minimum setting on the 82576 is around 125 ints/second. The 82575 can
> > accept values up the 65535 before wrapping.
> Hmm, looking at the table in 8.8.12 would suggest:
> Setting it to one sets a reserved bit, but does not change the interval.
> Setting it to 2^15 should set the LLI_EN bit, but does not change in interval.
>
> Jack is setting the register to
> igb_low_latency: 128
> igb_ave_latency: 450
> igb_bulk_latency: 1200
>
> This would result in intervals of:
> igb_low_latency: 32
> igb_ave_latency: 112
> igb_bulk_latency: 300
> Jack: What are the corresponding interrupt rates? The spec provides different
> formulas and talks about a 1us, 2us or 8us counter.
> Not sure what is right...
> Jack: Why are you setting bit1 (which is reserved) in the
> case igb_ave_latency?
>
> And another question for Jack:
> In igb_update_aim() you do
>     if (olditr != newitr) {
>         /* Change interrupt rate */
>         rxr->eitr_setting = newitr;
>         E1000_WRITE_REG(&adapter->hw, E1000_EITR(rxr->me),
>             newitr | (newitr << 16));
>     }
> So why are setting the higher bits of the EITR? You are setting
> igb_low_latency: the LL Counter becomes 0, the moderation counter becomes 16
> igb_ave_latency: the LL Counter becomes 2, the moderation counter becomes 56
> igb_bulk_latency: the LL Counter becomes 16, the moderation counter becomes 148
>
> I really do not understand these settings. Maybe the spec
> is wrong? Or you do mean
>     if (olditr != newitr) {
>         /* Change interrupt rate */
>         rxr->eitr_setting = newitr;
>         E1000_WRITE_REG(&adapter->hw, E1000_EITR(rxr->me),
>             newitr);
>     }
> Or do you want to preserve the counters, set the CNT_INGR
> bit and mean
>     if (olditr != newitr) {
>         /* Change interrupt rate */
>         rxr->eitr_setting = newitr;
>         E1000_WRITE_REG(&adapter->hw, E1000_EITR(rxr->me),
>             0x80000000 | newitr);
>     }
>
> Could you clarify that?
> >
> > The 82576 document doesn't have a map of the register that I can find, so
> > Im curious as to whether these observations are something I can assume is
> > true across all parts and motherboards/cards, or is there some
> > implementation variance that will cause these to only apply to the ones
> > I happen to be testing?
> >
> > Thanks,
> >
> > Barney

Ah, the register map in the older spec doesn't have the full or
correct information :\

Note that ripping out Intel's auto-moderation was one of my first tasks,
so I can't comment on how they derive those values (which are very
different than in Linux). It's supposed to be based on average packet
size, so I'm not sure what Jack is doing with some of these settings.
As for the EITR settings, the datasheet is just plain wrong. For example,
in section 7.3.3.1 it says that a setting of 125d would result in 8000
interrupts per second; in practice that value results in about 32K
interrupts per second. The 82575 seems to use the same algorithm
as the em class devices:

setting = 1,000,000,000 / (256 * ints_per_sec)

so "low-latency" is 30,517 interrupts per second, while "bulk latency"
is 3,255, which is way, way, way too high. So with 4 queues, you have a
minimum of 13K interrupts per second. It's just a concept that hasn't been
thought out or tested in practice. It's also absolutely ridiculous to
adjust the moderation on every interrupt. You can't interpret traffic
patterns in 1/8000th of a second. Also, notice that the bulk threshold
is 10,000 bytes, and the bulk setting is 3255 int/sec. Well, receiving
10,000 bytes in 1/3255th of a second works out to about 260 Mb/s, so
with 4 queues, assuming reasonable distribution, you'd have to be
receiving at more than wire speed to stay in bulk for more than 1
interrupt time. And how could the settings be the same for 1 queue as
for 4 queues?

I don't see any reference to what the LLI Moderation Enable bit might do.
It doesn't seem to do anything; setting it or not setting it seems to
result in the same level of moderation.

Barney
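
The arithmetic above can be checked with a small sketch. It assumes the em-class relation quoted in this message (one register unit is 256 ns) and is not taken from the igb driver; the function names and constants are hypothetical helpers for illustration only, and the register values 128 and 1200 are the low-latency and bulk settings quoted earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption (from the message, not from the datasheet):
 *     setting = 1,000,000,000 / (256 * ints_per_sec)
 * i.e. each unit of the moderation setting is worth 256 ns.
 */
#define NS_PER_SEC  1000000000ULL
#define ITR_UNIT_NS 256ULL

/* Interrupt rate implied by a register setting (integer division). */
uint64_t itr_to_ints_per_sec(uint32_t setting)
{
    assert(setting != 0);
    return NS_PER_SEC / (ITR_UNIT_NS * setting);
}

/* Register setting that yields roughly the requested interrupt rate. */
uint32_t ints_per_sec_to_itr(uint32_t ints_per_sec)
{
    assert(ints_per_sec != 0);
    return (uint32_t)(NS_PER_SEC / (ITR_UNIT_NS * ints_per_sec));
}

/*
 * Bits per second one queue must receive so that `bytes` arrive within a
 * single interrupt interval at the given setting.
 */
uint64_t bits_per_sec_for_threshold(uint32_t bytes, uint32_t setting)
{
    return 8ULL * bytes * itr_to_ints_per_sec(setting);
}
```

Under that assumption, itr_to_ints_per_sec(128) gives 30517 and itr_to_ints_per_sec(1200) gives 3255, matching the figures above, while the datasheet's example value 125 gives 31250 rather than 8000. bits_per_sec_for_threshold(10000, 1200) comes to 260,400,000, so four evenly loaded queues would need a little over 1.04 Gb/s in total to stay at the bulk setting.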