Date:      Sun, 3 Jan 2010 08:44:18 -0800 (PST)
From:      Barney Cordoba <barney_cordoba@yahoo.com>
To:        Michael Tüxen <Michael.Tuexen@lurchi.franken.de>
Cc:        freebsd-net@freebsd.org, jfvogel@gmail.com
Subject:   Re: igb interrupt moderation
Message-ID:  <591563.1459.qm@web63902.mail.re1.yahoo.com>

--- On Sun, 1/3/10, Michael Tüxen <Michael.Tuexen@lurchi.franken.de> wrote:

> From: Michael Tüxen <Michael.Tuexen@lurchi.franken.de>
> Subject: Re: igb interrupt moderation
> To: "Barney Cordoba" <barney_cordoba@yahoo.com>
> Cc: freebsd-net@freebsd.org, jfvogel@gmail.com
> Date: Sunday, January 3, 2010, 8:55 AM
>
> Hi Barney, Hi Jack,
>
> some comments and some more questions inside...
>
> Best regards
> Michael
>
> On Jan 2, 2010, at 8:42 PM, Barney Cordoba wrote:
>
> > Jack,
> >
> > I'm trying to get some clarification on differences I'm finding
> > between the 82575 and 82576 parts with respect to interrupt
> > moderation. The spec I have for the 82576
> > (82576_Datasheet_v2p1.pdf) indicates that the
>
> I'm only commenting on the 82576. You can get rev 2.41 from Intel's
> website...
>
> > ITR algorithm is different than the one used (I don't have one of
> > the secret copies of the 82575 spec). The algorithm shown is
> >
> > interrupts/sec = 1 / (2 * 10^-6 sec * interval)
> > (page 295, Section 7.3.4)
> >
> > which is clearly wrong from practice. I have an 82576 (device id
> > 10C9)
>
> If you look at section 8.8.12, you find other formulas...
> Jack: Which ones are correct?
>
> > if I use the 125d setting in the example I get just under 32000
> > interrupts per second. Clearly your code doesn't implement this,
> > nor do you have different settings for the 82575 and 82576 parts.
> > So I assume that the same formula for the em parts holds for the
> > igb parts, and that the datasheet is wrong?
> >
> > There does seem to be a slight difference. The setting that gets
> > 1000 ints/second on the 82575 generates about 1020 on the 82576.
> > Not a big deal, but I wonder why there's a difference? Is the
> > reference clock for these something that may not be fixed and
> > could vary from board to board? Note that both devices are on the
> > same MB.
> >
> > Also, it seems that settings to EITR over 32767 wrap on the 82576
> > (for example, writing 32768 to EITR is the same as writing a 1).
> > So the minimum rate on the 82576 is around 125 ints/second. The
> > 82575 can accept values up to 65535 before wrapping.
>
> Hmm, looking at the table in 8.8.12 would suggest:
> Setting it to one sets a reserved bit, but does not change the
> interval.
> Setting it to 2^15 should set the LLI_EN bit, but does not change
> the interval.
>
> Jack is setting the register to
> igb_low_latency: 128
> igb_ave_latency: 450
> igb_bulk_latency: 1200
>
> This would result in intervals of:
> igb_low_latency: 32
> igb_ave_latency: 112
> igb_bulk_latency: 300
>
> Jack: What are the corresponding interrupt rates? The spec provides
>       different formulas and talks about a 1us, 2us or 8us counter.
>       Not sure what is right...
> Jack: Why are you setting bit1 (which is reserved) in the case
>       igb_ave_latency?
>
> And another question for Jack:
> In igb_update_aim() you do
>     if (olditr != newitr) {
>         /* Change interrupt rate */
>         rxr->eitr_setting = newitr;
>         E1000_WRITE_REG(&adapter->hw, E1000_EITR(rxr->me),
>             newitr | (newitr << 16));
>     }
> So why are you setting the higher bits of the EITR? You are setting
> igb_low_latency: the LL Counter becomes 0, the moderation counter
>                  becomes 16
> igb_ave_latency: the LL Counter becomes 2, the moderation counter
>                  becomes 56
> igb_bulk_latency: the LL Counter becomes 16, the moderation counter
>                   becomes 148
>
> I really do not understand these settings. Maybe the spec is wrong?
> Or do you mean
>     if (olditr != newitr) {
>         /* Change interrupt rate */
>         rxr->eitr_setting = newitr;
>         E1000_WRITE_REG(&adapter->hw, E1000_EITR(rxr->me),
>             newitr);
>     }
> Or do you want to preserve the counters, set the CNT_INGR bit and
> mean
>     if (olditr != newitr) {
>         /* Change interrupt rate */
>         rxr->eitr_setting = newitr;
>         E1000_WRITE_REG(&adapter->hw, E1000_EITR(rxr->me),
>             0x80000000 | newitr);
>     }
>
> Could you clarify that?
>
> > The 82576 document doesn't have a map of the register that I can
> > find, so I'm curious as to whether these observations are
> > something I can assume is true across all parts and
> > motherboards/cards, or is there some implementation variance that
> > will cause these to only apply to the ones I happen to be testing?
> >
> > Thanks,
> >
> > Barney

Ah, the register map in the older spec doesn't have the full or
correct information :\

Note that ripping out Intel's auto-moderation was one of my first
tasks, so I can't comment on how they derive those values (which are
very different than in Linux). It's supposed to be based on average
packet size, so I'm not sure what Jack is doing with some of these
settings.

As for the EITR settings, the datasheet is just plain wrong. For
example, in section 7.3.3.1 it says that a setting of 125d would
result in 8000 interrupts per second; in practice that value results
in about 32K interrupts per second. The 82575 seems to use the same
algorithm as the em class devices:

setting = 1,000,000,000 / (256 * ints_per_sec)

so "low-latency" is 30,517 interrupts per second, while "bulk latency"
is 3,255, which is way, way, way too high. So with 4 queues, you have
a minimum of 13K interrupts per second. It's just a concept that
hasn't been thought out or tested in practice. It's also absolutely
ridiculous to adjust the moderation on every interrupt. You can't
interpret traffic patterns in 1/8000th of a second. Also, notice that
the bulk threshold is 10,000 bytes, and the bulk setting is 3255
int/sec. Well, in order to receive 10,000 bytes in 1/3255th of a
second, which is about 260Mb/s, with 4 queues, assuming reasonable
distribution, you'd have to be receiving at more than wire speed to
stay in bulk for more than 1 interrupt time. And how could the
settings be the same for 1 queue as for 4 queues?

I don't see any reference to what the LLI Moderation Enable bit might
do. It doesn't seem to do anything; setting it or not setting it seems
to result in the same level of moderation.

Barney


