Date:      Thu, 10 Apr 2003 19:16:40 +0200
From:      Borje Josefsson <bj@dc.luth.se>
To:        Eric Anderson <anderson@centtech.com>
Cc:        David Gilbert <dgilbert@velocet.ca>
Subject:   Re: tcp_output starving -- is due to mbuf get delay? 
Message-ID:  <20030410171640.C44793B2@porter.dc.luth.se>
In-Reply-To: Your message of Thu, 10 Apr 2003 11:16:36 CDT. <3E9598E4.2000601@centtech.com> 

On Thu, 10 Apr 2003 11:16:36 CDT Eric Anderson wrote:

> Mike Silbersack wrote:
> >>My hosts are connected directly to core routers in a 10Gbps nationwide
> >>network, so if anybody is interested in some testing I am more than
> >>willing to participate. If anybody produces a patch, I have a third
> >>system that I can use for piloting of that too.
> >>
> >>--Börje
> >
> > This brings up something I've been wondering about, which you might want
> > to investigate:
> >
> > From tcp_output:
> >
> > 		if (error == ENOBUFS) {
> > 			if (!callout_active(tp->tt_rexmt) &&
> > 			    !callout_active(tp->tt_persist))
> > 				callout_reset(tp->tt_rexmt, tp->t_rxtcur,
> > 				    tcp_timer_rexmt, tp);
> > 			tcp_quench(tp->t_inpcb, 0);
> > 			return (0);
> > 		}
> >
> > That tcp_quench knocks the window size back to one packet, if I'm not
> > mistaken.  You might want to put a counter there and see if that's
> > happening frequently to you; if so, it might explain some loss of
> > performance.
> >
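
(For reference, a minimal sketch of how such a counter could be wired up
in sys/netinet/tcp_output.c. The counter name and the sysctl below are
invented here for illustration, not something that exists in the tree;
with a sysctl the value could be watched from userland with
"sysctl net.inet.tcp.enobufs_quench":)

	/* At file scope in tcp_output.c (sys/sysctl.h is already included): */
	static u_long tcp_enobufs_quench = 0;
	SYSCTL_ULONG(_net_inet_tcp, OID_AUTO, enobufs_quench, CTLFLAG_RD,
	    &tcp_enobufs_quench, 0, "ENOBUFS events that forced a quench");

		...

		if (error == ENOBUFS) {
			tcp_enobufs_quench++;	/* how often do we land here? */
			if (!callout_active(tp->tt_rexmt) &&
			    !callout_active(tp->tt_persist))
				callout_reset(tp->tt_rexmt, tp->t_rxtcur,
				    tcp_timer_rexmt, tp);
			/* tcp_quench() drops snd_cwnd back to one segment. */
			tcp_quench(tp->t_inpcb, 0);
			return (0);
		}
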
> > Have you tried running kernel profiling yet?  It would be interesting to
> > see which functions are using up the largest amount of time.

Could do that if I knew how... Not before the weekend though, right now
I'm at the lounge at the airport...

> It's interesting - I'm only getting about 320mb/s.. I must be hitting a
> similar problem.  I'm not nearly as adept at hacking code to find bugs
> though. :(

320 Mbit/sec seems familiar; that was what I got when I first tried on a
system with a "traditional" PCI bus. Changing the OS to NetBSD on that box
bumped that to 525 Mbit/sec. You need wide PCI (or preferably PCI-X) for
this.

What happens in that case for me is that I run out of CPU resources. Try
running "top" in one window and "netstat 1" in another while bashing the
net with ttcp.

If everything is OK (which it apparently isn't), top will show free CPU,
and netstat should show a *very* steady packet flow (around 90 kpps if you
have MTU 1500). Any packet loss is fatal at this speed, so if there is a
way (as indicated by Mike above) to avoid restarting with the window size
from scratch, that would make recovery much better.
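
(As a rough check on that figure: 1 Gbit/s is about 125 Mbyte/s, and
125,000,000 / 1500 bytes per frame is roughly 83,000 packets/s, so the
80-90 kpps range is what full-size 1500-byte frames at line rate look like.)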

My test was done with ttcp and these parameters:

ttcp -s -t -f m -l 61440 -n 20345 dest.host

(tuned for a 10 sec test at 1Gbps).
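
(Those numbers work out to 20345 x 61440 bytes, i.e. about 1.25 Gbyte, or
just under 10 Gbit, which is roughly 10 seconds' worth of data at 1 Gbit/s.)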

IMPORTANT NOTE: Several tests here have shown that this is VERY BADLY
affected if you have too much LAN equipment (especially VLANs seem to be
harmful) at the edges. My speed of 960 Mbit/sec fell to 165 just by adding
10 feet of cable and two switches :-(

--Börje


