Date: Thu, 10 Apr 2003 19:16:40 +0200
From: Borje Josefsson <bj@dc.luth.se>
To: Eric Anderson <anderson@centtech.com>
Cc: David Gilbert <dgilbert@velocet.ca>
Subject: Re: tcp_output starving -- is due to mbuf get delay?
Message-ID: <20030410171640.C44793B2@porter.dc.luth.se>
In-Reply-To: Your message of Thu, 10 Apr 2003 11:16:36 CDT. <3E9598E4.2000601@centtech.com>

On Thu, 10 Apr 2003 11:16:36 CDT Eric Anderson wrote:
> Mike Silbersack wrote:
> >>My hosts are connected directly to core routers in a 10Gbps nationwide
> >>network, so if anybody is interested in some testing I am more than
> >>willing to participate. If anybody produces a patch, I have a third system
> >>that I can use for piloting of that too.
> >>
> >>--Börje
> >
> >
> > This brings up something I've been wondering about, which you might want
> > to investigate:
> >
> > From tcp_output:
> >
> >     if (error == ENOBUFS) {
> >         if (!callout_active(tp->tt_rexmt) &&
> >             !callout_active(tp->tt_persist))
> >             callout_reset(tp->tt_rexmt, tp->t_rxtcur,
> >                 tcp_timer_rexmt, tp);
> >         tcp_quench(tp->t_inpcb, 0);
> >         return (0);
> >     }
> >
> > That tcp_quench knocks the window size back to one packet, if I'm not
> > mistaken. You might want to put a counter there and see if that's
> > happening frequently to you; if so, it might explain some loss of
> > performance.
> >
> > Have you tried running kernel profiling yet? It would be interesting to
> > see which functions are using up the largest amount of time.
Could do that if I knew how... Not before the weekend though, right now I'm at the lounge at the airport...

> It's interesting - I'm only getting about 320mb/s.. I must be hitting a
> similar problem. I'm not nearly as adept at hacking code to find bugs
> though. :(
320 Mbit/sec seems familiar; this was what I got when I first tried on a system with a "traditional" PCI bus. Changing the OS to NetBSD on that box bumped that to 525 Mbit/sec. You need wide PCI (or preferably PCI-X) for this.
What happens in that case for me is that I run out of CPU resources. Try running "top" in one window and "netstat 1" in another while bashing the net with ttcp.
If everything is OK (which it apparently isn't), top will show free CPU, and netstat should show a *very* steady packet flow (around 90kpps if you have MTU 1500). Any packet loss is fatal at this speed, so if there is a way (as indicated by Mike above) to avoid restarting with the window size from scratch, that will make recovery much better.
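As a rough sanity check on those two points, here is a minimal C sketch of the arithmetic. It is only a back-of-the-envelope model: the function names are made up for illustration, the packet size you pass in decides whether link-level overhead is counted, and the slow-start estimate assumes the window simply doubles every round trip from one segment with nothing else (ssthresh, further loss) getting in the way.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Full-size packets per second needed to fill a link.
 * link_bps:     link rate in bits per second
 * packet_bytes: bytes per packet as counted on the link (assumption:
 *               pass 1500 for the IP MTU alone, or 1538 to also count
 *               Ethernet header, FCS, preamble and inter-frame gap)
 */
uint64_t packets_per_second(uint64_t link_bps, uint64_t packet_bytes)
{
    assert(packet_bytes > 0);
    return link_bps / (packet_bytes * 8);
}

/*
 * Round trips of slow start needed to grow the congestion window from
 * one segment back to target_segments, assuming it doubles each round
 * trip and nothing else limits it (hypothetical, idealized model).
 */
unsigned slow_start_round_trips(uint64_t target_segments)
{
    unsigned rounds = 0;
    uint64_t cwnd = 1;          /* window after being knocked back */

    while (cwnd < target_segments) {
        cwnd *= 2;
        rounds++;
    }
    return rounds;
}
```

For 1 Gbps and 1500-byte packets, packets_per_second(1000000000, 1500) gives 83333, which is in the same ballpark as the figure above; what netstat actually reports also depends on what it counts. On a long path where the window has to cover a thousand or more segments, slow_start_round_trips() shows that each reset costs on the order of ten round trips before the sender is back at full speed, even in this idealized model.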
My test was done with ttcp and these parameters:
ttcp -s -t -f m -l 61440 -n 20345 dest.host
(tuned for a 10 sec test at 1Gbps).
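For anyone who wants to scale the test to another link speed or duration, the sizing can be sketched in C as below. This is just the arithmetic behind -l (buffer length) and -n (number of buffers); the function names are hypothetical, and it assumes the link rate is the only limit and ignores protocol overhead.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of buffers (ttcp -n) of buflen_bytes each (ttcp -l) needed to
 * keep a link of link_bps busy for the given number of seconds.
 */
uint64_t ttcp_buffer_count(uint64_t link_bps, uint64_t seconds,
                           uint64_t buflen_bytes)
{
    assert(buflen_bytes > 0);
    uint64_t total_bytes = link_bps / 8 * seconds;
    return total_bytes / buflen_bytes;
}

/*
 * Time in milliseconds (rounded down) to push buffer_count buffers of
 * buflen_bytes each through a link running at exactly link_bps.
 */
uint64_t ttcp_transfer_ms(uint64_t link_bps, uint64_t buflen_bytes,
                          uint64_t buffer_count)
{
    assert(link_bps > 0);
    uint64_t total_bits = buflen_bytes * buffer_count * 8;
    return total_bits * 1000 / link_bps;
}
```

With the values used here, ttcp_buffer_count(1000000000, 10, 61440) comes out at 20345, matching the -n value in the command line above.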
IMPORTANT NOTE: Several tests here have shown that this is VERY BADLY affected if you have too much LAN equipment (especially VLAN seems to be harmful) at the edges. My speed of 960 Mbit/sec fell to 165 just by adding 10 feet of cable and two switches :-(
--Börje
