Date: Thu, 10 Apr 2003 19:16:40 +0200
From: Borje Josefsson <bj@dc.luth.se>
To: Eric Anderson <anderson@centtech.com>
Cc: David Gilbert <dgilbert@velocet.ca>
Subject: Re: tcp_output starving -- is due to mbuf get delay?
Message-ID: <20030410171640.C44793B2@porter.dc.luth.se>
In-Reply-To: Your message of Thu, 10 Apr 2003 11:16:36 CDT. <3E9598E4.2000601@centtech.com>

On Thu, 10 Apr 2003 11:16:36 CDT Eric Anderson wrote:

> Mike Silbersack wrote:
>
> >>My hosts are connected directly to core routers in a 10Gbps nationwide
> >>network, so if anybody is interested in some testing I am more than
> >>willing to participate. If anybody produces a patch, I have a third system
> >>that I can use for piloting of that too.
> >>
> >>--Börje
> >
> > This brings up something I've been wondering about, which you might want
> > to investigate:
> >
> > From tcp_output:
> >
> >         if (error == ENOBUFS) {
> >                 if (!callout_active(tp->tt_rexmt) &&
> >                     !callout_active(tp->tt_persist))
> >                         callout_reset(tp->tt_rexmt, tp->t_rxtcur,
> >                                       tcp_timer_rexmt, tp);
> >                 tcp_quench(tp->t_inpcb, 0);
> >                 return (0);
> >         }
> >
> > That tcp_quench knocks the window size back to one packet, if I'm not
> > mistaken. You might want to put a counter there and see if that's
> > happening frequently to you; if so, it might explain some loss of
> > performance.
> >
> > Have you tried running kernel profiling yet? It would be interesting to
> > see which functions are using up the largest amount of time.

Could do that if I knew how... Not before the weekend though, right now
I'm at the lounge at the airport...

> It's interesting - I'm only getting about 320mb/s.. I must be hitting a
> similar problem. I'm not nearly as adept at hacking code to find bugs
> though. :(

320 Mbit/sec seems familiar, this was what I got when I first tried on a
system with a "traditional" PCI bus. Changing the OS to NetBSD on that box
bumped that to 525 Mbit/sec. You need wide PCI (or preferably PCI-X) for
this.

What happens in that case for me is that I run out of CPU resources. Try
running "top" in one window and "netstat 1" in another while bashing the
net with ttcp.

If everything is OK (which it apparently isn't), top will show free CPU,
and netstat should show a *very* steady packet flow (around 90kpps if you
have MTU 1500).
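
As a rough cross-check of the packet rate mentioned above, here is a minimal sketch in C. It is not from the original message: the function name packets_per_second is made up for illustration, and the 38 bytes of per-packet overhead assume standard Ethernet framing (14-byte header, 4-byte FCS, 8-byte preamble and 12-byte inter-frame gap) on a link running at exactly 1 Gbit/s.

```c
/* Hypothetical helper, not part of any FreeBSD source.
 * Returns how many packets per second are needed to fill a link.
 *   link_bps       - link rate in bits per second
 *   mtu_bytes      - IP packet size in bytes (the MTU)
 *   overhead_bytes - extra bytes each packet occupies on the wire
 *                    (assumed 38 for plain Ethernet; pass 0 to ignore) */
double packets_per_second(double link_bps, unsigned mtu_bytes,
                          unsigned overhead_bytes)
{
    double bits_per_packet = 8.0 * (double)(mtu_bytes + overhead_bytes);
    return link_bps / bits_per_packet;
}
```

Calling packets_per_second(1e9, 1500, 38) gives a little over 81,000 packets per second, and ignoring framing overhead gives a little over 83,000. Both are in the same ballpark as the figure quoted above, so a steady rate in that region from "netstat 1" suggests the link is close to full.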

Any packet loss is fatal for this speed, so if there is a way (as
indicated by Mike above) to avoid restarting with the window size from
scratch, that will make recovery much better.

My test was done with ttcp and these parameters:

ttcp -s -t -f m -l 61440 -n 20345 dest.host

(tuned for a 10 sec test at 1Gbps).

IMPORTANT NOTE: Several tests here have shown that this is VERY BADLY
affected if you have too much LAN equipment (especially VLAN seems to be
harmful) at the edges. My speed of 960 Mbit/sec fell to 165 just by adding
10 feet of cable and two switches :-(

--Börje
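
The "tuned for a 10 sec test at 1Gbps" remark can be checked with a little arithmetic. The sketch below is an illustration only: the helper names ttcp_total_bytes and transfer_seconds are hypothetical, and it assumes that -l is the buffer length in bytes, -n is the number of buffers sent, and that 1 Gbps means 10^9 bits per second.

```c
#include <stdint.h>

/* Hypothetical helper: total payload in bytes for a ttcp run,
 * i.e. buffer length (-l) multiplied by buffer count (-n). */
uint64_t ttcp_total_bytes(uint64_t buf_len, uint64_t num_bufs)
{
    return buf_len * num_bufs;
}

/* Hypothetical helper: seconds needed to move total_bytes of payload
 * at rate_bps bits per second, ignoring protocol overhead. */
double transfer_seconds(uint64_t total_bytes, double rate_bps)
{
    return (8.0 * (double)total_bytes) / rate_bps;
}
```

With -l 61440 and -n 20345 the payload is 1,249,996,800 bytes, so transfer_seconds(ttcp_total_bytes(61440, 20345), 1e9) comes out just under 10 seconds. At the 165 Mbit/sec measured with the extra switches, the same run would take roughly 60 seconds.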