From: Borje Josefsson
Date: Thu, 10 Apr 2003 19:16:40 +0200
To: Eric Anderson
cc: freebsd-hackers@freebsd.org, freebsd-performance@freebsd.org,
    "Jin Guojun [DSD]", David Gilbert
Subject: Re: tcp_output starving -- is due to mbuf get delay?
In-reply-to: <3E9598E4.2000601@centtech.com>
Message-Id: <20030410171640.C44793B2@porter.dc.luth.se>
Reply-To: bj@dc.luth.se

On Thu, 10 Apr 2003 11:16:36 CDT Eric Anderson wrote:

> Mike Silbersack wrote:
> >>My hosts are connected directly to core routers in a 10Gbps nationwide
> >>network, so if anybody is interested in some testing I am more than
> >>willing to participate. If anybody produces a patch, I have a third system
> >>that I can use for piloting of that too.
> >>
> >>--Börje
> >
> >
> > This brings up something I've been wondering about, which you might want
> > to investigate:
> >
> > From tcp_output:
> >
> >         if (error == ENOBUFS) {
> >                 if (!callout_active(tp->tt_rexmt) &&
> >                     !callout_active(tp->tt_persist))
> >                         callout_reset(tp->tt_rexmt, tp->t_rxtcur,
> >                             tcp_timer_rexmt, tp);
> >                 tcp_quench(tp->t_inpcb, 0);
> >                 return (0);
> >         }
> >
> > That tcp_quench knocks the window size back to one packet, if I'm not
> > mistaken.  You might want to put a counter there and see if that's
> > happening frequently to you; if so, it might explain some loss of
> > performance.
> >
> > Have you tried running kernel profiling yet?  It would be interesting to
> > see which functions are using up the largest amount of time.

Could do that if I knew how... Not before the weekend though, right now
I'm in the lounge at the airport...

> It's interesting - I'm only getting about 320mb/s.. I must be hitting a
> similar problem.  I'm not nearly as adept at hacking code to find bugs
> though. :(

320 Mbit/sec seems familiar; this was what I got when I first tried on a
system with a "traditional" PCI bus. Changing the OS to NetBSD on that box
bumped that to 525 Mbit/sec. You need wide PCI (or preferably PCI-X) for
this.

What happens in that case for me is that I run out of CPU resources. Try
running "top" in one window and "netstat 1" in another while bashing the
net with ttcp.

If everything is OK (which it apparently isn't), top will show free CPU,
and netstat should show a *very* steady packet flow (around 90kpps if you
have MTU 1500). Any packet loss is fatal at this speed, so if there is a
way (as indicated by Mike above) to avoid restarting with the window size
from scratch, that will make recovery much better.

My test was done with ttcp and these parameters:

ttcp -s -t -f m -l 61440 -n 20345 dest.host

(tuned for a 10 sec test at 1Gbps).

IMPORTANT NOTE: Several tests here have shown that this is VERY BADLY
affected if you have too much LAN equipment (especially VLAN seems to be
harmful) at the edges. My speed of 960 Mbit/sec fell to 165 just by adding
10 feet of cable and two switches :-(

--Börje