From: "Jin Guojun [DSD]"
Date: Wed, 09 Apr 2003 17:23:47 -0700
To: Sean Chittenden
cc: freebsd-hackers@freebsd.org
cc: freebsd-performance@freebsd.org
Subject: Re: tcp_output starving -- is due to mbuf get delay?
Message-ID: <3E94B993.D282DEB2@lbl.gov>
References: <3E94A22D.174321F0@lbl.gov> <20030409230733.GX79923@perrin.int.nxad.com>

The interesting result: in the normal FreeBSD TCP stack, the large delay
goes away, but there are more small delays.

Apr 9 17:10:01 tcp_lion /kernel: sosend: td 23 3424 (bumped up from 1920)

The performance dropped badly, to 92 Mb/s (no loss).

In my new TCP, I did not see any delay above 5 us, which is good, but the
overall TCP performance dropped to 120 Mb/s.
So, maybe either there are a lot of delays below 5 us and around 100 ns,
or some other bottleneck is triggered somewhere.

I guess there is more work to do to determine what is going on :-(
I will post whatever I discover.

Thanks for pointing to the patch.

-Jin

Sean Chittenden wrote:

> > When testing a GigE path that has a 67 ms RTT, the maximum TCP
> > throughput is limited to 250 Mb/s. By tracing the problem, I found
> > that tcp_output() is starving while snd_wnd and snd_cwnd are fully
> > open. The snd_cc is never filled beyond 4.05MB even though the
> > snd_hiwat is 10MB and snd_sbmax is 8MB. That is, sosend never stopped
> > at sbwait. So the only place that can slow things down is the mbuf
> > allocation in sosend(). The attached trace file shows that each MGET
> > and MCLGET takes significant time -- around 8 us at slow start time,
> > and gradually increasing after that in a range of 18 to 648.
> > Each packet Tx on GigE takes 12 us. If the average mbuf allocation
> > takes 18 us, then the performance will be reduced to 40%; in fact it
> > is down to 25%, which means a higher average delay.
> >
> > I have changed NMBCLUSTER from 2446 to 6566 to 10240, and nothing
> > improved.
> >
> > Can anyone tell what factors would cause MGET / MCLGET to wait?
> > Is there any way to make MGET/MCLGET not wait?
>
> Luigi posted a patch about this a while back (last summer sometime,
> iirc).
>
> http://people.freebsd.org/~seanc/patches/#o1_mbuf_lookup
>
> I updated his patch but haven't had a chance to test it. If you're
> feeling brave, see if applying this patch fixes this bottleneck. -sc
>
> --
> Sean Chittenden

--
------------ Jin Guojun ----------- v --- j_guojun@lbl.gov ---
Distributed Systems Department http://www.itg.lbl.gov/~jin
M/S 50B-2239 Ph#:(510) 486-7531 Fax: 486-6363
Lawrence Berkeley National Laboratory, Berkeley, CA 94720
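
The 40% and 25% figures in the quoted message follow from simple
per-packet arithmetic, which can be sketched in C as below. This is only
a rough model, not anything from the kernel: it assumes allocation time
and wire time are strictly serialized for every packet, and the function
names (throughput_fraction, implied_overhead_us, bdp_bytes) are made up
for illustration.

```c
#include <assert.h>

/*
 * Fraction of line rate achieved when every packet costs alloc_us of
 * allocation time on top of wire_us of transmit time. Assumes the two
 * costs never overlap (a simplification, not a measured property).
 */
double throughput_fraction(double wire_us, double alloc_us)
{
    assert(wire_us > 0.0 && alloc_us >= 0.0);
    return wire_us / (wire_us + alloc_us);
}

/*
 * Inverse of the above: the per-packet overhead that would explain an
 * observed fraction of line rate under the same serialized model.
 */
double implied_overhead_us(double wire_us, double fraction)
{
    assert(wire_us > 0.0 && fraction > 0.0 && fraction <= 1.0);
    return wire_us * (1.0 / fraction - 1.0);
}

/*
 * Bandwidth-delay product in bytes: the amount of data that must be in
 * flight to keep a path of rate_bps bits/s and rtt_s seconds RTT full.
 */
double bdp_bytes(double rate_bps, double rtt_s)
{
    assert(rate_bps > 0.0 && rtt_s > 0.0);
    return rate_bps * rtt_s / 8.0;
}
```

With the numbers from the thread, throughput_fraction(12.0, 18.0) gives
0.4, and implied_overhead_us(12.0, 0.25) gives 36, so under this model
the observed 250 Mb/s would correspond to roughly 36 us of overhead per
packet on average. For comparison, bdp_bytes(1e9, 0.067) is 8375000
bytes, about twice the 4.05MB that snd_cc reaches.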