Date: Sun, 21 Nov 2004 16:38:07 -0800 (PST)
From: Matthew Dillon
Message-Id: <200411220038.iAM0c7JQ052589@apollo.backplane.com>
To: Shunsuke SHINOMIYA
References: <20041119185315.C43D.SHINO@fornext.org> <20041121205158.45CE.SHINO@fornext.org>
cc: freebsd-stable@freebsd.org
cc: freebsd-current@freebsd.org
cc: Jeremie Le Hen
Subject: Re: Re[2]: serious networking (em) performance (ggate and NFS) problem

: I did simple benchmark at some settings.
:
: I used two boxes which are single Xeon 2.4GHz with on-boarded em.
: I measured a TCP throughput by iperf.
:
: These results show that the throughput of TCP increased if Interrupt
:Moderation is turned OFF. At least, adjusting these parameters affected
:TCP performance. Other appropriate combination of parameter may exist.

Very interesting, but the only reason you get lower results is simply
because the TCP window is not big enough. That's it.

8000 ints/sec = ~15KB of backlogged traffic. x 2 (sender, receiver)

Multiply by two (both the sender's reception of acks and the receiver's
reception of data) and you get ~30KB. This is awfully close to the
default 32.5KB window size that iperf uses.

Other than window sizing issues I can think of no rational reason why
throughput would be lower. Can you? And, in fact, when I do the same
tests on DragonFly and play with the interrupt throttle rate I get
nearly the results I expect.

* Shuttle Athlon 64 3200+ box, EM card in 32 bit PCI slot
* 2 machines connected through a GiGE switch
* All other hw.em0 delays set to 0 on both sides
* throttle settings set on both sides
* -w option set on iperf client AND server for 63.5KB window
* software interrupt throttling has been turned off for these tests

throttle   result          result
freq       (32.5KB win)    (63.5KB win)
           (default)
--------   ------------    ------------
maxrate    481 MBit/s      533 MBit/s    (not sure what's going on here)
120000     518 MBit/s      558 MBit/s    (not sure what's going on here)
100000     613 MBit/s      667 MBit/s    (not sure what's going on here)
70000      679 MBit/s      691 MBit/s
60000      668 MBit/s      694 MBit/s
50000      678 MBit/s      684 MBit/s
40000      694 MBit/s      696 MBit/s
30000      694 MBit/s      696 MBit/s
20000      698 MBit/s      703 MBit/s
10000      707 MBit/s      716 MBit/s
9000       708 MBit/s      716 MBit/s
8000       710 MBit/s      717 MBit/s    <--- drop off pt 32.5KB win
7000       683 MBit/s      716 MBit/s
6000       680 MBit/s      720 MBit/s
5000       652 MBit/s      718 MBit/s    <--- drop off pt 63.5KB win
4000       555 MBit/s      695 MBit/s
3000       522 MBit/s      533 MBit/s    <--- GiGE throttling likely
2000       449 MBit/s      384 MBit/s         (256 ring descriptors =
1000       260 MBit/s      193 MBit/s          2500 hz minimum)

Unless you are in a situation where you need to route small packets
flying around a cluster where low latency is important, it doesn't
really make any sense to turn off interrupt throttling.
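
As a rough check of the backlog arithmetic above, the following Python sketch computes how much data arrives between two interrupts and what window that implies once it is doubled for the sender and receiver sides. It is only an illustration under stated assumptions: a fully utilized 1 Gbit/s link, 1 KB = 1024 bytes, and no protocol overhead. The function names are made up for this example. The `itr_to_hz` helper covers the ITR conversion mentioned later in this message, using the 256 ns unit quoted there.

```python
# Back-of-the-envelope check of the figures in this message.
# Assumptions (not from the original): 1 Gbit/s line rate fully used,
# 1 KB = 1024 bytes, no protocol overhead. All names are hypothetical.

LINE_RATE_BITS_PER_SEC = 1_000_000_000  # GigE
ITR_UNIT_NS = 256                       # ITR granularity quoted in the message


def backlog_bytes(ints_per_sec: float) -> float:
    """Bytes that arrive at line rate between two consecutive interrupts."""
    return LINE_RATE_BITS_PER_SEC / 8 / ints_per_sec


def min_window_kb(ints_per_sec: float) -> float:
    """Rough window estimate: one interval of data plus one interval of acks."""
    return 2 * backlog_bytes(ints_per_sec) / 1024


def itr_to_hz(itr: int) -> float:
    """Interrupt rate for a nonzero ITR value expressed in 256 ns units."""
    return 1e9 / (itr * ITR_UNIT_NS)


if __name__ == "__main__":
    for rate in (10000, 8000, 5000, 4000):
        print(f"{rate:>6} ints/sec: "
              f"{backlog_bytes(rate) / 1024:5.1f} KB per interval, "
              f"~{min_window_kb(rate):5.1f} KB window")
    print(f"ITR=1 -> {itr_to_hz(1):.0f} Hz")
```

Under these assumptions, 8000 ints/sec works out to about 30.5 KB, just under the 32.5KB default window, and 4000 ints/sec to about 61 KB, just under the 63.5KB window. Both line up with the drop-off points marked in the table.
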
It might make sense to change the default from 8000 to 10000 to handle
typical default TCP window sizes (at least in a LAN situation), but it
certainly should not be turned off.

I got some weird results when I increased the frequency past 100KHz, and
when I turned throttling off entirely. I'm not sure why. Maybe setting
the ITR register to 0 is a bad idea. If I set it to 1 (i.e. 3906250 Hz)
then I get 625 MBit/s. Setting the ITR to 1 (i.e. 256ns delay) should
amount to the same thing as setting it to 0 but it doesn't. Very odd.
The maximum interrupt rate as reported by systat is only ~46000 ints/sec
so all the values above 50KHz should read about the same... and they do
until we hit around 100KHz (10uS delay). Then everything goes to hell in
a handbasket.

Conclusion: 10000 hz would probably be a better default than 8000 hz.

-Matt
Matthew Dillon