From: "Jin Guojun [VFFS]" <g_jin@lbl.gov>
Date: Tue, 21 Mar 2006 20:28:15 -0800
To: Gary Thorpe
Cc: freebsd-performance@freebsd.org, oxy@field.hu
Subject: Re: packet drop with intel gigabit / marwell gigabit
Message-ID: <4420D25F.6050203@lbl.gov>
In-Reply-To: <1142966555.c7f9603cgthorpe@myrealbox.com>

You are far away from the real world. This has been explained a million
times, just as I teach the intern students every summer :-)

First of all, DDR400 and a 200 MHz bus mean nothing by themselves -- a
DDR266 + 500 MHz CPU system can outperform a DDR400 + 1.7 GHz CPU
system. Another example: the Ixxxx 2 CPU was designed with three levels
of cache. Suppose level 1 to level 2 takes 5 cycles, and level 2 to
level 3 takes 11 cycles. What would you expect the CPU-to-memory time to
be, in cycles? If CPU to level 1 is one cycle, you would expect about 17
to 20 cycles in total. But it actually takes 210 cycles, due to some
design issues. Now your 1.6 GB/s is reduced to 16 MB/s or even worse,
based on this factor alone. A number of other factors affect memory
bandwidth as well, such as bus arbitration. Have you done any memory
benchmark on a system before doing such a simple calculation? (A sketch
of one follows below.)

Secondly, DMA moves data from the NIC into mbufs, but then who moves the
data from the mbufs to the user buffer? Not a human -- the CPU does. And
while the DMA engine is moving data, can the CPU move data at the same
time? DMA takes both I/O bandwidth and memory bandwidth. If your system
has only 16 MB/s of memory bandwidth, your network throughput is less
than 8 MB/s, typically below 6.4 MB/s. If you cannot move data away from
the NIC fast enough, what happens? Packet loss! That is why his CPU
utilization was low: not much data was crossing the CPU. So that is why
I asked him for the CPU utilization first, and then the chipset. These
are the basic steps in diagnosing network performance.

If you know the CPU and chipset of a system, you will know the network
performance ceiling for that system, guaranteed. But that does not
guarantee you can reach that ceiling, especially over OC-12 (622 Mb/s)
and faster networks. That requires intensive tuning knowledge for the
current TCP stack, which is well explained on the Internet; search for
"TCP tuning".
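By "memory benchmark" I mean even something as crude as timing a large
copy. Below is a minimal sketch in C, assuming nothing beyond libc; real
tools such as STREAM or lmbench do this much more carefully, and the
64 MB buffer and 10 repetitions are arbitrary choices of mine:

    /* crude memory-copy bandwidth probe -- a sketch, not STREAM */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>

    int main(void)
    {
        size_t len = 64 * 1024 * 1024;   /* well past any cache size */
        int i, reps = 10;
        char *src = malloc(len), *dst = malloc(len);
        clock_t t0;
        double secs;

        if (src == NULL || dst == NULL)
            return 1;
        memset(src, 1, len);             /* touch the pages first */
        memset(dst, 0, len);

        t0 = clock();
        for (i = 0; i < reps; i++)
            memcpy(dst, src, len);
        secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

        /* each pass reads len bytes and writes len bytes */
        printf("~%.0f MB/s copy bandwidth\n",
               (double)reps * 2.0 * len / secs / 1e6);
        return 0;
    }

If the number this prints is nowhere near the DDR400 datasheet figure,
the peak-bandwidth argument collapses -- which is exactly the point.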
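(To spell out the DMA arithmetic above, as I read it: every received
byte crosses the memory bus at least twice -- once when the NIC's DMA
engine writes it into an mbuf, and again when the CPU copies it from the
mbuf to the user buffer. Two crossings of a 16 MB/s bus already cap
throughput at 16 / 2 = 8 MB/s, and since the copy is really a read plus
a write, and the bus serves everything else in the machine too,
something like 6.4 MB/s is the realistic figure.)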
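For the "TCP tuning" part, the single biggest knob over a long OC-12
path is usually the socket buffer, which has to cover the
bandwidth-delay product. Here is a minimal sketch of the application
side; the 50 ms round-trip time is an example number of mine, not a
measurement:

    /* size the socket buffer to the path's bandwidth-delay product.
     * A sketch: the OC-12 rate is from the discussion above, the
     * 50 ms RTT is an assumed example. */
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/socket.h>

    int main(void)
    {
        double bw = 622e6 / 8;           /* OC-12 payload, bytes/s */
        double rtt = 0.050;              /* assumed round-trip time */
        int bufsz = (int)(bw * rtt);     /* ~3.9 MB window needed */
        int s = socket(AF_INET, SOCK_STREAM, 0);

        if (s < 0)
            return 1;
        /* on FreeBSD this fails unless kern.ipc.maxsockbuf allows it */
        if (setsockopt(s, SOL_SOCKET, SO_RCVBUF,
                       &bufsz, sizeof(bufsz)) < 0)
            perror("setsockopt");
        return 0;
    }

Window scaling (net.inet.tcp.rfc1323) must be enabled for a window that
large, and kern.ipc.maxsockbuf plus the net.inet.tcp.sendspace/recvspace
defaults usually need raising as well; the "TCP tuning" pages found by
the search above cover those sysctls.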
-Jin

Gary Thorpe wrote:

> I thought all modern NICs used bus mastering DMA, i.e. not dependent
> on the CPU for data transfers? In addition, the available memory
> bandwidth for modern CPUs/systems is well over 100 MB/s. DDR400 is
> 400 MB/s (megabytes per second). Bus mastering DMA will be limited
> primarily by the memory or I/O bus bandwidth. The system bus bandwidth
> cannot be the problem either: his motherboard's lowest front side bus
> speed is 200 MHz * 64-bit width = 1.6 GB/s (gigabytes per second) of
> peak system bus bandwidth.
>
> The limitation of 32-bit/33 MHz PCI is 133 MB/s (again, megabytes not
> bits) maximum. Gigabit ethernet requires 125 MB/s (not Mb/s) maximum
> bandwidth: 32/33 PCI has enough for bursts, but bus contention with
> disk bandwidth will reduce the sustained bandwidth. The motherboard in
> question has an option for integrated gigabit LAN, which may bypass
> the shared PCI bus altogether (or it might not).
>
> Anyway, the original problem was packet loss, not bandwidth. His CPU
> is mostly idle, so that cannot be the reason for packet loss. If 32/33
> PCI can sustain 133 MB/s then it cannot be a problem, because he needs
> less than this. If it cannot, then packets will arrive from the
> network faster than they can be moved from the board into memory, and
> that would cause the packet loss. Otherwise, his system is capable of
> achieving what he wants in theory, and the suboptimal behavior may be
> due to hardware (e.g. PCI bus bandwidth not being able to reach
> 133 MB/s sustained) or software limitations (e.g. an inefficient
> operating system).
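(Checking the bus arithmetic in the quoted message: 32-bit/33 MHz PCI
moves 4 bytes * 33e6 transfers/s ~= 133 MB/s peak, and a saturated
gigabit Ethernet link needs 1e9 b/s / 8 = 125 MB/s, so on paper the
shared bus fits exactly one busy GigE port, with almost nothing left
over for disk traffic or arbitration overhead.)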