Date: Thu, 23 Mar 2006 00:40:17 -0500
From: "Gary Thorpe" <gthorpe@myrealbox.com>
To: g_jin@lbl.gov
Cc: freebsd-performance@freebsd.org, oxy@field.hu
Subject: Re: packet drop with intel gigabit / marwell gigabit
Message-ID: <1143092417.c7f62afcgthorpe@myrealbox.com>
[No subject in first one, sorry for repost]

Jin Guojun [VFFS] wrote:
> You are far away from the real world. This has been explained a
> million times, just like I teach intern students every summer :-)
>
> First of all, DDR400 and a 200 MHz bus mean nothing -- a DDR266 +
> 500 MHz CPU system can outperform a DDR400 + 1.7 GHz CPU system.

Given the same chipset and motherboard, no: DDR400 has more bandwidth
and lower latency. Given different chipsets/motherboards, this may be
true. However, one could say with equal accuracy that a 500 MHz
processor can outperform the same family running at 1.7 GHz under some
conditions, but few people will rush to buy the 500 MHz part over the
1.7 GHz part for performance alone.

> Another example: the Ixxxx 2 CPU was designed with 3 levels of
> cache. Supposedly
>     level 1 to level 2 takes 5 cycles
>     level 2 to level 3 takes 11 cycles
> What would you expect the CPU-to-memory time (in cycles) to be --
> CPU to level 1 is one cycle? You would expect 17 to 20 cycles in
> total. But it actually takes 210 cycles due to some design issues.
> Now your 1.6 GB/s is reduced to 16 MB/s or even worse based on this
> factor alone.

1.6 GB/s = system bus bandwidth. Cache won't affect this bandwidth.
DDR400 peaks at 3.2 GB/s (200 MHz clock x 2 transfers per clock x 8
bytes), a rate only attainable for long sequential accesses of all
reads or all writes, not a mix of both. DMA should be able to get near
this limit (long and sequential, read-only or write-only per
transfer). A NIC with bus-mastering DMA should be able to use the
memory bandwidth effectively.

> A number of other factors affect memory bandwidth, such as bus
> arbitration. Have you done any memory benchmarks on a system before
> doing such a simple calculation?

No, these are just theoretical values telling you the limits of
performance. I assume a decent implementation can get at least 75% of
the theoretical limit some of the time, under good conditions (like
DMA). A rough way to measure what a machine actually sustains is
sketched below.

> Secondly, DMA moves data from the NIC to an mbuf; then who moves the
> data from the mbuf to the user buffer? Not a human -- it is the CPU.
> While DMA is moving data, can the CPU move data simultaneously? DMA
> takes both I/O bandwidth and memory bandwidth. If your system has
> only 16 MB/s of memory bandwidth, your network throughput is less
> than 8 MB/s, typically below 6.4 MB/s. If you cannot move data away
> from the NIC fast enough, what happens? Packet loss!

True, but would this type of packet loss even be measured by the OS?
Packet loss to the OS means packets were dropped in the software
portion of the network stack, right? That means the NIC has no problem
delivering packets to the OS, and the OS has problems delivering them
to the user process. FreeBSD keeps counters for that kind of drop; see
the second sketch below.

You are arguing that the bandwidth is not sufficient for the processor
to do this copyout (or page loan-out = zero copy, only memory
management tricks), so the software has to drop packets from mbufs
when more packets arrive for UDP. Enough bandwidth is theoretically
available for this (much more than required); it may or may not be
true that the actual sustained bandwidth is insufficient. I don't
think any reasonable (i.e. not junk) system fails to sustain at least
1/4 of its theoretical bandwidth, and that would be plenty here.
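For what it's worth, here is a minimal sketch of such a measurement: a
large memcpy() loop, which is a read/write mix and therefore the
pessimistic case compared to a unidirectional DMA stream. The buffer
size and pass count are arbitrary; they only need to dwarf the caches
and run long enough for gettimeofday() timing to be meaningful.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>

#define BUFSZ	(64 * 1024 * 1024)	/* far larger than any cache */
#define PASSES	10

int
main(void)
{
	char *src = malloc(BUFSZ);
	char *dst = malloc(BUFSZ);
	struct timeval t0, t1;
	double secs;
	int i;

	if (src == NULL || dst == NULL)
		return (1);
	memset(src, 1, BUFSZ);	/* touch the pages so they are resident */
	memset(dst, 0, BUFSZ);

	gettimeofday(&t0, NULL);
	for (i = 0; i < PASSES; i++)
		memcpy(dst, src, BUFSZ);
	gettimeofday(&t1, NULL);

	secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
	/* every pass reads BUFSZ bytes and writes BUFSZ bytes */
	printf("sustained copy bandwidth: %.1f MB/s\n",
	    2.0 * PASSES * BUFSZ / secs / 1e6);
	return (0);
}

If this reports hundreds of MB/s, the "16 MB/s of memory bandwidth"
scenario simply does not apply to the machine in question.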
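And on whether the OS measures it: drops due to full socket buffers
are what "netstat -s -p udp" reports as "dropped due to full socket
buffers", while overruns at the NIC itself show up in the interface
error counters from "netstat -i" instead. A minimal sketch of reading
the same counter programmatically via the net.inet.udp.stats sysctl
(the struct layout comes straight from the system headers):

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/sysctl.h>
#include <netinet/in.h>
#include <netinet/in_systm.h>
#include <netinet/ip.h>
#include <netinet/ip_var.h>
#include <netinet/udp.h>
#include <netinet/udp_var.h>
#include <stdio.h>

int
main(void)
{
	struct udpstat st;
	size_t len = sizeof(st);

	/* same counters netstat -s -p udp prints */
	if (sysctlbyname("net.inet.udp.stats", &st, &len, NULL, 0) == -1) {
		perror("sysctlbyname");
		return (1);
	}
	printf("datagrams received:       %lu\n", st.udps_ipackets);
	printf("dropped, socket buf full: %lu\n", st.udps_fullsock);
	return (0);
}

If udps_fullsock climbs during a test, the bottleneck is the reader
failing to drain its socket buffer, not the NIC or the memory system.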
> That is why his CPU utilization was low: there was not much data
> crossing the CPU. That is why I asked him first for the CPU
> utilization, then for the chipset. These are the basic steps in
> diagnosing network performance. If you know the CPU and chipset of a
> system, you will know the network performance ceiling for that
> system, guaranteed. But that does not guarantee you can get that
> ceiling performance, especially over OC-12 (622 Mb/s) high-speed
> networks. That requires intensive tuning knowledge for the current
> TCP stack, which is well explained on the Internet by searching for
> "TCP tuning".

In this case, bandwidth should not be the limiting factor: 16 MB/s is
low (disks regularly sustain double that), the 1 Gb/s NIC is nowhere
near fully used (< 40 MB/s), and the processor is mostly idle.
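On the "TCP tuning" point, most of it boils down to sizing the socket
buffers to the bandwidth-delay product. For OC-12 at 622 Mb/s with,
say, a 50 ms RTT (the RTT is an assumed figure for illustration), that
is 622e6 / 8 * 0.05 ~= 3.9 MB in flight, far beyond the default
net.inet.tcp.recvspace. A minimal sketch, assuming kern.ipc.maxsockbuf
has also been raised, since it caps what you can request:

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdio.h>

int
main(void)
{
	/*
	 * 622 Mb/s x 50 ms RTT ~= 3.9 MB in flight, so request 4 MB
	 * buffers (the 50 ms RTT is an assumption, not a measurement).
	 */
	int s = socket(AF_INET, SOCK_STREAM, 0);
	int sz = 4 * 1024 * 1024;

	if (s == -1)
		return (1);
	if (setsockopt(s, SOL_SOCKET, SO_RCVBUF, &sz, sizeof(sz)) == -1)
		perror("SO_RCVBUF");	/* kern.ipc.maxsockbuf too low? */
	if (setsockopt(s, SOL_SOCKET, SO_SNDBUF, &sz, sizeof(sz)) == -1)
		perror("SO_SNDBUF");
	return (0);
}

Whether this helps depends on the RTT of the actual path; on a LAN the
defaults may already be enough.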