Date: Fri, 19 Nov 2004 13:18:40 +0100
From: Emanuel Strobl <Emanuel.Strobl@gmx.net>
To: freebsd-current@freebsd.org
Cc: Robert Watson <rwatson@freebsd.org>
Subject: Re: serious networking (em) performance (ggate and NFS) problem
Message-ID: <200411191318.46405.Emanuel.Strobl@gmx.net>
In-Reply-To: <Pine.NEB.3.96L.1041118121834.66045B-100000@fledge.watson.org>
References: <Pine.NEB.3.96L.1041118121834.66045B-100000@fledge.watson.org>
On Thursday, 18 November 2004 at 13:27, Robert Watson wrote:
> On Wed, 17 Nov 2004, Emanuel Strobl wrote:
> > I really love 5.3 in many ways but here're some unbelievable transfer

First, thanks a lot to all of you for paying attention to my problem again.
I'll use this as a cumulative answer to your many postings, first answering
Robert's questions and then, at the bottom, those of the others.

I changed cables and couldn't reproduce those bad results, so I changed the
cables back, but still cannot reproduce them; the ggate write in particular,
formerly at 2.6 MB/s, now performs at 15 MB/s. I haven't done any more
polling tests, just interrupt-driven ones, since Matt explained that em
doesn't benefit from polling in any way.

The results don't indicate a serious problem now, but they are still only
about a third of what I'd expected from my hardware. Do I really need
gigahertz-class CPUs to transfer 30 MB/s over GbE?

> I think the first thing you want to do is to try and determine whether the
> problem is a link layer problem, network layer problem, or application
> (file sharing) layer problem. Here's where I'd start looking:
>
> (1) I'd first off check that there wasn't a serious interrupt problem on
>     the box, which is often triggered by ACPI problems. Get the box to be
>     as idle as possible, and then use vmstat -i or systat -vmstat to see
>     if anything is spewing interrupts.

Everything is fine there.

> (2) Confirm that your hardware is capable of the desired rates: typically
>     this involves looking at whether you have a decent card (most if_em
>     cards are decent), whether it's 32-bit or 64-bit PCI, and so on. For
>     unidirectional send on 32-bit PCI, be aware that it is not possible to
>     achieve gigabit performance because the PCI bus isn't fast enough, for
>     example.

I'm aware that my 32-bit/33 MHz PCI bus is a "bottleneck", but I saw almost
80 MByte/s running over the bus to my test stripe set (via the HPT372), so
I'm pretty sure the system is good for 40 MB/s over the GbE line, which
would be sufficient for me.

> (3) Next, I'd use a tool like netperf (see ports collection) to establish
>     three characteristics: round trip latency from user space to user
>     space (UDP_RR), TCP throughput (TCP_STREAM), and large packet
>     throughput (UDP_STREAM). With decent boxes on 5.3, you should have no
>     trouble at all maxing out a single gig-e with if_em, assuming all is
>     working well hardware wise and there's no software problem specific to
>     your configuration.

Please find the results at http://www.schmalzbauer.de/document.php?id=21,
along with a lot of additional information and more test results. (The kind
of netperf runs involved are sketched further below.)

> (4) Note that router latency (and even switch latency) can have a
>     substantial impact on gigabit performance, even with no packet loss,
>     in part due to stuff like ethernet flow control. You may want to put
>     the two boxes back-to-back for testing purposes.

I was aware of that, and since I don't have a GbE switch anyway, I decided
to use a simple cable ;)

> (5) Next, I'd measure CPU consumption on the end box -- in particular, use
>     top -S and systat -vmstat 1 to compare the idle condition of the
>     system and the system under load.

I added these values to the netperf results as well.
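For reference, the three netperf tests Robert listed boil down to
invocations roughly like the following. This is only a sketch: the hostname
"otherbox" is a placeholder for whatever machine runs netserver on the
other end of the link (both programs come from the benchmarks/netperf port):

    # round-trip latency, user space to user space
    netperf -H otherbox -t UDP_RR
    # TCP bulk throughput
    netperf -H otherbox -t TCP_STREAM
    # large-packet UDP throughput
    netperf -H otherbox -t UDP_STREAM

    # meanwhile, on the box under test, watch CPU and interrupt load
    top -S
    systat -vmstat 1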
> If you determine there is a link layer or IP layer problem, we can start
> digging into things like the error statistics in the card, negotiation
> issues, etc. If not, you want to move up the stack to try and
> characterize where it is you're hitting the performance issue.

On Thursday, 18 November 2004 at 17:53, M. Warner Losh wrote:
> In message: <Pine.NEB.3.96L.1041118121834.66045B-100000@fledge.watson.org>
>             Robert Watson <rwatson@freebsd.org> writes:
> : (1) I'd first off check that there wasn't a serious interrupt problem on
> :     the box, which is often triggered by ACPI problems. Get the box to
> :     be as idle as possible, and then use vmstat -i or systat -vmstat to
> :     see if anything is spewing interrupts.
>
> Also, make sure that you aren't sharing interrupts between
> GIANT-LOCKED and non-giant-locked cards. This might be exposing bugs
> in the network layer that debug.mpsafenet=0 might correct. Just
> noticed that our setup here does exactly that, so I'll be looking into
> that area of things.

As you can see at the link above, there are no shared IRQs.
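In case it helps anyone else following the thread, checking for shared IRQs
and trying Warner's suggestion looks roughly like this on 5.x. The em0/uhci0
pairing in the comment is only an example of what a shared line can look
like, not my actual configuration:

    # a shared interrupt shows up as two devices on one irq line, e.g.
    #   irq16: em0 uhci0
    vmstat -i

    # see which driver is attached to which PCI device
    pciconf -lv

    # debug.mpsafenet is a boot-time tunable on 5.x, so to try Warner's
    # suggestion it goes into /boot/loader.conf and needs a reboot:
    #   debug.mpsafenet="0"
    sysctl debug.mpsafenet    # shows the currently active setting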