From: Andre Oppermann <andre@freebsd.org>
Date: Mon, 29 Nov 2004 16:15:48 +0100
To: "David G. Lawrence"
Cc: Jeremie Le Hen, Robert Watson, freebsd-current@freebsd.org,
 freebsd-stable@freebsd.org
Subject: Re: serious networking (em) performance (ggate and NFS) problem

"David G. Lawrence" wrote:
>
> > >> tests. With the re driver, the only change was moving a setup with
> > >> no packet loss from 100BT to gigE (both Linksys switches), which
> > >> causes serious packet loss at 20Mbps data rates. I have discovered
> > >> that the only way to get good performance with no packet loss was
> > >> to
> > >>
> > >> 1) remove interrupt moderation, and
> > >> 2) defrag each mbuf that comes into the driver.
> > >
> > > Sounds like you're bumping into a queue limit that is made worse by
> > > interrupting less frequently, resulting in bursts of packets that
> > > are relatively large, rather than a trickle of packets at a higher
> > > rate. Perhaps a limit on the number of outstanding descriptors in
> > > the driver or hardware, and/or a limit in the netisr/ifqueue queue
> > > depth. You might try changing the default IFQ_MAXLEN from 50 to 128
> > > to increase the size of the ifnet and netisr queues. You could also
> > > try setting net.isr.enable=1 to enable direct dispatch, which in the
> > > in-bound direction would reduce the number of context switches and
> > > queueing. It sounds like the device driver has a limit of 256
> > > receive and transmit descriptors, which one supposes is derived from
> > > the hardware limit, but I have no documentation on hand so can't
> > > confirm that.
> > >
> > > It would be interesting on the send and receive sides to inspect the
> > > counters for drops at various points in the network stack; i.e., are
> > > we dropping packets at the ifq handoff because we're overfilling the
> > > descriptors in the driver, and are packets dropped on the inbound
> > > path going into the netisr due to over-filling before the netisr is
> > > scheduled? It's probably also interesting to look at stats on
> > > filling the socket buffers for the same reason: if bursts of packets
> > > come up the stack, the socket buffers could well be over-filled
> > > before the user thread can run.
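[An aside for anyone who wants to try Robert's two suggestions on a
5.3-era box: they amount to roughly the following. Treat the exact
sysctl names as from memory: net.isr.enable is the direct-dispatch
switch Robert names, and net.inet.ip.intr_queue_maxlen is, if I recall
correctly, the runtime knob for the IP netisr queue depth. IFQ_MAXLEN
itself is a compile-time constant in sys/net/if.h, so raising that one
means a kernel rebuild.]

  # direct dispatch of inbound packets, bypassing the netisr queue
  sysctl net.isr.enable=1

  # deepen the IP input queue at runtime (default 50, i.e. IFQ_MAXLEN)
  sysctl net.inet.ip.intr_queue_maxlen=128

  # check whether that queue has been dropping packets
  sysctl net.inet.ip.intr_queue_drops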
> > I think it's the tcp_output() path that overflows the transmit side
> > of the card. I take that from the better numbers he gets when he
> > defrags the packets. Once I catch up with my mail I'll start to put
> > up the code I wrote over the last two weeks. :-) You can call me
> > Mr. TCP now. ;-)
>
> He was doing his test with NFS over TCP, right? That would be a single
> connection, so how is it possible to 'overflow the transmit side of
> the card'? The TCP window size will prevent more than 64KB from being
> outstanding. Assuming standard-size ethernet frames, that would be a
> maximum of 45 packets in flight at any time (65536/1460 = 45), well
> below the 256 available transmit descriptors.
>
> It is also worth pointing out that 45 full-size packets take only
> about 540us at gig-e speeds (45 * 1500 bytes * 8 bits/byte at 1Gbps =
> 540us). Even when you add in typical switch latencies and interrupt
> overhead and coalescing on both sides, it's hard to imagine that the
> window size (bandwidth * delay) would be a significant limiting factor
> across a gig-e LAN.

For some reason he is getting long mbuf chains, and that is why a call
to m_defrag() helps. With long mbuf chains you can easily overflow the
transmit descriptors; see the sketch at the end of this mail.

> I too am seeing low NFS performance (both TCP and UDP) with non-SMP
> 5.3, but on the same systems I can measure raw TCP performance (using
> ttcp) of >850Mbps. It looks to me like there is something wrong with
> NFS, perhaps caused by delays with scheduling nfsd?

-- 
Andre
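PS: To make the long-chain point concrete, here is a minimal sketch of
the driver-side workaround, i.e. falling back to m_defrag(9) when an
outgoing chain carries more fragments than the hardware will take in
one set of descriptors. This is not the real em(4) code: xx_encap(),
struct xx_softc and XX_MAX_SEGS are made-up names, and the real fix
would of course be on the tcp_output() side, so the chains never get
that long in the first place.

  #include <sys/param.h>
  #include <sys/systm.h>
  #include <sys/errno.h>
  #include <sys/mbuf.h>

  #define XX_MAX_SEGS   32      /* hypothetical per-packet segment limit */

  struct xx_softc;              /* stand-in for the real driver softc */

  static int
  xx_encap(struct xx_softc *sc, struct mbuf **m_headp)
  {
          struct mbuf *m, *m_new;
          int nsegs;

          /* Count the fragments in the outgoing chain. */
          nsegs = 0;
          for (m = *m_headp; m != NULL; m = m->m_next)
                  nsegs++;

          if (nsegs > XX_MAX_SEGS) {
                  /* Copy the chain into as few mbufs/clusters as possible. */
                  m_new = m_defrag(*m_headp, M_DONTWAIT);
                  if (m_new == NULL) {
                          /* Couldn't defrag: drop rather than wedge the ring. */
                          m_freem(*m_headp);
                          *m_headp = NULL;
                          return (ENOBUFS);
                  }
                  *m_headp = m_new;
          }

          /* ... bus_dmamap_load_mbuf() and descriptor setup go here ... */
          return (0);
  }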