From: David Siebörger
Date: Wed, 6 Jun 2007 09:41:59 +0200
To: freebsd-net@freebsd.org
Subject: bge interfaces: poor transmit performance?

I'm experiencing a problem with BCM5721 bge interfaces, which seem to be
able to receive at almost 1Gbps but can only transmit at <= 560Mbps.

I first noticed the problem on a firewall which routes between a number
of subnets, all connected as VLAN interfaces on bge0. In its case,
transmitted traffic and received traffic should be roughly equal (minus
those packets which the firewall drops), but I've seen that throughput
hits a limit at 560Mbps (measured from the interface stats using slurm).

To investigate further, I've connected two of the same machines with a
flylead and run some iperf tests:

Test: iperf -c172.30.3.x -w256k -t30 -P8

     Client        Server        Result
--------------------------------------------------------
  1. 6.2-STABLE    7.0-CURRENT   499 Mbits/sec
  2. 7.0-CURRENT   6.2-STABLE    526 Mbits/sec
  3. 6.2-STABLE    Linux         500 Mbits/sec
  4. Linux         6.2-STABLE    941 Mbits/sec
  5. Linux         Linux         941 Mbits/sec

Test: iperf -c172.30.3.x -w256k -t30 -P4 -d

     Client        Server        Result
--------------------------------------------------------
  6. 6.2-STABLE    7.0-CURRENT   381 & 388 Mbits/sec
  7. 7.0-CURRENT   6.2-STABLE    369 & 405 Mbits/sec
  8. 6.2-STABLE    Linux         423 & 537 Mbits/sec
  9. Linux         6.2-STABLE    421 & 554 Mbits/sec
 10. Linux         Linux         833 & 830 Mbits/sec

(Hardware: Dell PE860, onboard BCM5721 NICs, 2.4 GHz Xeon 3060 CPU.
FreeBSD tuning: net.inet.ip.fw.enable=0, kern.ipc.maxsockbuf=8192000,
net.inet.tcp.sendspace=262144, net.inet.tcp.recvspace=262144, WITNESS
and INVARIANTS disabled on -CURRENT.
Linux: Knoppix 5.1.1 with kernel 2.6.19, using tg3 driver.)

The most interesting result I see there is the difference between tests
3 and 4: just changing the direction of traffic flow makes a major
difference to performance.

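To make the gap in the unidirectional runs easier to compare, here is a
minimal Python sketch that expresses each of tests 1-5 as a fraction of
the Linux-to-Linux figure. The numbers are copied from the first table;
the names `results`, `BASELINE_MBITS` and `fraction_of_baseline` are
purely illustrative and are not part of iperf or FreeBSD. It assumes the
iperf client is the transmitting side, which is iperf's default.

```python
# Unidirectional iperf results from tests 1-5, in Mbits/sec.
# Keys are (client, server); the client is assumed to be the sender.
results = {
    ("6.2-STABLE", "7.0-CURRENT"): 499,
    ("7.0-CURRENT", "6.2-STABLE"): 526,
    ("6.2-STABLE", "Linux"): 500,
    ("Linux", "6.2-STABLE"): 941,
    ("Linux", "Linux"): 941,
}

# Use the Linux-to-Linux run as the reference for what the hardware can do.
BASELINE_MBITS = results[("Linux", "Linux")]


def fraction_of_baseline(mbits):
    """Return a throughput as a fraction of the Linux-to-Linux result."""
    return mbits / BASELINE_MBITS


for (client, server), mbits in results.items():
    share = fraction_of_baseline(mbits)
    print(f"{client:>12} -> {server:<12} {mbits:4d} Mbits/sec ({share:.0%})")
```

Read this way, every run in which a FreeBSD machine transmits lands at
a little over half of the baseline, while the run in which FreeBSD only
receives (test 4) matches it.
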
Possible causes that (I think) have been eliminated:

* IRQ sharing. 'vmstat -i | grep bge' on all machines looks similar to
  this:
    irq16: bge0                   374558          5
    irq17: bge1                   324860          4

* Hardware architecture. If Linux can make it perform well, there's no
  fundamental PCI bus bandwidth limitation or anything like that.

* Network errors. 'netstat -i' has always shown Ierrs and Oerrs = 0.

* SMP. I've tried building SMP and non-SMP kernels and got almost
  identical results.

* HZ. I've tried kernels with the default HZ=1000 and with HZ=2500.

Does anyone have any ideas as to what could be causing the problem, or
any other tests I could try that might shed light on it?

-- 
David Siebörger
drs@rucus.ru.ac.za