From owner-freebsd-net@FreeBSD.ORG Sun Jan 21 07:09:45 2007
From: Max Laier
Organization: FreeBSD
To: freebsd-net@freebsd.org
Date: Sun, 21 Jan 2007 08:09:19 +0100
Subject: Re: slow writes on nfs with bge devices
Message-Id: <200701210809.27770.max@love2party.net>
References: <20070121155510.C23922@delplex.bde.org>
In-Reply-To: <20070121155510.C23922@delplex.bde.org>
User-Agent: KMail/1.9.5
List-Id: Networking and TCP/IP with FreeBSD
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="nextPart16990224.mgHh93Ab6g";
  protocol="application/pgp-signature"; micalg=pgp-sha1
Content-Transfer-Encoding: 7bit

--nextPart16990224.mgHh93Ab6g
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

On Sunday 21 January 2007 07:25, Bruce Evans wrote:
> nfs writes much less well with bge NICs than with other NICs (sk, fxp,

Do you use hardware checksumming on the bge? There is an XXX in
bge_start_locked() that looks a bit suspicious to me.

> xl, even rl). Sometimes writing a 20K source file from vi seems to
> take about 2 seconds instead of seeming to be instantaneous (this gets
> faster as the system warms up). Iozone shows the problem more
> reproducibly. E.g.:
>
> 100Mbps fxp server -> 1Gbps bge 5701 client, udp:
> %%%
> IOZONE: Performance Test of Sequential File I/O -- V1.16 (10/28/92)
> By Bill Norcott
>
> Operating System: FreeBSD -- using fsync()
>
> IOZONE: auto-test mode
>
> MB      reclen  bytes/sec written   bytes/sec read
> 1       512     1516885             291918639
> 1       1024    1158783             491354263
> 1       2048    1573651             715694105
> 1       4096    1223692             917431957
> 1       8192    729513              1097929467
> 2       512     1694809             281196631
> 2       1024    1379228             507917189
> 2       2048    1659521             789608264
> 2       4096    4606056             1064567574
> 2       8192    1142288             1318131028
> 4       512     1242214             298269971
> 4       1024    1853545             492110628
> 4       2048    2120136             742888430
> 4       4096    1896792             1121799065
> 4       8192    850210              1441812403
> 8       512     1563847             281422325
> 8       1024    1480844             492749552
> 8       2048    1658649             850165954
> 8       4096    2105283             1211348180
> 8       8192    2098425             1554875506
> 16      512     1508821             296842294
> 16      1024    1966239             527850530
> 16      2048    2036609             842656736
> 16      4096    1666138             1200594889
> 16      8192    2293378             1620824908
> Completed series of tests
> %%%
>
> Here bge barely reaches 10Mbps speeds (~1.2 MB/S) for writing. Reading
> is cached well and fast. 100Mbps xl on the same client with the same
> server goes at full 100Mbps speed (11.77 MB/S for all file sizes
> including larger ones since the disk is not the limit at 100Mbps).
> 1Gbps sk on a different client with the same server goes at full
> 100Mbps speed.
>
> Switching to tcp gives full 100 Mbps speed.
> However, when the bge link
> speed is reduced to 100Mbps, udp becomes about 10 times slower than the
> above and tcp becomes about as slow as the above (maybe a bit faster,
> but far below 11.77 MB/S).
>
> bge is also slow at nfs serving:
>
> 1Gbps bge 5701 server -> 1Gbps sk client:
> %%%
>
> IOZONE: Performance Test of Sequential File I/O -- V1.16 (10/28/92)
> By Bill Norcott
>
> Operating System: FreeBSD -- using fsync()
>
> IOZONE: auto-test mode
>
> MB      reclen  bytes/sec written   bytes/sec read
> 1       512     36255350            242114472
> 1       1024    3051699             413319147
> 1       2048    22406458            632021710
> 1       4096    22447700            851162198
> 1       8192    3522493             1047562648
> 2       512     3270779             48125247
> 2       1024    28992179            46693718
> 2       2048    5956380             753318255
> 2       4096    27616650            1053311658
> 2       8192    5573338             48290208
> 4       512     9004770             47435659
> 4       1024    9576276             45601645
> 4       2048    30348874            85116667
> 4       4096    8635673             86150049
> 4       8192    9356773             47100031
> 8       512     9762446             46424146
> 8       1024    10054027            58344604
> 8       2048    9197430             60253061
> 8       4096    15934077            59476759
> 8       8192    8765470             47647937
> 16      512     5670225             46239891
> 16      1024    9425169             45950990
> 16      2048    9833515             46242945
> 16      4096    14812057            51313693
> 16      8192    9203742             47648722
> Completed series of tests
> %%%
>
> Now the available bandwidth is 10 times larger and about 9/10 of it is
> still not used, with a high variance. For larger files, the variance
> is lower and the average speed is about 10MB/S. The disk can only do
> about 40MB/S and the slowest of the 1Gbps NICS (sk) can only sustain
> 80MB/S through udp and about 50MB/S through tcp (it is limited by the
> 33 MHz 32-bit PCI bus and by being less smart than the bge interface).
> When the bge NIC was on the system which is now the server with the fxp
> NIC, bge and nfs worked unsurprisingly, just slower than I would have
> liked. The write speed was 20-30MB/S for large files and 30-40MB/S for
> medium-sized files, with low variance. This is the only configuration
> in which nfs/bge worked as expected.
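
To put the server-side write figures quoted above in perspective, here is a
minimal Python sketch that averages the five 16 MB write results and compares
them with the raw 1 Gbps line rate. The constant LINK_BYTES_PER_SEC (1e9 / 8,
ignoring all protocol overhead) and the function name are assumptions made
for illustration only; the sample values are copied from the table.

```python
# bytes/sec written for the 16 MB rows of the
# "1Gbps bge 5701 server -> 1Gbps sk client" iozone run quoted above.
WRITES_16MB = [5670225, 9425169, 9833515, 14812057, 9203742]

# Assumption: raw 1 Gbps line rate in bytes/sec, ignoring protocol overhead.
LINK_BYTES_PER_SEC = 1e9 / 8


def mean_and_utilization(samples, link=LINK_BYTES_PER_SEC):
    """Return (mean throughput in MB/s, fraction of the link used)."""
    mean = sum(samples) / len(samples)
    return mean / 1e6, mean / link


mb_per_sec, used = mean_and_utilization(WRITES_16MB)
print(f"mean write: {mb_per_sec:.2f} MB/s, {used:.1%} of the raw line rate")
```

The mean should land close to the "about 10MB/S" quoted for larger files,
well under a tenth of what the link could carry in principle.
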
>
> The problem is very old and not very hardware dependent. Similar
> behaviour happens when some of the following are changed:
>
> OS -> FreeBSD-~5.2 or FreeBSD-6
> hardware -> newer amd64 CPU (Turion X2) with 5705 (iozone output for
> this below) instead of old amd64 CPU with 5701. The newer amd64
> normally runs an i386-SMP current kernel while the old amd64 was
> running an amd64-UP current kernel in the above tests, but normally
> runs ~5.2 amd64-UP and behaves similarly with that. The combination
> that seemed to work right was an AthlonXP for the server with the same
> 5701 and any kernel. The only strangeness with that was that current
> kernels gave a 5-10% slower nfs server despite giving a 30-90% larger
> packet rate for small packets.
>
> IOZONE: Performance Test of Sequential File I/O -- V1.16 (10/28/92)
> By Bill Norcott
>
> Operating System: FreeBSD -- using fsync()
>
> 100Mbps fxp server -> 1Gbps bge 5705 client:
> %%%
> IOZONE: auto-test mode
>
> MB      reclen  bytes/sec written   bytes/sec read
> 1       512     2994400             185462027
> 1       1024    3074084             337817536
> 1       2048    2991691             576792985
> 1       4096    3074759             884740798
> 1       8192    3078019             1176892296
> 2       512     4262096             186709962
> 2       1024    2994468             339893080
> 2       2048    5112176             584846610
> 2       4096    4754187             909815165
> 2       8192    5100574             1212919611
> 4       512     5298715             187129017
> 4       1024    5302620             344445041
> 4       2048    4985597             590579630
> 4       4096    3703618             927711124
> 4       8192    5236177             1240896243
> 8       512     5142274             186899396
> 8       1024    6207933             345564808
> 8       2048    6162773             593088329
> 8       4096    6031445             936751120
> 8       8192    6072523             1224102288
> 16      512     5427113             186797193
> 16      1024    5065901             345544445
> 16      2048    5462338             595487384
> 16      4096    5256552             937013065
> 16      8192    5097101             1226320870
> Completed series of tests
> %%%
>
> rl on a system with 1/20 as much CPU is faster than this.
>
> The problem doesn't seem to affect much besides writes on nfs. The
> bge 5701 works very well for most things.
> It has a much better bus
> interface than the 5705 and works even better after moving it to the
> old amd64 system (it can now saturate 1Gbps where on the AthlonXP it
> only got 3/4 of the way, while the 5705 only gets 1/4 of the way).
> I've been working on minimising network latency and maximising packet
> rate, and normally have very low network latency (60-80 uS for ping)
> and fairly high packet rates. The changes for this are not the cause
> of the bug :-), since the behaviour is not affected by running kernels
> without these changes or by sysctl'ing the changes to be null.
> However, the problem looks like ones caused by large latencies combined
> with non-streaming protocols. To write at just 11.77 MB/S, at least
> 8000 packets/second must be sent from the client to the server. Working
> clients sustain this rate, but for broken clients the rate is much lower
> and not sustained:
>
> Output from netstat -s 1 on server while writing a ~1GB file via
> 5701/udp:
> %%%
>             input        (Total)           output
>  packets  errs      bytes  packets  errs      bytes colls
>      900     0    1513334      142     0      33532     0
>     1509     0    2564836      236     0      57368     0
>     1647     0    2295802      259     0      51106     0
>     1603     0    1502736      252     0      32926     0
>     1055     0     637014      163     0      13938     0
>      558     0    1542510       86     0      34340     0
>      984     0     989854      155     0      21816     0
>      864     0    1320786      135     0      38152     0
>      883     0    1558060      165     0      34340     0
>     1177     0    3780102      203     0      85850     0
>     2087     0     954212      331     0      21210     0
>     1187     0    1413568      190     0      31310     0
>      650     0    3320604      101     0      75346     0
>     1565     0    1706542      246     0      37976     0
>     2055     0    2360620      329     0      52318     0
>     1554     0    2416996      244     0      54226     0
>     1402     0    2579894      220     0      58176     0
>     1690     0     774488      267     0      16968     0
>     1323     0    3690650      209     0      83830     0
>      591     0    4519858       92     0     103110     0
> %%%
>
> There is no sign of any packet loss or switch problems. Forcing
> 1000baseTX full-duplex has no effect. Forcing 100baseTX full-duplex
> makes the problem more obvious. The mtu is 1500 throughout since
> only bge-5701 and sk support jumbo frames and I want to use udp for
> nfs.
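
The packet-rate argument above can be checked against the netstat samples
quoted in this message. The sketch below is a minimal Python illustration:
it takes the first three rows of each per-second sample (input packets and
input bytes as seen on the server) and compares the mean packet rate with
the roughly 8000 packets/second quoted as necessary for 11.77 MB/S. The
names SAMPLES, TARGET_PPS and summarize are hypothetical and exist only for
this example.

```python
# First three rows of each netstat sample quoted in this message:
# (input packets per second, input bytes per second) on the server.
SAMPLES = {
    "bge 5701/udp": [(900, 1513334), (1509, 2564836), (1647, 2295802)],
    "bge 5705/udp": [(5209, 6607758), (4763, 6684546), (4758, 6618498)],
    "sk/udp": [(8638, 12384676), (8636, 12415646), (8637, 12415646)],
}

# Packet rate quoted as the minimum needed to write at 11.77 MB/S.
TARGET_PPS = 8000


def summarize(rows):
    """Return (mean packets/s, mean bytes/s) over the given rows."""
    n = len(rows)
    mean_pps = sum(p for p, _ in rows) / n
    mean_bps = sum(b for _, b in rows) / n
    return mean_pps, mean_bps


for name, rows in SAMPLES.items():
    pps, bps = summarize(rows)
    print(f"{name:14s} {pps:7.0f} pkt/s  {bps / 1e6:6.2f} MB/s  "
          f"{100 * pps / TARGET_PPS:4.0f}% of target rate")
```

Only the sk client gets past the target rate in these rows; both bge clients
stay below it, which matches the description of the rate as "much lower and
not sustained".
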
>
> 5705/udp is better:
> %%%
>             input        (Total)           output
>  packets  errs      bytes  packets  errs      bytes colls
>     5209     0    6607758      846     0     151702     0
>     4763     0    6684546      773     0     153520     0
>     4758     0    6618498      769     0     151298     0
>     3582     0    7057568      576     0     162498     0
>     4935     0    5115068      800     0     116756     0
>     4924     0    6622026      798     0     152802     0
>     4095     0    6018462      657     0     137450     0
>     4647     0    5270442      751     0     120594     0
>     4673     0    5451948      758     0     123624     0
>     2340     0    6001986      372     0     138168     0
>     3750     0    6150610      604     0     140996     0
> %%%
>
> sk/udp works right:
> %%%
>             input        (Total)           output
>  packets  errs      bytes  packets  errs      bytes colls
>     8638     0   12384676     1440     0     293062     0
>     8636     0   12415646     1439     0     293708     0
>     8637     0   12415646     1441     0     293708     0
>     8637     0   12415646     1439     0     293708     0
>     8637     0   12417160     1440     0     293708     0
>     8636     0   12413162     1439     0     293506     0
>     8637     0   12414132     1439     0     293708     0
>     8636     0   12417160     1440     0     293708     0
>     8637     0   12415646     1439     0     293708     0
>     8636     0   12417160     1440     0     293708     0
>     8637     0   12414676     1439     0     293506     0
> %%%
>
> sk is under ~5.2 with latency/throughput/efficiency optimizations
> that don't have much effect here.
>
> Bruce
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

-- 
/"\  Best regards,                      | mlaier@freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News

--nextPart16990224.mgHh93Ab6g
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (FreeBSD)

iD8DBQBFsxGnXyyEoT62BG0RAnb8AJwKV9ZihIC9m3XiHwsJLrAcQBa6CQCdHrbD
T/L2QEOgFi2qQe5Jte2vKbU=
=iMtp
-----END PGP SIGNATURE-----

--nextPart16990224.mgHh93Ab6g--