From owner-freebsd-hackers@FreeBSD.ORG Mon Apr 21 01:28:03 2003
Message-Id: <200304210827.h3L8Rx2F032265@dc.luth.se>
To: "Jin Guojun [NCS]"
In-reply-to: Your message of Sun, 20 Apr 2003 13:12:42 PDT. <3EA2FF3A.4D86D5CB@lbl.gov>
From: Borje Josefsson
Date: Mon, 21 Apr 2003 10:27:59 +0200
cc: freebsd-hackers@freebsd.org
cc: freebsd-performance@freebsd.org
Subject: Re: patch for test (Was: tcp_output starving -- is due to mbuf get delay?)
Reply-To: bj@dc.luth.se
List-Id: Technical Discussions relating to FreeBSD

On Sun, 20 Apr 2003 13:12:42 PDT "Jin Guojun [NCS]" wrote:

> Now the patch is ready. It has been tested on both 4.7 and 4.8.
> For 4.7, one has to manually add an empty line before the comment prior to the
> tcp_output() routine. comment for beginning the tcp_output() in 4.7-RELEASE :-(
>
> Some more hints for tracing: (net.inet.tcp.liondmask is a bitmap)
>     bit 0-1 (value 1, 2, or 3) is for enabling tcp_output() mbuf chain modification
>     bit 2 (value 4) is for enabling sbappend() mbuf chain modification
>     bit 3 (value 8) is for tcp_input (DO NOT TRY IT, it is not ready).
>
>     bit 9 (value 512) is for enabling check routine (dump errors to /var/log/messag).
>
> If you do have problem, set net.inet.tcp.liondmask to 512 and look what message says.
> If you would like to know which part causing problem or not working properly,
> set net.inet.tcp.liondmask to 1, 2, 3 or 4 to test individual module.

Thanks!!

This patch definitely works, and gives much higher PPS (32000 instead of
19000). This is on a low-end system (PIII 900MHz with 33MHz bus); I'll
test one of my larger systems later today.

One question though - is there any way of making the code more
"aggressive"? As you can see in the netstat output below, it takes ~35
seconds(!) before reaching full speed. On NetBSD I reach max PPS almost
immediately. Even if we now (with your patch) can utilize the hardware
much more, it only helps if you have connections that last for a very
long time, so that the "ramping" time is not significant.

*Note* (the very last output below) that this seems to be highly dependent
on RTT. On a 2ms connection (~50 miles) I reach max PPS almost
immediately. (I can't explain why I go to 51kpps and then fall back to
35kpps; this is repeatable.)

Apart from vanilla 4.8R I have set:

kern.ipc.maxsockbuf=8388608
net.inet.tcp.sendspace=3217968
net.inet.tcp.recvspace=3217968
kern.ipc.nmbclusters=8192

And this test is done on a connection with RTT in the order of 22 ms.
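
The RTT dependence can be put in rough numbers with a back-of-the-envelope sketch. It uses only the sendspace value, the PPS figure and the two RTTs quoted above; the 1448-byte payload per segment is an assumption (1500-byte MTU with TCP timestamps), and the function names are made up for illustration.

```python
# Sketch only: how much data must be in flight to sustain a given
# packet rate at a given RTT, and what the socket buffer would allow.

SENDSPACE = 3217968   # net.inet.tcp.sendspace from the settings above, in bytes
ASSUMED_MSS = 1448    # ASSUMPTION: 1500-byte MTU with TCP timestamps enabled


def window_limited_rate(window_bytes, rtt_s):
    """Upper bound on throughput (bytes/s) if at most one window is sent per RTT."""
    return window_bytes / rtt_s


def segments_in_flight(pps, rtt_s):
    """Number of segments that must be outstanding to sustain pps over rtt_s."""
    return pps * rtt_s


if __name__ == "__main__":
    for rtt_ms in (22, 2):
        rtt = rtt_ms / 1000.0
        ceiling = window_limited_rate(SENDSPACE, rtt)
        needed = segments_in_flight(32000, rtt)
        print(f"RTT {rtt_ms:2d} ms: buffer allows up to {ceiling * 8 / 1e6:.0f} Mbit/s; "
              f"32000 pps needs about {needed:.0f} segments "
              f"(~{needed * ASSUMED_MSS / 1e6:.2f} MB) in flight")
```

Under these assumptions the send buffer is not the limit at either RTT, but the window has to grow to roughly ten times as many segments at 22 ms as at 2 ms before the sender can run at the same packet rate, which would be consistent with the much longer ramp on the long path.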
--Börje

=========== "netstat 1" **on NetBSD** (for comparison) =====

 bge0 in       bge0 out              total in      total out
 packets  errs  packets  errs colls   packets  errs  packets
       1     0        1     0     0         1     0        1
    7118     0    11315     0     0      7118     0    11315
   18604     0    28014     0     0     18604     0    28014
   18610     0    28005     0     0     18611     0    28005

(NOTE that this example is using a larger MTU, and is not on the same hardware
as below, but the behaviour of reaching max PPS "immediately" is the same)

=========== "netstat 1" with liondmask=7 ================

            input        (Total)           output
   packets  errs      bytes    packets  errs      bytes colls
         6     0        540          3     0        228     0
        37     0       2712         56     0      72216     0
       646     0      42636        823     0    1244686     0
      1548     0     102168       1966     0    2975188     0
      2432     0     160512       3039     0    4604252     0
      3301     0     217866       4193     0    6345352     0
      4174     0     275484       5254     0    7950192     0
      5011     0     330726       6373     0    9650414     0
      5836     0     385176       7448     0   11271908     0
      6675     0     440550       8519     0   12896430     0
      7528     0     496848       9596     0   14527008     0
      8408     0     554928      10626     0   16089456     0
      9212     0     607992      11652     0   17636764     0
      9962     0     657492      12698     0   19223436     0
     10699     0     706134      13694     0   20731380     0
     11368     0     750288      14648     0   22175736     0
     12144     0     801504      15697     0   23768464     0
     12802     0     844932      16693     0   25267324     0
     13412     0     885192      17552     0   26576934     0
     14001     0     924066      18495     0   28001608     0
     14444     0     953304      19415     0   29384230     0
     15041     0     992706      20275     0   30701070     0
     15681     0    1034946      21327     0   32283200     0
     16224     0    1070784      22202     0   33610978     0
     16621     0    1096986      22888     0   34651096     0
     17050     0    1125300      23568     0   35682130     0
     17721     0    1169586      24573     0   37200672     0
     18256     0    1204896      25361     0   38401274     0
     18782     0    1239612      26128     0   39550400     0
     19359     0    1277694      26972     0   40834272     0
     20150     0    1329900      28015     0   42413374     0
     20900     0    1379400      28962     0   43854702     0
     21523     0    1420518      30024     0   45447430     0
     22256     0    1468896      30891     0   46767638     0
     22882     0    1510212      31655     0   47924334     0
     23087     0    1523742      31865     0   48243788     0
     23225     0    1532850      32038     0   48502682     0

It seems that I reach the limit about here - 35-36 sec after start

     23170     0    1529220      32121     0   48629858     0
     23223     0    1532718      32036     0   48501168     0
     23200     0    1531200      32121     0   48629858     0
     23103     0    1524792      32122     0   48631372     0
     23104     0    1524864      32080     0   48565096     0
     23214     0    1532124      32079     0   48566270     0
     23147     0    1527696      32036     0   48501168     0
     10318     0     680988      13543     0   20495142     0
         1     0         66          1     0        178     0
         1     0         66          1     0        178     0

=========== "netstat 1" with liondmask=7 ================

With plain 4.8 (liondmask=0) I get:

root@stinky 8# netstat 1
            input        (Total)           output
   packets  errs      bytes    packets  errs      bytes colls
         7     0        732         10     0       2394     0
       437     0      28842        556     0     840448     0
      1343     0      88638       1669     0    2531586     0
      2201     0     145266       2757     0    4166706     0
      3082     0     203406       3857     0    5841190     0
      4021     0     265386       4959     0    7503562     0
      4877     0     321882       6017     0    9111430     0
      5621     0     370986       7064     0   10690532     0
      6471     0     427086       8136     0   12319596     0
      7216     0     476256       9177     0   13889614     0
      8006     0     528396      10181     0   15415726     0
      8725     0     575850      11215     0   16975146     0
      9482     0     625812      12259     0   18561818     0
     10205     0     673530      13258     0   20071276     0
     10846     0     715836      14115     0   21365746     0
     11563     0     763158      15223     0   23046286     0
     12399     0     818334      16266     0   24628416     0
     13024     0     859584      17119     0   25913802     0
     13609     0     898194      17949     0   27173450     0
     14316     0     944856      18798     0   28458836     0
     14391     0     949806      18842     0   28522764     0
     14463     0     954558      19010     0   28779804     0

Here I reach the limit after 20 seconds.

     14500     0     957000      19095     0   28908494     0
     14534     0     959244      19053     0   28844906     0
     14599     0     963534      19052     0   28843392     0
     14526     0     958716      19053     0   28844906     0
     14484     0     955944      18967     0   28714702     0
     14330     0     945780      18968     0   28716216     0
     14581     0     962346      19137     0   28972082     0
     14531     0     959046      19180     0   29037184     0
     14465     0     954690      19095     0   28908494     0
     14514     0     957924      19095     0   28908494     0
     14403     0     950598      19095     0   28908494     0
     14493     0     956538      19052     0   28843392     0
     14544     0     959904      19095     0   28908494     0
     14546     0     960036      19095     0   28908494     0
     14558     0     960828      19095     0   28908494     0
     14559     0     960894      19053     0   28844906     0
     14597     0     963402      19094     0   28906980     0
     14509     0     957594      19053     0   28844906     0
     14527     0     958782      19137     0   28972082     0
     14576     0     962016      19139     0   28973936     0
     14575     0     961950      19096     0   28908494     0
     14578     0     962148      19052     0   28843392     0
     14519     0     958254      18968     0   28716216     0
     14579     0     962214      19052     0   28843392     0
     14533     0     959178      19095     0   28908494     0
     14588     0     962808      19137     0   28972082     0
     14503     0     957198      19053     0   28844906     0
     14580     0     962280      19095     0   28908494     0
     14479     0     955614      18968     0   28716216     0
     14477     0     955482      19052     0   28843392     0
     14618     0     964788      19137     0   28972082     0
     14569     0     961554      19053     0   28844906     0
     14586     0     962676      19095     0   28908494     0
      4462     0     294492       5438     0    8224172     0

============ "netstat 1" with liondmask on a 2ms RTT connection ====

root@stinky 17# netstat 1
            input        (Total)           output
   packets  errs      bytes    packets  errs      bytes colls
         2     0        132          2     0          0     0
      3908     0     258086       7004     0   10856439     0
     29353     0    1937298      51940     0   78631282     0
     29317     0    1934922      51911     0   78629768     0
     29344     0    1936704      51894     0   78502592     0
     29340     0    1936440      51841     0   78501078     0
     29298     0    1933668      51860     0   78567694     0
     29376     0    1938816      51947     0   78629768     0
     29344     0    1936704      51928     0   78566180     0
     20988     0    1385208      37580     0   56660114     0
     19687     0    1299336      35473     0   53704786     0
     19705     0    1300530      35431     0   53641198     0
     19705     0    1300530      35431     0   53641198     0
     19670     0    1298220      35346     0   53512508     0
     19680     0    1298880      35388     0   53576096     0
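
For easier comparison, the plateau rows of the three FreeBSD runs above can be converted to Mbit/s and bytes per packet. This is only a sketch: the rows are copied from the "netstat 1" outputs, one representative row per run, and the helper names are invented for illustration.

```python
# Sketch: convert representative plateau rows from the "netstat 1"
# outputs above into Mbit/s and average bytes per outgoing packet.

# (label, output packets per second, output bytes per second)
PLATEAUS = [
    ("liondmask=7, 22 ms RTT", 32121, 48629858),
    ("liondmask=0, 22 ms RTT", 19095, 28908494),
    ("liondmask on, 2 ms RTT (initial peak)", 51940, 78631282),
]


def mbit_per_s(bytes_per_s):
    """Bytes per second to megabits per second (10^6 bits)."""
    return bytes_per_s * 8 / 1e6


def bytes_per_packet(pps, bytes_per_s):
    """Average size of an outgoing packet as counted by netstat."""
    return bytes_per_s / pps


if __name__ == "__main__":
    for label, pps, bps in PLATEAUS:
        print(f"{label}: {pps} pps, {mbit_per_s(bps):.0f} Mbit/s, "
              f"{bytes_per_packet(pps, bps):.0f} bytes/packet")
```

All three rows come out at about 1514 bytes per packet, which looks like full-size Ethernet frames with a 1500-byte MTU, so the difference between the runs is in packet rate rather than packet size.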