From owner-freebsd-performance@FreeBSD.ORG Sun Apr 20 14:11:07 2003
Return-Path:
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B3FE837B401; Sun, 20 Apr 2003 14:11:07 -0700 (PDT)
Received: from adsl-63-198-35-122.dsl.snfc21.pacbell.net (adsl-63-198-35-122.dsl.snfc21.pacbell.net [63.198.35.122]) by mx1.FreeBSD.org (Postfix) with ESMTP id C75C843F3F; Sun, 20 Apr 2003 14:11:06 -0700 (PDT) (envelope-from j_guojun@lbl.gov)
Received: from lbl.gov (localhost.pacbell.net [127.0.0.1]) ESMTP id h3KKCgCr000515; Sun, 20 Apr 2003 13:12:48 -0700 (PDT) (envelope-from j_guojun@lbl.gov)
Sender: jin@adsl-63-198-35-122.dsl.snfc21.pacbell.net
Message-ID: <3EA2FF3A.4D86D5CB@lbl.gov>
Date: Sun, 20 Apr 2003 13:12:42 -0700
From: "Jin Guojun [NCS]"
X-Mailer: Mozilla 4.76 [en] (X11; U; FreeBSD 4.8-RELEASE i386)
X-Accept-Language: zh, zh-CN, en-US, en
MIME-Version: 1.0
To: bj@dc.luth.se, freebsd-hackers@freebsd.org, freebsd-performance@freebsd.org
References: <200304191305.h3JD5S2F026929@dc.luth.se> <3EA20D31.2606389B@lbl.gov>
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Subject: Re: patch for test (Was: tcp_output starving -- is due to mbuf get delay?)
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning
X-List-Received-Date: Sun, 20 Apr 2003 21:11:08 -0000

Now the patch is ready. It has been tested on both 4.7 and 4.8.
For 4.7, one has to manually add an empty line before the comment prior to the
tcp_output() routine.

Some more hints for tracing: (net.inet.tcp.liondmask is a bitmap)
bit 0-1 (value 1, 2, or 3) is for enabling tcp_output() mbuf chain modification
bit 2   (value 4) is for enabling sbappend() mbuf chain modification
bit 3   (value 8) is for tcp_input (DO NOT TRY IT, it is not ready).

bit 9   (value 512) is for enabling the check routine (dumps errors to /var/log/messages).

If you do have a problem, set net.inet.tcp.liondmask to 512 and look at what the
log message says. If you would like to know which part is causing the problem or
not working properly, set net.inet.tcp.liondmask to 1, 2, 3 or 4 to test each
module individually.

-Jin

"Jin Guojun [NCS]" wrote:

> Not yet. I said I would send out email when it is ready.
> The problem is that we have a significantly modified system based on 4.8-RC2.
> I tried to extract only the sockbuf/mbuf code, but apparently the patch is
> incomplete for a pure 4.8 system. Somehow, it has some dependency on our new
> TCP stack.
>
> So, I am building two new systems, one pure 4.7 and one pure 4.8. I will
> extract the correct patches for these systems, test them, and then send out
> another email.
>
> Thanks for your patience.
>
> -Jin
>
> Borje Josefsson wrote:
>
> > Hmm. I'm not sure if I misunderstood whether this was ready for another test
> > run or not. Anyhow - I took the new patch .tgz (which, btw, still had
> > tcp_input.p in it). I applied the patches (except tcp_input) and tested.
> >
> > Now I get:
> >
> > Panic: bad cur_off
> > 00000 m_p 0xc0a7f400 0xc0a7f400 my_off 0 1448 cc 3407144
> >
> > As usual, I'm willing to test more when there is an update available.
> >
> > --Börje
> >
> > On Fri, 18 Apr 2003 13:04:24 PDT "Jin Guojun [DSD]" wrote:
> >
> > > Oops, there was a bad file -- tcp_input.p -- which is not working yet.
> > > Also, a patch file -- tcp_usrreq.p -- was missing.
> > >
> > > I will take the tcp_input.p out and put tcp_usrreq.p in.
> > > When it is finished, I will send another mail out.
> > >
> > > -Jin
> > >
> > > Borje Josefsson wrote:
> > >
> > > > On Thu, 17 Apr 2003 22:12:02 PDT "Jin Guojun [NCS]" wrote:
> > > >
> > > > > I have modified the sockbuf and mbuf operations to double the throughput
> > > > > over a high bandwidth-delay-product path.
> > > > >
> > > > > The patch is available at:
> > > > > http://www-didc.lbl.gov/~jin/network/lion/content.html#FreeBSD_Patches
> > > > >
> > > > > The current modification is for TCP transmission only.
> > > > >
> > > > > I have adapted some code of uipc_socket2.c from Sam Leffler
> > > > > http://www.freebsd.org/~sam/thorpe-stable.patch
> > > > > for the TCP receiver, but it has not been tested yet, so the tcp_input.p
> > > > > is empty.
> > > > >
> > > > > I ignored all record chain (m_nextpkt) related code. The details are
> > > > > explained at
> > > > > http://www-didc.lbl.gov/~jin/network/lion/content.html#BSDMbuf
> > > > >
> > > > > Once the tcp_input code is tested, I will submit the patch to
> > > > > bugs@freebsd.org. I may submit the patch regardless of whether the
> > > > > tcp_input code works, because the TCP sender (server) is more important
> > > > > in a high-speed network than the receiver (client).
> > > > >
> > > > > It would be appreciated if anyone could verify the patch and provide
> > > > > feedback.
> > > >
> > > > OK. I have now tried this patch on a newly-installed 4.8R. The patch
> > > > applied fine. When the sysctl net.inet.tcp.liondmask is unset, everything
> > > > seems OK, but when setting it to 7 (as specified in the patch
> > > > instructions) I get:
> > > >
> > > > Fatal trap 12: page fault while in kernel mode.
> > > > (I could write down all the stuff on addresses etc. if it makes sense)
> > > >
> > > > when I run ttcp to test the performance.
> > > >
> > > > This is repeatable.
> > > >
> > > > I'm willing to test more, if someone provides me with some hints on what
> > > > to do.
> > > >
> > > > --Börje

From owner-freebsd-performance@FreeBSD.ORG Mon Apr 21 01:28:03 2003
Return-Path:
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6B19D37B401; Mon, 21 Apr 2003 01:28:03 -0700 (PDT)
Received: from samson.dc.luth.se (samson.dc.luth.se [130.240.112.30]) by mx1.FreeBSD.org (Postfix) with ESMTP id C44EC43F93; Mon, 21 Apr 2003 01:28:01 -0700 (PDT) (envelope-from bj@dc.luth.se)
Received: from dc.luth.se (root@bompe.dc.luth.se [130.240.60.42]) by samson.dc.luth.se (8.12.5/8.12.5) with ESMTP id h3L8S0LG003787; Mon, 21 Apr 2003 10:28:00 +0200 (MET DST)
Received: from bompe.dc.luth.se (bj@localhost.dc.luth.se [127.0.0.1]) by dc.luth.se (8.12.6/8.11.3) with ESMTP id h3L8Rx2F032265; Mon, 21 Apr 2003 10:27:59 +0200 (CEST) (envelope-from bj@bompe.dc.luth.se)
Message-Id: <200304210827.h3L8Rx2F032265@dc.luth.se>
X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4
To: "Jin Guojun [NCS]"
In-reply-to: Your message of Sun, 20 Apr 2003 13:12:42 PDT. <3EA2FF3A.4D86D5CB@lbl.gov>
From: Borje Josefsson
X-Disposition-notification-to: Borje.Josefsson@dc.luth.se
X-uri: http://www.dc.luth.se/~bj/index.html
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
Date: Mon, 21 Apr 2003 10:27:59 +0200
Sender: bj@dc.luth.se
cc: freebsd-hackers@freebsd.org
cc: freebsd-performance@freebsd.org
Subject: Re: patch for test (Was: tcp_output starving -- is due to mbuf get delay?)
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
Reply-To: bj@dc.luth.se
List-Id: Performance/tuning
X-List-Received-Date: Mon, 21 Apr 2003 08:28:03 -0000

On Sun, 20 Apr 2003 13:12:42 PDT "Jin Guojun [NCS]" wrote:

> Now the patch is ready. It has been tested on both 4.7 and 4.8.
> For 4.7, one has to manually add an empty line before the comment prior to the
> tcp_output() routine.

comment for beginning the tcp_output() in 4.7-RELEASE :-(

> Some more hints for tracing: (net.inet.tcp.liondmask is a bitmap)
> bit 0-1 (value 1, 2, or 3) is for enabling tcp_output() mbuf chain modification
> bit 2   (value 4) is for enabling sbappend() mbuf chain modification
> bit 3   (value 8) is for tcp_input (DO NOT TRY IT, it is not ready).
>
> bit 9   (value 512) is for enabling the check routine (dumps errors to /var/log/messages).
>
> If you do have a problem, set net.inet.tcp.liondmask to 512 and look at what
> the log message says. If you would like to know which part is causing the
> problem or not working properly, set net.inet.tcp.liondmask to 1, 2, 3 or 4
> to test each module individually.

Thanks!!

This patch definitely works, and gives much higher PPS (32000 instead of
19000). This is on a low-end system (PIII 900 MHz with a 33 MHz bus); I'll
test one of my larger systems later today.

One question though - is there any way of making the code more "aggressive"?
As you see in the netstat output below, it takes ~35 seconds(!) before
reaching full speed. On NetBSD I reach max PPS almost immediately. Even if we
can now (with your patch) utilize the hardware much better, that only helps
for connections that last a very long time, so that the "ramping" time is not
significant.

*Note* (in the very last output below) that this seems to be highly dependent
on RTT. On a 2 ms connection (~50 miles) I reach max PPS almost immediately.
(I can't explain why I go to 51 kpps and then fall back to 35 kpps; this is
repeatable.)

Apart from vanilla 4.8R I have set:

kern.ipc.maxsockbuf=8388608
net.inet.tcp.sendspace=3217968
net.inet.tcp.recvspace=3217968
kern.ipc.nmbclusters=8192

And this test is done on a connection with RTT in the order of 22 ms.
--Börje

=========== "netstat 1" **on NetBSD** (for comparison) =====

        bge0 in        bge0 out            total in       total out
  packets errs    packets errs colls    packets errs    packets
        1    0          1    0     0          1    0          1
     7118    0      11315    0     0       7118    0      11315
    18604    0      28014    0     0      18604    0      28014
    18610    0      28005    0     0      18611    0      28005

(NOTE that this example is using a larger MTU, and is not on the same hardware
as below, but the behaviour of reaching max PPS "immediately" is the same)

=========== "netstat 1" with liondmask=7 ================

            input        (Total)           output
   packets  errs     bytes    packets  errs      bytes  colls
         6     0       540          3     0        228      0
        37     0      2712         56     0      72216      0
       646     0     42636        823     0    1244686      0
      1548     0    102168       1966     0    2975188      0
      2432     0    160512       3039     0    4604252      0
      3301     0    217866       4193     0    6345352      0
      4174     0    275484       5254     0    7950192      0
      5011     0    330726       6373     0    9650414      0
      5836     0    385176       7448     0   11271908      0
      6675     0    440550       8519     0   12896430      0
      7528     0    496848       9596     0   14527008      0
      8408     0    554928      10626     0   16089456      0
      9212     0    607992      11652     0   17636764      0
      9962     0    657492      12698     0   19223436      0
     10699     0    706134      13694     0   20731380      0
     11368     0    750288      14648     0   22175736      0
     12144     0    801504      15697     0   23768464      0
     12802     0    844932      16693     0   25267324      0
     13412     0    885192      17552     0   26576934      0
     14001     0    924066      18495     0   28001608      0
     14444     0    953304      19415     0   29384230      0
     15041     0    992706      20275     0   30701070      0
     15681     0   1034946      21327     0   32283200      0
     16224     0   1070784      22202     0   33610978      0
     16621     0   1096986      22888     0   34651096      0
     17050     0   1125300      23568     0   35682130      0
     17721     0   1169586      24573     0   37200672      0
     18256     0   1204896      25361     0   38401274      0
     18782     0   1239612      26128     0   39550400      0
     19359     0   1277694      26972     0   40834272      0
     20150     0   1329900      28015     0   42413374      0
     20900     0   1379400      28962     0   43854702      0
     21523     0   1420518      30024     0   45447430      0
     22256     0   1468896      30891     0   46767638      0
     22882     0   1510212      31655     0   47924334      0
     23087     0   1523742      31865     0   48243788      0
     23225     0   1532850      32038     0   48502682      0

It seems that I reach the limit about here - 35-36 sec after start

     23170     0   1529220      32121     0   48629858      0
     23223     0   1532718      32036     0   48501168      0
     23200     0   1531200      32121     0   48629858      0
     23103     0   1524792      32122     0   48631372      0
     23104     0   1524864      32080     0   48565096      0
     23214     0   1532124      32079     0   48566270      0
     23147     0   1527696      32036     0   48501168      0
     10318     0    680988      13543     0   20495142      0
         1     0        66          1     0        178      0
         1     0        66          1     0        178      0

=========== "netstat 1" with liondmask=7 ================

With plain 4.8 (liondmask=0) I get:

root@stinky 8# netstat 1
            input        (Total)           output
   packets  errs     bytes    packets  errs      bytes  colls
         7     0       732         10     0       2394      0
       437     0     28842        556     0     840448      0
      1343     0     88638       1669     0    2531586      0
      2201     0    145266       2757     0    4166706      0
      3082     0    203406       3857     0    5841190      0
      4021     0    265386       4959     0    7503562      0
      4877     0    321882       6017     0    9111430      0
      5621     0    370986       7064     0   10690532      0
      6471     0    427086       8136     0   12319596      0
      7216     0    476256       9177     0   13889614      0
      8006     0    528396      10181     0   15415726      0
      8725     0    575850      11215     0   16975146      0
      9482     0    625812      12259     0   18561818      0
     10205     0    673530      13258     0   20071276      0
     10846     0    715836      14115     0   21365746      0
     11563     0    763158      15223     0   23046286      0
     12399     0    818334      16266     0   24628416      0
     13024     0    859584      17119     0   25913802      0
     13609     0    898194      17949     0   27173450      0
     14316     0    944856      18798     0   28458836      0
     14391     0    949806      18842     0   28522764      0
     14463     0    954558      19010     0   28779804      0

Here I reach the limit after 20 seconds.

     14500     0    957000      19095     0   28908494      0
     14534     0    959244      19053     0   28844906      0
     14599     0    963534      19052     0   28843392      0
     14526     0    958716      19053     0   28844906      0
     14484     0    955944      18967     0   28714702      0
     14330     0    945780      18968     0   28716216      0
     14581     0    962346      19137     0   28972082      0
     14531     0    959046      19180     0   29037184      0
     14465     0    954690      19095     0   28908494      0
     14514     0    957924      19095     0   28908494      0
     14403     0    950598      19095     0   28908494      0
     14493     0    956538      19052     0   28843392      0
     14544     0    959904      19095     0   28908494      0
     14546     0    960036      19095     0   28908494      0
     14558     0    960828      19095     0   28908494      0
     14559     0    960894      19053     0   28844906      0
     14597     0    963402      19094     0   28906980      0
     14509     0    957594      19053     0   28844906      0
     14527     0    958782      19137     0   28972082      0
     14576     0    962016      19139     0   28973936      0
     14575     0    961950      19096     0   28908494      0
     14578     0    962148      19052     0   28843392      0
     14519     0    958254      18968     0   28716216      0
     14579     0    962214      19052     0   28843392      0
     14533     0    959178      19095     0   28908494      0
     14588     0    962808      19137     0   28972082      0
     14503     0    957198      19053     0   28844906      0
     14580     0    962280      19095     0   28908494      0
     14479     0    955614      18968     0   28716216      0
     14477     0    955482      19052     0   28843392      0
     14618     0    964788      19137     0   28972082      0
     14569     0    961554      19053     0   28844906      0
     14586     0    962676      19095     0   28908494      0
      4462     0    294492       5438     0    8224172      0

============ "netstat 1" with liondmask on a 2ms RTT connection ====

root@stinky 17# netstat 1
            input        (Total)           output
   packets  errs     bytes    packets  errs      bytes  colls
         2     0       132          2     0          0      0
      3908     0    258086       7004     0   10856439      0
     29353     0   1937298      51940     0   78631282      0
     29317     0   1934922      51911     0   78629768      0
     29344     0   1936704      51894     0   78502592      0
     29340     0   1936440      51841     0   78501078      0
     29298     0   1933668      51860     0   78567694      0
     29376     0   1938816      51947     0   78629768      0
     29344     0   1936704      51928     0   78566180      0
     20988     0   1385208      37580     0   56660114      0
     19687     0   1299336      35473     0   53704786      0
     19705     0   1300530      35431     0   53641198      0
     19705     0   1300530      35431     0   53641198      0
     19670     0   1298220      35346     0   53512508      0
     19680     0   1298880      35388     0   53576096      0

From owner-freebsd-performance@FreeBSD.ORG Mon Apr 21 02:24:32 2003
Return-Path:
Delivered-To: freebsd-performance@freebsd.org
Received: from
mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0C0AB37B401; Mon, 21 Apr 2003 02:24:32 -0700 (PDT)
Received: from samson.dc.luth.se (samson.dc.luth.se [130.240.112.30]) by mx1.FreeBSD.org (Postfix) with ESMTP id ABCD343FD7; Mon, 21 Apr 2003 02:24:30 -0700 (PDT) (envelope-from bj@dc.luth.se)
Received: from dc.luth.se (root@bompe.dc.luth.se [130.240.60.42]) by samson.dc.luth.se (8.12.5/8.12.5) with ESMTP id h3L9OTLG010830; Mon, 21 Apr 2003 11:24:29 +0200 (MET DST)
Received: from bompe.dc.luth.se (bj@localhost.dc.luth.se [127.0.0.1]) by dc.luth.se (8.12.6/8.11.3) with ESMTP id h3L9OT2F032404; Mon, 21 Apr 2003 11:24:29 +0200 (CEST) (envelope-from bj@bompe.dc.luth.se)
Message-Id: <200304210924.h3L9OT2F032404@dc.luth.se>
X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4
To: "Jin Guojun [NCS]"
In-reply-to: Your message of Mon, 21 Apr 2003 10:27:59 +0200. <200304210827.h3L8Rx2F032265@dc.luth.se>
From: Borje Josefsson
X-Disposition-notification-to: Borje.Josefsson@dc.luth.se
X-uri: http://www.dc.luth.se/~bj/index.html
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
Date: Mon, 21 Apr 2003 11:24:29 +0200
Sender: bj@dc.luth.se
cc: freebsd-hackers@freebsd.org
cc: freebsd-performance@freebsd.org
Subject: Re: patch for test (Was: tcp_output starving -- is due to mbuf get delay?)
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
Reply-To: bj@dc.luth.se
List-Id: Performance/tuning
X-List-Received-Date: Mon, 21 Apr 2003 09:24:32 -0000

On Mon, 21 Apr 2003 10:27:59 +0200 Borje Josefsson wrote:

> This patch definitely works, and gives much higher PPS (32000
> instead of 19000). This is on a low-end system (PIII 900 MHz with
> a 33 MHz bus), I'll test one of my larger systems later today.

OK. I have now tested on a larger system.

Result is better than without the patch, but *not* as good as (for example)
NetBSD or Linux.

Value            Before patch   After patch   NetBSD
Mbit/sec         617            838           921
PPS (MTU=4470)   20000          27500         28000

The problem is (still) that I run out of CPU on the FreeBSD *sender*. This
doesn't happen on NetBSD (same hardware). The hardware is a 2.8 GHz Xeon,
PCI-X bus, connected directly to the core routers of a 10 Gbps network.
RTT=21 ms, MTU=4470. OS=FreeBSD 4.8RC with your patch applied.

wilma % vmstat 1 (edited to shorten lines)

  memory        page                     faults        cpu
   avm    fre   flt re pi po fr sr    in   sy  cs   us  sy id
  8608 977836     4  0  0  0  0  0   233   20   7    0   2 98
 12192 977836     4  0  0  0  0  0   237   59  16    0   1 99
 12192 977836     4  0  0  0  0  0   233   20   8    0   2 98
 12636 977608    78  0  0  0  7  0  2377  870 241    0  28 72
 12636 977608     4  0  0  0  0  0  6522 1834  19    0 100  0
 12636 977608     4  0  0  0  0  0  6531 1816  19    0 100  0
 12636 977608     4  0  0  0  0  0  6499 1827  19    0 100  0
 12636 977608     4  0  0  0  0  0  6575 1821  21    0 100  0
 13044 977608     6  0  0  0  0  0  6611 1825  21    0 100  0

top(1) shows:

CPU states: 0.0% user, 0.0% nice, 93.4% system, 6.6% interrupt, 0.0% idle
Mem: 6136K Active, 8920K Inact, 34M Wired, 64K Cache, 9600K Buf, 954M Free
Swap: 2048M Total, 2048M Free

  PID USERNAME PRI NICE  SIZE   RES STATE   TIME   WCPU    CPU COMMAND
  215 root      43    0 1024K  652K RUN     0:11 92.37% 39.11% ttcp

Compare that to when I use NetBSD as sender:

CPU states: 0.0% user, 0.0% nice, 6.5% system, 5.5% interrupt, 88.1% idle
Memory: 39M Act, 12K Inact, 628K Wired, 2688K Exec, 5488K File, 399M Free
Swap: 1025M Total, 1025M Free

  PID USERNAME PRI NICE  SIZE   RES STATE   TIME  WCPU   CPU COMMAND
17938 root       2    0  204K  688K netio   0:00 7.80% 1.42% ttcp

The "slow ramping" effect that I described in my earlier letter is not at all
as visible here, so that might be something else (my small test system has
some switches between itself and the core).

        bge0 in        bge0 out            total in       total out
  packets errs    packets errs colls    packets errs    packets errs colls
        6    0          4    0     0          7    0          4    0     0
    18364    0      12525    0     0      18364    0      12525    0     0
    27664    0      18861    0     0      27665    0      18861    0     0
    27511    0      18749    0     0      27511    0      18749    0     0
    27281    0      18572    0     0      27282    0      18572    0     0

Net result: Much better, but not as good as the "competitors"...

--Börje

From owner-freebsd-performance@FreeBSD.ORG Mon Apr 21 10:26:28 2003
Return-Path:
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 55B7B37B401; Mon, 21 Apr 2003 10:26:28 -0700 (PDT)
Received: from stork.mail.pas.earthlink.net (stork.mail.pas.earthlink.net [207.217.120.188]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9837043F85; Mon, 21 Apr 2003 10:26:27 -0700 (PDT) (envelope-from tlambert2@mindspring.com)
Received: from pool0061.cvx22-bradley.dialup.earthlink.net ([209.179.198.61] helo=mindspring.com) by stork.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 197f3d-0002kH-00; Mon, 21 Apr 2003 10:26:22 -0700
Message-ID: <3EA4296B.ACCD9AC8@mindspring.com>
Date: Mon, 21 Apr 2003 10:24:59 -0700
From: Terry Lambert
X-Mailer: Mozilla 4.79 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: bj@dc.luth.se
References: <200304210827.h3L8Rx2F032265@dc.luth.se>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a42f8eb4703b79c4256e711cd8da995129a7ce0e8f8d31aa3f350badd9bab72f9c350badd9bab72f9c
cc: freebsd-hackers@freebsd.org
cc: freebsd-performance@freebsd.org
Subject: Re: patch for test (Was: tcp_output starving -- is due to mbuf get delay?)
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning
X-List-Received-Date: Mon, 21 Apr 2003 17:26:28 -0000

Borje Josefsson wrote:
[ ...
Jin Guojun's TCP output patch for high bandwidth delay product ... ]

> This patch definitely works, and gives much higher PPS (32000 instead of
> 19000). This is on a low-end system (PIII 900 MHz with a 33 MHz bus), I'll
> test one of my larger systems later today.
>
> One question though - is there any way of making the code more
> "aggressive"? As you see in the netstat output below, it takes ~35
> seconds(!) before reaching full speed. On NetBSD I reach max PPS almost
> immediately. Even if we can now (with your patch) utilize the hardware
> much better, that only helps for connections that last a very long time,
> so that the "ramping" time is not significant.

You can get immediate relief by porting this code instead of using the patch:

	http://www.psc.edu/networking/tcp.html#psc

It is for NetBSD 1.3.2, and includes SACK, Rate Halving, auto-tuning, and
explicit congestion notification:

Description:
	http://www.psc.edu/networking/rate_halving.html

Direct link to the code:
	http://www.psc.edu/networking/ftp/tools/netbsd132_rh_10.tgz

Also included is a FACK implementation.
-- Terry

From owner-freebsd-performance@FreeBSD.ORG Mon Apr 21 16:39:54 2003
Return-Path:
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 56D6C37B401; Mon, 21 Apr 2003 16:39:54 -0700 (PDT)
Received: from haldjas.folklore.ee (Haldjas.folklore.ee [193.40.6.121]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2153843FBD; Mon, 21 Apr 2003 16:39:52 -0700 (PDT) (envelope-from narvi@haldjas.folklore.ee)
Received: from haldjas.folklore.ee (localhost [127.0.0.1]) by haldjas.folklore.ee (8.12.3/8.11.3) with ESMTP id h3LNdoUE031404; Tue, 22 Apr 2003 02:39:51 +0300 (EEST) (envelope-from narvi@haldjas.folklore.ee)
Received: from localhost (narvi@localhost) h3LNdo57031401; Tue, 22 Apr 2003 02:39:50 +0300 (EEST)
Date: Tue, 22 Apr 2003 02:39:50 +0300 (EEST)
From: Narvi
To: Alex Semenyaka
In-Reply-To: <20030420011039.GC52081@snark.ratmir.ru>
Message-ID: <20030422023703.G29990-100000@haldjas.folklore.ee>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mailman-Approved-At: Mon, 21 Apr 2003 16:43:50 -0700
cc: freebsd-hackers@freebsd.org
cc: freebsd-performance@freebsd.org
cc: freebsd-current@freebsd.org
Subject: Re: /bin/sh and 32-bit arithmetics [CORRECTED]
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning
X-List-Received-Date: Mon, 21 Apr 2003 23:39:54 -0000

On Sun, 20 Apr 2003, Alex Semenyaka wrote:

> 7    5.20    6.37    3.51   22.50%   i=$(($i<<1))
> 8    5.25    6.42    3.51   22.27%   i=$(($i<<$m))
>
> As you can see, even for an arithmetic-only script the overhead is not too
> big, except in one case: the shift operation. I decided to investigate
> whether that is a common script operation. I went through all the scripts I
> could find on my FreeBSD box. I searched for them with
> "locate .sh | grep '\.sh$'". There were a lot of them:
>
> $ locate .sh | grep '\.sh$' | wc -l
>     1637
>
> But there was no script that used the shift operation. Good, but not
> enough. I took a script that uses arithmetic and does some other job,
> ttfadmin.sh from the AbiWord package. I ran it 10000 times in a loop
> with both (64-bit and 32-bit) shells. As its argument it received an empty
> directory, so no work was done: just run, check parameters, find no files,
> exit. It took 65.35 seconds in the first case and 65.30 seconds in the
> second one. So the time that arithmetic takes during real script execution
> is very small compared to the total running time (obviously: arithmetic is
> in-core calculation, while a script usually runs external programs etc.,
> and at least I/O is involved).

Ahem - wouldn't it be easier to find out *why* the dramatic slowdown happens
and try to combat it, as opposed to trying to show that the slowdown is not
relevant? There shouldn't be anything inherently that much slower about
64-bit shifts...

>
> Thanks!
>
> SY, Alex

From owner-freebsd-performance@FreeBSD.ORG Mon Apr 21 20:33:57 2003
Return-Path:
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7C2C937B401; Mon, 21 Apr 2003 20:33:57 -0700 (PDT)
Received: from adsl-63-198-35-122.dsl.snfc21.pacbell.net (adsl-63-198-35-122.dsl.snfc21.pacbell.net [63.198.35.122]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9ED9743FB1; Mon, 21 Apr 2003 20:33:55 -0700 (PDT) (envelope-from j_guojun@lbl.gov)
Received: from lbl.gov (localhost.pacbell.net [127.0.0.1]) ESMTP id h3M2ZW8l000382; Mon, 21 Apr 2003 19:35:38 -0700 (PDT) (envelope-from j_guojun@lbl.gov)
Sender: jin@adsl-63-198-35-122.dsl.snfc21.pacbell.net
Message-ID: <3EA4AA74.F9993276@lbl.gov>
Date: Mon, 21 Apr 2003 19:35:32 -0700
From: "Jin Guojun [NCS]"
X-Mailer: Mozilla 4.76 [en] (X11; U; FreeBSD 4.8-RELEASE i386)
X-Accept-Language: zh, zh-CN, en-US, en
MIME-Version: 1.0
To: bj@dc.luth.se
References: <200304210827.h3L8Rx2F032265@dc.luth.se>
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
cc: freebsd-hackers@freebsd.org
cc: freebsd-performance@freebsd.org
Subject: Re: patch for test (Was: tcp_output starving -- is due to mbuf get delay?)
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning
X-List-Received-Date: Tue, 22 Apr 2003 03:33:58 -0000

It is hard to compare your netstat outputs because the NetBSD output is so
short. It is too short to tell whether NetBSD's TCP output had been saturated
(reached the maximum packets/sec). At second 3 it reached 18.6 Kpkt/s or
28 KB/s, which means MTU = 28 KB / 18.5 = 1500, right? FreeBSD seems to have
done a better job: at second 2 it had already reached 72 KB/s. The pkt/s is
lower because you had jumbo frames.

Setting net.inet.tcp.liondmask=7 has doubled your TCP window from 1,314,022
to 2,204,667, which is a fully opened cwnd for a 22 ms, 1 Gb/s path. There is
nothing better to be had there. The only thing left chewing your CPU is the
memory copy. My web page shows that we have removed all the mbuf chain
overhead, but there is still a second memory-copy overhead, which can also be
reduced. However, that is no longer just a patch; it requires modifying the
mbuf operations. I am BCCing this to core@freebsd.org, but I am not sure it
will get through.

To remove the second memory copy, the mbuf structure needs another flag --
EOP -- end of packet. With it, in xxx_usr_send(), we can simply copy each
t_maxseg into an mbuf chain, set the EOP bit in the mbuf flags, and then
chain this mbuf into the sb_mb. In tcp_output(), where I modified the mbuf
chain for m_copydata and m_copy, we get rid of those two copy routines and
simply hand the mbuf to the if_queue. Since we just pass the handle, when
the NIC driver passes the mbuf to m_free(), m_free() will do nothing for
these mbufs because EOP is set. Therefore, we reduce the mbuf operations on
both enqueue and m_free(). That leaves only one memory copy. For a system
with a 64-bit PCI chipset, that will not be a bottleneck at all.

Of course, we can reduce even this one copy (to make zero-copy TCP): lock
(wire) down the user buffer, and simply assign the user space to the mcluster
as E_USR_EXT. This may make complete sense, since future computers will have
large memories (at least 1 GB), and applications rarely write a buffer larger
than 1 MB at once (typically 64 KB up to 640 KB), so locking down 0.1% of
total system memory is not a bad thing. If you want something even better,
the new TCP (Lion) stack is going for that goal, but it will not be available
until it has stabilized.

As Terry mentioned, for now you may want to play with the NetBSD TCP stack
first, since you have seen that NetBSD does a better job, and provide some
feedback.
-Jin

Borje Josefsson wrote:

> On Sun, 20 Apr 2003 13:12:42 PDT "Jin Guojun [NCS]" wrote:
>
> > Now the patch is ready. It has been tested on both 4.7 and 4.8.
> > For 4.7, one has to manually add an empty line before the comment prior to the
> > tcp_output() routine.
>
> comment for beginning the tcp_output() in 4.7-RELEASE :-(
>
> > Some more hints for tracing: (net.inet.tcp.liondmask is a bitmap)
> > bit 0-1 (value 1, 2, or 3) is for enabling tcp_output() mbuf chain modification
> > bit 2   (value 4) is for enabling sbappend() mbuf chain modification
> > bit 3   (value 8) is for tcp_input (DO NOT TRY IT, it is not ready).
> >
> > bit 9   (value 512) is for enabling the check routine (dumps errors to /var/log/messages).
> >
> > If you do have a problem, set net.inet.tcp.liondmask to 512 and look at what
> > the log message says. If you would like to know which part is causing the
> > problem or not working properly, set net.inet.tcp.liondmask to 1, 2, 3 or 4
> > to test each module individually.
>
> Thanks!!
>
> This patch definitely works, and gives much higher PPS (32000 instead of
> 19000). This is on a low-end system (PIII 900 MHz with a 33 MHz bus); I'll
> test one of my larger systems later today.
>
> One question though - is there any way of making the code more "aggressive"?
> As you see in the netstat output below, it takes ~35 seconds(!) before
> reaching full speed. On NetBSD I reach max PPS almost immediately. Even if
> we can now (with your patch) utilize the hardware much better, that only
> helps for connections that last a very long time, so that the "ramping"
> time is not significant.
>
> *Note* (in the very last output below) that this seems to be highly
> dependent on RTT. On a 2 ms connection (~50 miles) I reach max PPS almost
> immediately. (I can't explain why I go to 51 kpps and then fall back to
> 35 kpps; this is repeatable.)
>
> Apart from vanilla 4.8R I have set:
>
> kern.ipc.maxsockbuf=8388608
> net.inet.tcp.sendspace=3217968
> net.inet.tcp.recvspace=3217968
> kern.ipc.nmbclusters=8192
>
> And this test is done on a connection with RTT in the order of 22 ms.
>
> --Börje
>
> [ ... the "netstat 1" output for NetBSD, liondmask=7, plain 4.8, and the
> 2 ms RTT connection was quoted in full here; it is identical to the output
> in Borje's message earlier in this thread ... ]

From owner-freebsd-performance@FreeBSD.ORG Tue Apr 22 04:24:25 2003
Return-Path:
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id A740537B401
	for ; Tue, 22 Apr 2003 04:24:25 -0700 (PDT)
Received: from mail.svenskabutiker.se (ns.svenskabutiker.se [212.247.101.67])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 6806543FEA
	for ; Tue, 22 Apr 2003 04:24:24 -0700 (PDT)
	(envelope-from martin@mullet.se)
Received: from mullet.se (h118n1fls31o985.telia.com [213.65.16.118])
	by mail.svenskabutiker.se (Postfix) with ESMTP id A3D221F02;
	Tue, 22 Apr 2003 13:24:21 +0200 (CEST)
Message-ID: <3EA52696.1090308@mullet.se>
Date: Tue, 22 Apr 2003 13:25:10 +0200
From: Martin Nilsson
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.3) Gecko/20030312
X-Accept-Language: sv, en-us, en
MIME-Version: 1.0
To: argo
References: <1050669889.575.12.camel@station.purk.ee>
In-Reply-To: <1050669889.575.12.camel@station.purk.ee>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
cc: freebsd-performance@freebsd.org
Subject: Re: aic7892 trouble
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 22 Apr 2003 11:24:25 -0000

argo wrote:
> I have Adaptec U160 Host-Adapter with 2x Atlas 10k disks. When I copy
> large files (~700MB) the performance is pretty bad. I got only about
> 35MB/sec between those 2 disks.

Only? The Atlas 10K3 disk has an outer STR of ~55MB/s and an inner STR of
~35MB/s. The STR is the rate at which you can read the same track over and
over again. In normal use you will have to reposition the heads from track
to track, and the disk doing the writing will probably not be as fast as
the one reading.
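[Editorial aside: numbers like these are easy to sanity-check with a rough sequential-rate measurement using dd(1). This is only a sketch: the 32 MB size and the temporary path are arbitrary, and for a real disk figure you would read the raw device (the da0 device discussed here) rather than a file that may be served from the buffer cache.]

```shell
# Create a 32 MB file and read it back; wrapping each dd in time(1)
# turns bytes moved into MB/s. Reads may be satisfied from the buffer
# cache, so treat the read-back result as an upper bound on the disk's
# real sustained transfer rate.
f="/tmp/strtest.$$"                  # illustrative scratch path
dd if=/dev/zero of="$f" bs=1048576 count=32 2>/dev/null
dd if="$f" of=/dev/null bs=1048576 2>/dev/null
size=$(wc -c < "$f")                 # 32 * 1048576 = 33554432 bytes moved
rm -f "$f"
```

Writing and reading different files on two disks at once, as in the copy described above, additionally pays seek and rotational costs that a single-stream dd does not show.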
For an in-depth analysis of the Atlas 10K3 see:
http://www.storagereview.com/articles/200107/20010711KW073L8_1.html

What your test shows is that FreeBSD is very efficient and does not degrade
performance in any way. If you want a faster STR you have to buy newer
disks or use RAID1 (or RAID5).

/Martin

> Thanks in advance.
>
> ahc0: port 0xa000-0xa0ff mem
> 0xe2024000-0xe2024fff irq 11 at device 9.0 on pci0
> aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
>
> da0 at ahc0 bus 0 target 0 lun 0
> da0: Fixed Direct Access SCSI-3 device
> da0: 160.000MB/s transfers (80.000MHz, offset 127, 16bit), Tagged

--
Martin Nilsson, CTO & Founder, Mullet Scandinavia AB, Malmö, SWEDEN
E-mail: martin@mullet.se, Phone: +46-(0)708-606170, http://www.mullet.se
Our business is well engineered servers optimized for FreeBSD and Linux.

From owner-freebsd-performance@FreeBSD.ORG Tue Apr 22 07:45:31 2003
Return-Path:
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 8EB2A37B401;
	Tue, 22 Apr 2003 07:45:31 -0700 (PDT)
Received: from snark.ratmir.ru (snark.ratmir.ru [213.24.248.177])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 2639643FBD;
	Tue, 22 Apr 2003 07:45:30 -0700 (PDT)
	(envelope-from alexs@snark.ratmir.ru)
Received: from snark.ratmir.ru (alexs@localhost [127.0.0.1])
	by snark.ratmir.ru (8.12.9/8.12.9) with ESMTP id h3MEjRC2005098;
	Tue, 22 Apr 2003 18:45:28 +0400 (MSD)
	(envelope-from alexs@snark.ratmir.ru)
Received: (from alexs@localhost)
	by snark.ratmir.ru (8.12.9/8.12.9/Submit) id h3MEjRcx005097;
	Tue, 22 Apr 2003 18:45:27 +0400 (MSD)
Date: Tue, 22 Apr 2003 18:45:26 +0400
From: Alex Semenyaka
To: Narvi
Message-ID: <20030422144526.GD4968@snark.ratmir.ru>
References: <20030420011039.GC52081@snark.ratmir.ru>
	<20030422023703.G29990-100000@haldjas.folklore.ee>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
<20030422023703.G29990-100000@haldjas.folklore.ee>
User-Agent: Mutt/1.5.4i
cc: freebsd-hackers@freebsd.org
cc: freebsd-performance@freebsd.org
cc: freebsd-current@freebsd.org
cc: Alex Semenyaka
Subject: Re: /bin/sh and 32-bit arithmetics [CORRECTED]
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 22 Apr 2003 14:45:32 -0000

On Tue, Apr 22, 2003 at 02:39:50AM +0300, Narvi wrote:
> Ahem - wouldn't it be easier to find out *why* the dramatic speed-down
> happens and try to combat it, as opposed to trying to show the
> speed-down is not relevant? There shouldn't be anything inherently that
> much slower in 64 bit shifts...

Once again: that speed-down is the effect of disk operations, which are
ORDERS of magnitude slower than in-core arithmetic. When you run any
external program from disk, those operations are _always_ included. My
point was: since any real script executes at least several external
programs, its total running time will not be affected by substituting
64-bit arithmetic for 32-bit, even when overflow checks are enabled. I am
not concerned about the speed of running the external applications, as I
am solving a completely different problem now.
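[Editorial aside: the 32- vs 64-bit difference under discussion can be seen from the shell itself. A minimal illustration, assuming a shell whose arithmetic is at least 64 bits wide; the loop bound and mask are arbitrary:]

```shell
# With 64-bit (intmax_t) arithmetic this yields 4294967296; a shell
# built with 32-bit arithmetic would wrap instead.
wide=$((1 << 32))

# The shift work itself is pure in-core computation: a thousand masked
# shifts involve no fork/exec and no disk I/O at all, which is why it
# is invisible next to the cost of running external programs.
i=1 n=0
while [ $n -lt 1000 ]; do
    i=$(( (i << 1) & 0x7fffffff ))   # confine to 31 bits; collapses to 0
    n=$((n + 1))
done
```

After 31 shifts the set bit leaves the 31-bit mask, so `i` ends up 0 no matter how long the loop runs.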
SY, Alex

From owner-freebsd-performance@FreeBSD.ORG Wed Apr 23 07:09:32 2003
Return-Path:
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 24CF637B409;
	Wed, 23 Apr 2003 07:09:32 -0700 (PDT)
Received: from HAL9000.homeunix.com (12-233-57-131.client.attbi.com [12.233.57.131])
	by mx1.FreeBSD.org (Postfix) with ESMTP id BF01F43FEC;
	Wed, 23 Apr 2003 07:09:28 -0700 (PDT)
	(envelope-from das@FreeBSD.ORG)
Received: from HAL9000.homeunix.com (localhost [127.0.0.1])
	by HAL9000.homeunix.com (8.12.9/8.12.5) with ESMTP id h3NE9RjC013429;
	Wed, 23 Apr 2003 07:09:27 -0700 (PDT)
	(envelope-from das@FreeBSD.ORG)
Received: (from das@localhost)
	by HAL9000.homeunix.com (8.12.9/8.12.5/Submit) id h3NE9MfS013428;
	Wed, 23 Apr 2003 07:09:22 -0700 (PDT)
	(envelope-from das@FreeBSD.ORG)
Date: Wed, 23 Apr 2003 07:09:22 -0700
From: David Schultz
To: Narvi
Message-ID: <20030423140922.GB13246@HAL9000.homeunix.com>
Mail-Followup-To: Narvi , Alex Semenyaka ,
	freebsd-performance@freebsd.org, freebsd-current@freebsd.org
References: <20030420011039.GC52081@snark.ratmir.ru>
	<20030422023703.G29990-100000@haldjas.folklore.ee>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20030422023703.G29990-100000@haldjas.folklore.ee>
X-Mailman-Approved-At: Wed, 23 Apr 2003 07:35:56 -0700
cc: freebsd-performance@FreeBSD.ORG
cc: freebsd-current@FreeBSD.ORG
cc: Alex Semenyaka
Subject: Re: /bin/sh and 32-bit arithmetics [CORRECTED]
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 23 Apr 2003 14:09:32 -0000

On Tue, Apr 22, 2003, Narvi wrote:
>
> On Sun, 20 Apr 2003, Alex Semenyaka wrote:
>
> > 7   5.20   6.37   3.51   22.50%   i=$(($i<<1))
> > 8   5.25   6.42   3.51   22.27%   i=$(($i<<$m))
> >
> > As you can see, even
> > for an arithmetic-only script the overhead is not too big, except in
> > one case: the shift operation. I decided to investigate whether that is
> > a usual script operation. I went through all the scripts I could find
> > on my FreeBSD box. I searched for them with "locate .sh | grep
> > '\.sh$'". There were a lot of them:
> >
> > $ locate .sh | grep '\.sh$' | wc -l
> > 1637
> >
> > But there was no script that uses the shift operation. Good, but not
> > enough. I took a script that uses arithmetic and does some other job,
> > ttfadmin.sh from the Abiword package. I ran it 10000 times in a loop
> > with both (64-bit and 32-bit) shells. As an argument it received an
> > empty directory, so no work was done: just run, check pars, find no
> > files, exit. It takes 65.35 seconds in the first case and 65.30 seconds
> > in the second one. So the time that arithmetic takes during real script
> > execution is too small in comparison to the total running time
> > (obviously: arithmetic is in-core calculation, while any script usually
> > runs some external programs etc., and at least I/O is involved).
>
> Ahem - wouldn't it be easier to find out *why* the dramatic speed-down
> happens and try to combat it, as opposed to trying to show the
> speed-down is not relevant? There shouldn't be anything inherently that
> much slower in 64 bit shifts...

We're talking about the interpreted Bourne shell here. It's slow by
design, and 64-bit arithmetic is not going to make it significantly
slower for anything other than microbenchmarks. BTW, I'll review the
patches next month when I have some free time, if nobody else jumps on it.
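[Editorial aside: the argument that fork/exec cost swamps arithmetic cost is easy to reproduce. A sketch with arbitrary iteration counts; `expr` stands in for any external program:]

```shell
# Pure in-core arithmetic: no processes are created, so even ten
# thousand iterations finish almost instantly.
i=0
while [ $i -lt 10000 ]; do
    i=$((i + 1))
done

# The same counting, but each step forks and execs an external
# command (expr). Wrapping both loops in time(1) shows this form is
# orders of magnitude slower per iteration, which is why a 32- vs
# 64-bit arithmetic change is invisible in real scripts.
j=0
while [ $j -lt 100 ]; do
    j=$(expr "$j" + 1)
done
```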
From owner-freebsd-performance@FreeBSD.ORG Wed Apr 23 20:02:20 2003
Return-Path:
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id DEF3437B401;
	Wed, 23 Apr 2003 20:02:20 -0700 (PDT)
Received: from flood.ping.uio.no (flood.ping.uio.no [129.240.78.31])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 204EF43FBD;
	Wed, 23 Apr 2003 20:02:20 -0700 (PDT)
	(envelope-from des@ofug.org)
Received: by flood.ping.uio.no (Postfix, from userid 2602) id 342475308;
	Thu, 24 Apr 2003 05:02:17 +0200 (CEST)
X-URL: http://www.ofug.org/~des/
X-Disclaimer: The views expressed in this message do not necessarily
	coincide with those of any organisation or company with which I am or
	have been affiliated.
To: Alex Semenyaka
From: Dag-Erling Smorgrav
Date: Thu, 24 Apr 2003 05:02:17 +0200
In-Reply-To: <20030420004639.GA52081@snark.ratmir.ru> (Alex Semenyaka's
	message of "Sun, 20 Apr 2003 04:46:39 +0400")
Message-ID:
User-Agent: Gnus/5.090015 (Oort Gnus v0.15) Emacs/21.2
References: <20030420004639.GA52081@snark.ratmir.ru>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailman-Approved-At: Wed, 23 Apr 2003 20:10:49 -0700
cc: freebsd-hackers@freebsd.org
cc: freebsd-performance@freebsd.org
cc: freebsd-current@freebsd.org
cc: freebsd-standards@freebsd.org
Subject: Re: tjr@@freebsd.org, imp@freebsd.org, ru@freebsd.org
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 24 Apr 2003 03:02:21 -0000

Alex Semenyaka writes:
> Brief description of what was done: I've changed the arithmetic in
> /bin/sh from 32 bits to 64 bits. There were some doubts whether it
> conforms to the standards: it does; I have sent quotations to
> -standards, and there were no objections.
> A couple of people advised me to use intmax_t and %jd - I've rewritten
> the patch, and now those are used instead of long long and %qd. The
> last question was performance; I will show the results of measurements
> below.

Performance is irrelevant. Anyone who is doing so much arithmetic in the
shell that performance is an issue should take a long hard look at dc(1).
The only issues here are 1) correctness, 2) portability (long long / %qd
is not portable), and 3) standards compliance. You can safely ignore
anyone trying to tell you otherwise.

DES
--
Dag-Erling Smorgrav - des@ofug.org