From owner-freebsd-smp Fri Oct 31 19:55:56 1997
Return-Path: 
Received: (from root@localhost)
          by hub.freebsd.org (8.8.7/8.8.7) id TAA26962
          for smp-outgoing; Fri, 31 Oct 1997 19:55:56 -0800 (PST)
          (envelope-from owner-freebsd-smp)
Received: from time.cdrom.com (root@time.cdrom.com [204.216.27.226])
          by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id TAA26956
          for ; Fri, 31 Oct 1997 19:55:52 -0800 (PST)
          (envelope-from jkh@time.cdrom.com)
Received: from time.cdrom.com (jkh@localhost.cdrom.com [127.0.0.1])
          by time.cdrom.com (8.8.7/8.6.9) with ESMTP id TAA26874
          for ; Fri, 31 Oct 1997 19:55:52 -0800 (PST)
To: smp@freebsd.org
Subject: Some SMP timing tests.
Date: Fri, 31 Oct 1997 19:55:52 -0800
Message-ID: <26870.878356552@time.cdrom.com>
From: "Jordan K. Hubbard" 
Sender: owner-freebsd-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

I'm not quite sure what to make of the following data, my suspicion
being that I'd have to actually add a killer I/O system to my test
machine in order to *truly* see the effects of SMP on the run time of
a make world, since things look pretty I/O bound here. Nonetheless,
having spent 3 days collecting the data, I figured it was worth at
least posting a quick message about it here. :-)

Test machine was a dual P6/200 with Tyan 1668 motherboard, 64MB of
memory, an Adaptec 2940UW controller and IBM DCAS-34330W 4.3GB 5400
RPM drive. Source tree used for testing was from 3.0-971029-SNAP.

Two identical kernels were prepared, one with SMP support and one
without. For each run, a "throw away" make world was done first before
timing a series of make -j<n> worlds, with n going from 1 to 20. Each
run started from a fresh reboot, with no other activity going on
during the time of the runs.

The most interesting thing about these numbers was that at "high job
counts", where one would expect performance to start to actually
degrade due to having too many compiles competing for various system
resources, performance did not fall as expected.
This leads me to believe that our make actually artificially limits
the parallelism number to somewhere below 20. I haven't bothered to
look into make's code more thoroughly to verify this, but that's
certainly what it looks like.

Anyway, here's the graph in postscript:

%!PS-Adobe-2.0
%%Creator: gnuplot
%%DocumentFonts: Helvetica
%%BoundingBox: 50 50 554 770
%%Pages: (atend)
%%EndComments
/gnudict 40 dict def
gnudict begin
/Color false def
/Solid false def
/gnulinewidth 5.000 def
/vshift -46 def
/dl {10 mul} def
/hpt 31.5 def
/vpt 31.5 def
/M {moveto} bind def
/L {lineto} bind def
/R {rmoveto} bind def
/V {rlineto} bind def
/vpt2 vpt 2 mul def
/hpt2 hpt 2 mul def
/Lshow { currentpoint stroke M
  0 vshift R show } def
/Rshow { currentpoint stroke M
  dup stringwidth pop neg vshift R show } def
/Cshow { currentpoint stroke M
  dup stringwidth pop -2 div vshift R show } def
/DL { Color {setrgbcolor Solid {pop []} if 0 setdash }
 {pop pop pop Solid {pop []} if 0 setdash} ifelse } def
/BL { stroke gnulinewidth 2 mul setlinewidth } def
/AL { stroke gnulinewidth 2 div setlinewidth } def
/PL { stroke gnulinewidth setlinewidth } def
/LTb { BL [] 0 0 0 DL } def
/LTa { AL [1 dl 2 dl] 0 setdash 0 0 0 setrgbcolor } def
/LT0 { PL [] 0 1 0 DL } def
/LT1 { PL [4 dl 2 dl] 0 0 1 DL } def
/LT2 { PL [2 dl 3 dl] 1 0 0 DL } def
/LT3 { PL [1 dl 1.5 dl] 1 0 1 DL } def
/LT4 { PL [5 dl 2 dl 1 dl 2 dl] 0 1 1 DL } def
/LT5 { PL [4 dl 3 dl 1 dl 3 dl] 1 1 0 DL } def
/LT6 { PL [2 dl 2 dl 2 dl 4 dl] 0 0 0 DL } def
/LT7 { PL [2 dl 2 dl 2 dl 2 dl 2 dl 4 dl] 1 0.3 0 DL } def
/LT8 { PL [2 dl 2 dl 2 dl 2 dl 2 dl 2 dl 2 dl 4 dl] 0.5 0.5 0.5 DL } def
/P { stroke [] 0 setdash
  currentlinewidth 2 div sub M
  0 currentlinewidth V stroke } def
/D { stroke [] 0 setdash 2 copy vpt add M
  hpt neg vpt neg V hpt vpt neg V
  hpt vpt V hpt neg vpt V closepath stroke
  P } def
/A { stroke [] 0 setdash vpt sub M 0 vpt2 V
  currentpoint stroke M
  hpt neg vpt neg R hpt2 0 V stroke
  } def
/B { stroke [] 0 setdash 2 copy exch hpt sub
  exch vpt add M
  0 vpt2 neg V hpt2 0 V 0 vpt2 V
  hpt2 neg 0 V closepath stroke
  P } def
/C { stroke [] 0 setdash exch hpt sub exch vpt add M
  hpt2 vpt2 neg V currentpoint stroke M
  hpt2 neg 0 R hpt2 vpt2 V stroke } def
/T { stroke [] 0 setdash 2 copy vpt 1.12 mul add M
  hpt neg vpt -1.62 mul V
  hpt 2 mul 0 V
  hpt neg vpt 1.62 mul V closepath stroke
  P } def
/S { 2 copy A C} def
end
% Define the array ISOLatin1Encoding (which specifies how characters are
% encoded for ISO-8859-1 fonts), if it isn't already present (Postscript
% level 2 is supposed to define it, but level 1 doesn't).
systemdict /ISOLatin1Encoding known not {
/ISOLatin1Encoding [
/space /space /space /space /space /space /space /space
/space /space /space /space /space /space /space /space
/space /space /space /space /space /space /space /space
/space /space /space /space /space /space /space /space
/space /exclam /quotedbl /numbersign /dollar /percent /ampersand /quoteright
/parenleft /parenright /asterisk /plus /comma /minus /period /slash
/zero /one /two /three /four /five /six /seven
/eight /nine /colon /semicolon /less /equal /greater /question
/at /A /B /C /D /E /F /G
/H /I /J /K /L /M /N /O
/P /Q /R /S /T /U /V /W
/X /Y /Z /bracketleft /backslash /bracketright /asciicircum /underscore
/quoteleft /a /b /c /d /e /f /g
/h /i /j /k /l /m /n /o
/p /q /r /s /t /u /v /w
/x /y /z /braceleft /bar /braceright /asciitilde /space
/space /space /space /space /space /space /space /space
/space /space /space /space /space /space /space /space
/dotlessi /grave /acute /circumflex /tilde /macron /breve /dotaccent
/dieresis /space /ring /cedilla /space /hungarumlaut /ogonek /caron
/space /exclamdown /cent /sterling /currency /yen /brokenbar /section
/dieresis /copyright /ordfeminine /guillemotleft /logicalnot /hyphen /registered /macron
/degree /plusminus /twosuperior /threesuperior /acute /mu /paragraph /periodcentered
/cedillar /onesuperior /ordmasculine /guillemotright /onequarter /onehalf /threequarters /questiondown
/Agrave /Aacute /Acircumflex /Atilde /Adieresis /Aring /AE /Ccedilla
/Egrave /Eacute /Ecircumflex /Edieresis /Igrave /Iacute /Icircumflex /Idieresis
/Eth /Ntilde /Ograve /Oacute /Ocircumflex /Otilde /Odieresis /multiply
/Oslash /Ugrave /Uacute /Ucircumflex /Udieresis /Yacute /Thorn /germandbls
/agrave /aacute /acircumflex /atilde /adieresis /aring /ae /ccedilla
/egrave /eacute /ecircumflex /edieresis /igrave /iacute /icircumflex /idieresis
/eth /ntilde /ograve /oacute /ocircumflex /otilde /odieresis /divide
/oslash /ugrave /uacute /ucircumflex /udieresis /yacute /thorn /ydieresis
] def
} if
% Override the setfont procedure with a new procedure that re-encodes
% the font to use the ISO Latin-1 style. The body of this procedure
% comes from Section 5.6.1 of the Postscript book.
/realsetfont /setfont load def
/setfont {
    dup length dict begin
        {1 index /FID ne {def} {pop pop} ifelse} forall
        /Encoding ISOLatin1Encoding def
        currentdict
    end
    /Temporary exch definefont
    realsetfont
} bind def
%%EndProlog
%%Page: 1 1
gnudict begin
gsave
50 50 translate
0.100 0.100 scale
90 rotate
0 -5040 translate
0 setgray
/Helvetica findfont 140 scalefont setfont
newpath
LTa
LTb
840 351 M 63 0 V 6066 0 R -63 0 V 756 351 M (60) Rshow
840 864 M 63 0 V 6066 0 R -63 0 V 756 864 M (80) Rshow
840 1377 M 63 0 V 6066 0 R -63 0 V -6150 0 R (100) Rshow
840 1890 M 63 0 V 6066 0 R -63 0 V -6150 0 R (120) Rshow
840 2403 M 63 0 V 6066 0 R -63 0 V -6150 0 R (140) Rshow
840 2917 M 63 0 V 6066 0 R -63 0 V -6150 0 R (160) Rshow
840 3430 M 63 0 V 6066 0 R -63 0 V -6150 0 R (180) Rshow
840 3943 M 63 0 V 6066 0 R -63 0 V -6150 0 R (200) Rshow
840 4456 M 63 0 V 6066 0 R -63 0 V -6150 0 R (220) Rshow
840 4969 M 63 0 V 6066 0 R -63 0 V -6150 0 R (240) Rshow
1163 351 M 0 63 V 0 4555 R 0 -63 V 0 -4695 R (2) Cshow
1808 351 M 0 63 V 0 4555 R 0 -63 V 0 -4695 R (4) Cshow
2453 351 M 0 63 V 0 4555 R 0 -63 V 0 -4695 R (6) Cshow
3098 351 M 0 63 V 0 4555 R 0 -63 V 0 -4695 R (8) Cshow
3743 351 M 0 63 V 0 4555 R 0 -63 V 0 -4695
R (10) Cshow
4388 351 M 0 63 V 0 4555 R 0 -63 V 0 -4695 R (12) Cshow
5034 351 M 0 63 V 0 4555 R 0 -63 V 0 -4695 R (14) Cshow
5679 351 M 0 63 V 0 4555 R 0 -63 V 0 -4695 R (16) Cshow
6324 351 M 0 63 V 0 4555 R 0 -63 V 0 -4695 R (18) Cshow
6969 351 M 0 63 V 0 4555 R 0 -63 V 0 -4695 R (20) Cshow
840 351 M 6129 0 V 0 4618 V -6129 0 V 840 351 L
140 2660 M currentpoint gsave translate 90 rotate 0 0 M
(Time in minutes) Cshow
grestore
3904 71 M (make -j value) Cshow
LT0
6486 4766 M (Single P6/200) Rshow
6570 4766 M 252 0 V
840 3301 M
323 -718 V 322 -103 V 323 -51 V 322 0 V 323 -26 V
322 0 V 323 26 V 323 0 V 322 -26 V 323 0 V
322 0 V 323 0 V 323 26 V 322 -26 V 323 26 V
322 0 V 323 -26 V 322 77 V 323 26 V
LT1
6486 4626 M (Dual P6/200) Rshow
6570 4626 M 252 0 V
840 3584 M
1163 2044 L
322 -154 V 323 -51 V 322 0 V 323 -26 V 322 0 V
323 77 V 323 -51 V 322 -51 V 323 51 V 322 0 V
323 0 V 323 -26 V 322 0 V 323 0 V 322 0 V
323 26 V 322 -26 V 323 0 V
stroke
grestore
end
showpage
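
For readers without a PostScript viewer, the two curves can be turned back into approximate minutes straight from the path data in the graph. What follows is only a rough sketch in Python, not part of the original measurements: it assumes the y axis is linear and calibrates it from two tick marks in the PostScript (y=351 labelled 60, y=864 labelled 80). The names `to_minutes` and `decode` are made up for this illustration, and the results are rounded to whole minutes, so treat them as read off the plot rather than as the raw timings.

```python
# Sketch: recover approximate "make world" times from the gnuplot paths.
# Assumption: linear y axis, calibrated from two tick marks in the
# PostScript (y=351 is labelled 60 minutes, y=864 is labelled 80 minutes).
Y0_UNITS, Y0_MIN = 351, 60
Y1_UNITS, Y1_MIN = 864, 80
UNITS_PER_MIN = (Y1_UNITS - Y0_UNITS) / (Y1_MIN - Y0_MIN)


def to_minutes(y_units):
    """Convert a plot y coordinate to minutes on the graph's y axis."""
    return Y0_MIN + (y_units - Y0_UNITS) / UNITS_PER_MIN


def decode(start_y, deltas):
    """Walk a path from its starting y and return whole minutes per point."""
    ys = [start_y]
    for delta in deltas:
        ys.append(ys[-1] + delta)
    return [round(to_minutes(y)) for y in ys]


# y deltas of the relative line segments ("V") of the "Single P6/200" curve.
SINGLE_START = 3301
SINGLE_DELTAS = [-718, -103, -51, 0, -26, 0, 26, 0, -26, 0,
                 0, 0, 26, -26, 26, 0, -26, 77, 26]

# The "Dual P6/200" curve begins with an absolute lineto (1163 2044 L),
# written here as a delta from the starting point.
DUAL_START = 3584
DUAL_DELTAS = [2044 - 3584, -154, -51, 0, -26, 0, 77, -51, -51, 51,
               0, 0, -26, 0, 0, 0, 26, -26, 0]

single = decode(SINGLE_START, SINGLE_DELTAS)
dual = decode(DUAL_START, DUAL_DELTAS)

for j, (s, d) in enumerate(zip(single, dual), start=1):
    print(f"-j{j:<2}  single {s:3d} min  dual {d:3d} min  ratio {s / d:.2f}")
```

Under these assumptions the -j1 points decode to roughly 175 minutes (single) and 186 minutes (dual). From -j2 upward the single-CPU curve stays around 140 to 147 minutes and the dual-CPU curve around 116 to 126 minutes, which matches the flat tail described above.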