Date: Mon, 15 Apr 1996 12:34:47 -0700 (MST)
From: Terry Lambert <terry@lambert.org>
To: nash@mcs.com
Cc: hackers@FreeBSD.ORG
Subject: Re: Unices are created equal, but...
Message-ID: <199604151934.MAA09356@phaeton.artisoft.com>
In-Reply-To: <199604142039.PAA04761@zen.nash.org> from "Alex Nash" at Apr 14, 96 03:39:52 pm

> The execl throughput test was a complete massacre, with Linux more
> than an order of magnitude faster.  Does anyone familiar with the
> internals of exec know why?

The magic selection path has been deoptimized from the design.  This
accounts for about 20% of the difference.

Significant wins can be had by marking the environment pages copy on
write and simply pointing to them.  Part of this has to do with the
insistence that we continue to allow manual manipulation of the
environment.  In reality, the environment manipulation we do is
pretty screwed for reasons of backward compatibility.  The allocation
phase alone costs ~10%, with a total of ~30% for environment crap.

Part of the Linux strategy is prefork allocation.  That is, the
overhead is hidden, not eliminated, by putting it elsewhere.  We
could do the same thing, but I would not recommend the same
implementation.  Linux has traditionally suffered mmap() problems,
and I have traced at least one to the way pages are marked on fork.

Finally, the version of Linux used does not have the shared image
startup code in crt0.o.  The amount of time from "start" to "first
output" is much smaller.  This is fixable by going to ELF -- John
Polstra has details.

In this particular case, the Linux ELF implementation is "broken"
(quoted for SEF's political sensibilities 8-)) and cannot use the
module loader with a statically mapped ld.so (which could save some
significant load time for shared executables).  So, currently, this
is a win available only to BSD (but not exercised).

I haven't bothered with the environment crap, since I believe that
all environment manipulation should go through shared library calls,
which in BSD should be converted to system calls to do logical name
table manipulation, and fix the environment cruft once and for all.

Personally, I'm not very concerned at all about the times reported.
Put in the new pipe changes that Bruce did and up the pipe buffer
size to 8k to see significant improvement in what is reported as
"exec" times by this benchmark.

Byte benchmarks are useless.

					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
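
For readers who want to see what an "exec throughput" number is made of,
here is a minimal sketch of a fork/execl/wait loop.  It is not the Byte
benchmark's source; the function names, the choice of /bin/sh as the
target, and the use of gettimeofday() are all assumptions made for
illustration.  Every cycle pays for the costs discussed above: image
selection, copying the environment and arguments, and the startup code
that runs before the new program does anything.

```c
#include <stdio.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run `iterations` fork + execl + wait cycles.
 * Each child execs "/bin/sh -c 'exit 0'" (an assumed, trivial target).
 * Returns the number of children that exited with status 0.
 */
int run_exec_cycles(int iterations)
{
    int ok = 0;

    for (int i = 0; i < iterations; i++) {
        pid_t pid = fork();

        if (pid < 0) {
            perror("fork");
            break;
        }
        if (pid == 0) {
            /* Child: replace the image; the environment is inherited. */
            execl("/bin/sh", "sh", "-c", "exit 0", (char *)NULL);
            _exit(127);             /* reached only if execl failed */
        }

        /* Parent: wait for this child and check its exit status. */
        int status = 0;
        if (waitpid(pid, &status, 0) == pid &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}

/*
 * Hypothetical helper: wall-clock cycles per second for the loop above.
 * Returns -1.0 if any cycle failed or no time could be measured.
 */
double exec_cycles_per_second(int iterations)
{
    struct timeval start, end;

    gettimeofday(&start, NULL);
    int ok = run_exec_cycles(iterations);
    gettimeofday(&end, NULL);

    double elapsed = (double)(end.tv_sec - start.tv_sec) +
                     (double)(end.tv_usec - start.tv_usec) / 1e6;

    if (ok != iterations || elapsed <= 0.0)
        return -1.0;
    return (double)iterations / elapsed;
}
```

Calling exec_cycles_per_second(1000) from a small main() and printing
the result gives a rough cycles-per-second figure.  Note that the figure
includes fork and wait time as well as exec time, so it says little
about where the time actually goes.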