Date: Wed, 26 Nov 2003 11:50:37 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Colin Percival <colin.percival@wadham.ox.ac.uk>
Cc: freebsd-current@freebsd.org
Subject: Re: 40% slowdown with dynamic /bin/sh
Message-ID: <200311261950.hAQJobLV089730@apollo.backplane.com>
References: <00a701c3b33c$f798c5e0$b9844051@insultant.net> <200311251214.23290.doconnor@gsoft.com.au> <5.0.2.1.1.20031126054607.02e7f148@popserver.sfu.ca>

:At 00:23 26/11/2003 -0500, Michael Edenfield wrote:
:>Static /bin/sh:
:>   real    385m29.977s
:>   user    111m58.508s
:>   sys     93m14.450s
:>
:>Dynamic /bin/sh:
:>   real    455m44.852s
:>   user    113m17.807s
:>   sys     103m16.509s
:
:   Given that user+sys << real in both cases, it looks like you're running
:out of memory; it's not surprising that dynamic linking has an increased
:cost in such circumstances, since reading the diverse files into memory
:will take longer than reading a single static binary.
:   I doubt many systems will experience this sort of performance delta.
:
:Colin Percival

    It definitely looks memory related, but the system isn't necessarily
    'running out' of memory.  It could simply be that having less memory
    available for caching files is causing more disk I/O to occur.  It
    should be possible to quantify this by doing a full timing of the
    build ( /usr/bin/time -l ), which theoretically includes I/O ops.

    Dynamically linked code definitely dirties more anonymous memory than
    static, and definitely accesses more shared file pages.  The difference
    is going to depend on the complexity of the program.

    How much this affects system performance depends on the situation.  If
    the system has significant idle cycles available the impact should not
    be too serious, but if it doesn't then the overhead will drag down the
    supply of pre-zero'd pages (even if the program is exec'd, does
    something real quick, and exits).

    I have included a program below that prints the delta free page count
    and the delta zero-fill count once a second.  This can be used to
    estimate anonymous memory use.  Run the program and let it stabilize.
    Be sure that the system is idle.  Then run the target program (it needs
    to stick around, it can't just exec and exit), then exit the target
    program and repeat.  Leave several seconds in between invocation, exit,
    and repeat to allow the system to stabilize.

    Note that it may take several runs to get reliable information, since
    the program is measuring anonymous memory use for the whole system.
    Also note that shared pages will not be measured by this program, only
    the number of dirtied anonymous pages.  If on an idle system the program
    is not reporting '0 0' then your system isn't idle :-).

    The main indicator is the negative jump in 'freepg' when the target
    program is invoked.  The zfod count will be a subset of that, indicating
    the number of zero-fill pages requested (versus program text/data COW
    pages, which do not need zero'd pages but still eat anonymous memory for
    the duration of the target program).

    When I tested it with a static and dynamic /bin/sh on 4.8 I got (looking
    at 'freepg') 20 pages for the static binary and 50 pages for the dynamic
    binary.  So a dynamic /bin/sh eats 30 * 4K = 120K more anonymous memory
    than a static /bin/sh.  In the same test I got 12 ZFOD faults for the
    static binary and 34 ZFOD faults for the dynamic binary, which means
    that 22 additional pre-zero'd pages (88K) are being allocated in the
    dynamic case.

    If /bin/sh is exec'd a lot in a situation where the system is otherwise
    not idle, this will impact the number of pre-zero'd pages available on
    the system.  Each exec of a dynamic /bin/sh eats 22 additional pages
    (88K) worth of zero-fill.  Each resident copy of (exec'd) /bin/sh eats
    120K more dirty anonymous memory.

    make buildworld -j 1 may have as many as a dozen /bin/sh's exec'd at any
    given moment (impact 120K each) depending on where in the build it is.
    -j 2 and so forth will have even more.  This will impact your system
    relative to the amount of total system memory you have.  The more system
    memory you have, the smaller the percentage impact.

                /bin/sh                     /bin/csh
                --------------              -----------------------
    static      freepg -19  zfod 12         freepg -140 zfod 129
    dynamic     freepg -50  zfod 34         freepg -167 zfod 149

                /usr/bin/make (note that make is static by default)
                --------------
    static      freepg -33  zfod 27
    dynamic     freepg -51  zfod 44

    As you can see, the issue becomes less significant on a percentage
    basis with larger programs that already allocate more incidental memory.

    Also, to my surprise, I found that 'make' was already static.  It would
    seem that this issue was recognized long ago.  bzip2, chflags, make,
    and objformat are compiled statically even though they reside in
    /usr/bin.

                                                -Matt

/*
 * print delta free pages and zfod requests once a second.  Leave running
 * while testing other programs.  Note: ozfod is not displayed.  ozfod is
 * a subset of zfod, just as zfod deltas are a subset of v_free_count
 * allocations.
 */
#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>
#include <unistd.h>

int
main(int ac, char **av)
{
    int fc1;
    int zfod1;
    int fc2;
    int zfod2;
    size_t fclen;

    /* take the initial samples that the first delta is measured against */
    fclen = sizeof(fc1);
    sysctlbyname("vm.stats.vm.v_free_count", &fc1, &fclen, NULL, 0);
    fclen = sizeof(zfod1);
    sysctlbyname("vm.stats.vm.v_zfod", &zfod1, &fclen, NULL, 0);
    for (;;) {
        /* resample both counters */
        fclen = sizeof(fc2);
        sysctlbyname("vm.stats.vm.v_free_count", &fc2, &fclen, NULL, 0);
        fclen = sizeof(zfod2);
        sysctlbyname("vm.stats.vm.v_zfod", &zfod2, &fclen, NULL, 0);
        /* print the change since the previous sample */
        printf("freepg %-4d zfod %-4d\n", fc2 - fc1, zfod2 - zfod1);
        sleep(1);
        /* the current sample becomes the baseline for the next pass */
        fc1 = fc2;
        zfod1 = zfod2;
    }
    return(0);
}