Date: Mon, 7 Oct 2013 10:32:57 -0700
From: Adrian Chadd <adrian@freebsd.org>
To: performance@freebsd.org
Subject: Re: Apparent performance regression 8.3@ -> 8.4@r255966?
Message-ID: <CAJ-Vmom2vPcRPpSwYkAzsvztRxx7WEgpYjXPdp+fgn43iexkmA@mail.gmail.com>
In-Reply-To: <20131007172804.GA7641@albert.catwhisker.org>
References: <20131007172804.GA7641@albert.catwhisker.org>
On Oct 7, 2013 1:28 PM, "David Wolfskill" <david@catwhisker.org> wrote:
>
> At work, we have a bunch of machines that developers use to build some
> software. The machines presently run FreeBSD/amd64 8.3-STABLE @rxxxxxx
> (with a few local patches, which have since been committed to stable/8),
> and the software is built within a 32-bit jail.

Well, what's the start commit? Xxxxxx isn't all that helpful. :-)

Once you can establish that, we can go over the commit logs to see what
changed.

You're building exactly the same software on both, right?


-adrian

> The hardware includes 2 packages of 6 physical cores each @3.47GHz
> (Intel X5690); SMT is enabled (so the scheduler sees hw.ncpu == 24).
> The memory on the machines was recently increased from 6GB to 96GB.
>
> I am trying to set up a replacement host environment on my test machine;
> the current environment there is FreeBSD/amd64 8.4-STABLE @r255966; this
> environment achieves a couple of objectives:
>
> * It has no local patches.
> * The known problems (e.g., with mfiutil failing to report battery
>   status accurately) are believed to be addressed appropriately.
>
> However: when I do comparison software builds, the new environment is
> taking about 12% longer to perform the same work (comparing against a
> fair sample of the deployed machines):
>
> Now, when I do these builds, I do so under /usr/bin/time, as well as
> using a bit of "scaffolding" I cobbled up (a few years back) that
> basically samples a bunch of sysctl OIDs periodically (by default,
> every 10 seconds). Once the build is done, I grab the file that has the
> sampled OID data and bring it to my desktop machine to post-process it;
> I generate graphs showing (aggregate and per-core) CPU utilization, as
> well as Load Averages over the course of the build. I can also generate
> graphs that show how the memory statistics that "top" displays vary
> during the course of the build, as well as just about any univariate
> OID, and quite a few simple multivariate OIDs (e.g., kern.cp_time,
> kern.cp_times, and vm.loadavg).
>
> After seeing the above results and poking around looking for
> somewhat-recent tuning information, I ran across a suggestion that the
> default of 2MB for vfs.ufs.dirhash_maxmem was probably on the low side.
> So I started sampling both vfs.ufs.dirhash_maxmem (mostly to make
> documentation of the configuration for a test run easier) and
> vfs.ufs.dirhash_mem (to see what we were actually using). And I tried
> quadrupling vfs.ufs.dirhash_maxmem (to 8MB).
>
> The next time I tried a test build, I found that vfs.ufs.dirhash_mem
> started at about 3.8MB, climbed fairly steadily, then "clipped" at 8MB,
> so I quadrupled it again (to 32MB), and found that it climbed to almost
> 12MB, then dropped precipitously to about 400KB (and oscillated between
> about 400KB & 20MB for the rest of the build, which appears to be the
> "packaging" phase).
>
> Despite that increase in vfs.ufs.dirhash_maxmem, this does not appear
> to have measurably affected the build times.
>
> In examining the CPU utilization graphs, the CPU generally looks about
> 5% busy for the first 15 minutes; this would be bmake determining
> dependency graphs, I expect. For the next 1:20, CPU is about 70% busy
> (~15% system; ~65% user/nice) for about 20 minutes, then drops to about
> 45% busy (~25% system; ~20% user/nice) for the next 20 minutes, and
> that pattern repeats once.
>
> We then see overall CPU use climb to about 60% (~20% system; ~40%
> user/nice) for about 1:20.
>
> Then there's a period of about 2:00 where overall CPU is at about 40%
> (~30% system; ~10% user/nice).
>
> Based on earlier work I did, where I was able to do a similar build in
> a native FreeBSD/i386 (no PAE) environment on the same hardware (but
> when it still only had 6GB RAM) and got the build done in 2:47, I
> believe that getting more work done in parallel during this 2:00 period
> is a key to improving performance: the 2:47 result showed that period
> to be a very busy one for the CPU.
>
> But I am at a loss to understand what might be preventing the work from
> getting done (in a timely fashion).
>
> I believe that there were some commits made to stable/9 (MFCed from
> head) a few months ago to significantly reduce the overhead of using
> jails or using nullfs (or both). And I'm looking forward to being able
> to test that -- but I need to get a "fixed" 8.x environment deployed
> first, and a 12% increase in build times is not something that is
> likely to be well-received.
>
> Help?
>
> Peace,
> david
> --
> David H. Wolfskill                              david@catwhisker.org
> Taliban: Evil cowards with guns afraid of truth from a 14-year old girl.
>
> See http://www.catwhisker.org/~david/publickey.gpg for my public key.
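For anyone wanting to reproduce the kind of measurement David describes
above, the sysctl "scaffolding" could look roughly like the sh sketch
below. This is an illustrative assumption only: the actual script is not
shown in this thread, and the OID list, sampling interval, and log format
here are guesses.

#!/bin/sh
#
# Rough sketch of a periodic sysctl-OID sampler.  The OIDs, interval,
# and log layout are assumptions, not the script from the thread.

INTERVAL=${1:-10}                          # seconds between samples
LOG=${2:-/var/tmp/oid-samples.log}         # where samples accumulate
OIDS="kern.cp_time kern.cp_times vm.loadavg \
      vfs.ufs.dirhash_maxmem vfs.ufs.dirhash_mem"

while :; do
    # Prefix each sample with an epoch timestamp so post-processing can
    # plot the OID values against elapsed build time.
    echo "=== $(date +%s)" >> "$LOG"
    sysctl $OIDS >> "$LOG"
    sleep "$INTERVAL"
done

Start something like that in the background just before kicking off the
build under /usr/bin/time, kill it when the build finishes, and
post-process the log into whatever graphs you prefer.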
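The dirhash experiment described above amounts to raising a single
read/write sysctl; a minimal sketch (values in bytes; the persistence
step via /etc/sysctl.conf is my assumption about the intended follow-up):

# Quadruple vfs.ufs.dirhash_maxmem as described above: 2MB default ->
# 8MB, then 8MB -> 32MB.
sysctl vfs.ufs.dirhash_maxmem=8388608
sysctl vfs.ufs.dirhash_maxmem=33554432

# To keep the setting across reboots, add the chosen value to
# /etc/sysctl.conf, e.g.:
#   vfs.ufs.dirhash_maxmem=33554432

# Watch how much dirhash memory is actually in use during a build:
sysctl vfs.ufs.dirhash_mem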