From: David Xu
To: freebsd-arch@freebsd.org
Cc: Kip Macy, Alan Cox, arch@freebsd.org
Date: Thu, 23 Nov 2006 12:05:55 +0800
Subject: Re: superpage plans

On Thursday 23 November 2006 11:25, Bruce Evans wrote:
> On Wed, 22 Nov 2006, Alan Cox wrote:
> > There is only one caveat.  Idle-time page prezeroing is not supported.
> > However, ever since the VM system emerged from the Giant kernel lock,
> > I've seen little or no benefit from it. ...
>
> That's probably because PREEMPTION is broken and the brokenness turns
> idle-time page prezeroing into a pessimization.  Without PREEMPTION I
> see much the same benefits from idle-time page prezeroing as in RELENG_4
> -- a speedup of a few percent for makeworld.
> E.g., for makeworld -j4 of a RELENG_4 with a -current i386 SMP kernel
> and a ~5.2 userland on a Turion X2 2GHz:
>
> %%%
> vm.idlezero_enable=0:
> --------------------------------------------------------------
> > >>> elf make world completed on Thu Nov 23 12:01:56 EST 2006
> >                    (started Thu Nov 23 11:50:59 EST 2006)
> --------------------------------------------------------------
>       656.36 real       815.59 user       194.92 sys
>      23572  maximum resident set size
>       1164  average shared memory size
>       1212  average unshared data size
>        128  average unshared stack size
>   14202993  page reclaims
>       6911  page faults
>          0  swaps
>      14686  block input operations
>       4647  block output operations
>      77645  messages sent
>          0  messages received
>      35459  signals received
>     838638  voluntary context switches
>     391631  involuntary context switches
>
> vm.idlezero_enable=1:
> --------------------------------------------------------------
> > >>> elf make world completed on Thu Nov 23 12:35:54 EST 2006
> >                    (started Thu Nov 23 12:25:07 EST 2006)
> --------------------------------------------------------------
>       647.19 real       814.16 user       185.69 sys
>      23572  maximum resident set size
>       1168  average shared memory size
>       1220  average unshared data size
>        128  average unshared stack size
>   14202807  page reclaims
>       6958  page faults
>          0  swaps
>      14534  block input operations
>       4689  block output operations
>      77466  messages sent
>          0  messages received
>      35456  signals received
>     847575  voluntary context switches
>     397783  involuntary context switches
> %%%
>
> With idlezero enabled and PREEMPTION not enabled in the above, pgzero
> runs in actual idle time for 14 seconds and reduces both the real and
> sys times by 9 seconds (1.5% of real time and 5% of system time).
>
> With idlezero enabled and PREEMPTION enabled (details not shown),
> PREEMPTION doesn't actually work but pgzero depends on it working, so
> pgzero runs for much longer than in the above, with all the extra time
> stolen from non-idle time.
> In my makeworld benchmarks, this gives
> total benefits that are negative and about the same magnitude as the
> positive ones without PREEMPTION.  PREEMPTION gives some other negative
> benefits for makeworld, but the others are smaller, at least without
> any userland idle priority threads that want to run all the time.
>
> The system for the above tests has a fairly large write bandwidth
> (5GB/sec for movnt*) so it benefits from idle-time page prezeroing
> less than most systems.  I've seen it taking and saving 3% of the time
> for makeworld (60 seconds out of 1800) on UP systems with similar CPU
> speeds but slower memory and pagezero() not optimized to use movnt*.
> UP systems benefit less than SMP ones since they have a lower percentage
> of idle time.
>
> Of course the possible savings are less if the system is less often
> idle, but makeworld -j4 on an SMP system leaves a lot of time idle,
> especially when it runs mkdep and perl serially.  I don't know if
> pgzero is running mainly in bursts in the time left idle by mkdep in
> the above, but guess not since it limits itself to not zeroing very
> many pages to avoid thrashing caches.  Perhaps it should not limit
> itself so much when the zeroing is nontemporal.
>
> Bruce

I think that on SMP, the BSD scheduler code does not preempt an idle
thread on a remote CPU; this might explain why page_zero uses so much
time on SMP.  maybe_preempt() only works for the current CPU.

David Xu
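
To make the remote-CPU point concrete, here is a toy model in Python.  It is
only a sketch under stated assumptions and is not the FreeBSD scheduler code:
the names Cpu, wake_local_only, wake_any_cpu and scenario, and the priority
values, are all made up for illustration.  It assumes that a larger number
means a lower priority and that a wakeup only compares the new thread against
the thread on the CPU doing the wakeup.

```python
from dataclasses import dataclass

IDLE_PRIO = 255  # assumption: larger number means lower priority


@dataclass
class Cpu:
    thread: str  # name of the thread currently running on this CPU
    prio: int    # its priority


def wake_local_only(cpus, current, thread, prio):
    """Preempt only the CPU doing the wakeup; remote CPUs are never examined."""
    if prio < cpus[current].prio:
        cpus[current] = Cpu(thread, prio)
        return current
    return None  # the woken thread stays queued


def wake_any_cpu(cpus, current, thread, prio):
    """Hypothetical alternative: preempt the CPU running the lowest-priority thread.

    `current` is unused here; it is kept so both functions share a signature.
    """
    victim = max(range(len(cpus)), key=lambda i: cpus[i].prio)
    if prio < cpus[victim].prio:
        cpus[victim] = Cpu(thread, prio)
        return victim
    return None


def scenario():
    # CPU 0 runs a compiler; CPU 1 runs the idle-priority page zeroing thread.
    return [Cpu("cc", 100), Cpu("pagezero", IDLE_PRIO)]


if __name__ == "__main__":
    cpus = scenario()
    print(wake_local_only(cpus, 0, "make", 120), cpus[1].thread)
    cpus = scenario()
    print(wake_any_cpu(cpus, 0, "make", 120), cpus[1].thread)
```

In this model the local-only check leaves the page zeroing thread running on
CPU 1 while "make" waits, because "make" does not beat "cc" on CPU 0.  The
hypothetical any-CPU check would displace the zeroing thread instead.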