From owner-cvs-all@FreeBSD.ORG  Fri Jun  8 14:50:26 2007
Return-Path: <owner-cvs-all@FreeBSD.ORG>
X-Original-To: cvs-all@freebsd.org
Delivered-To: cvs-all@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id E818316A400;
	Fri,  8 Jun 2007 14:50:25 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail06.syd.optusnet.com.au (mail06.syd.optusnet.com.au
	[211.29.132.187])
	by mx1.freebsd.org (Postfix) with ESMTP id 7E43613C457;
	Fri,  8 Jun 2007 14:50:25 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from besplex.bde.org (c220-239-235-248.carlnfd3.nsw.optusnet.com.au
	[220.239.235.248])
	by mail06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	l58EoCWV018681
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sat, 9 Jun 2007 00:50:15 +1000
Date: Sat, 9 Jun 2007 00:50:15 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: John Baldwin <jhb@freebsd.org>
In-Reply-To: <200706071059.41466.jhb@freebsd.org>
Message-ID: <20070609002752.M13082@besplex.bde.org>
References: <200706051420.l55EKEih018925@repoman.freebsd.org>
	<20070607163724.M7517@besplex.bde.org>
	<20070607180257.P7767@besplex.bde.org>
	<200706071059.41466.jhb@freebsd.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: src-committers@freebsd.org, Kip Macy <kip.macy@gmail.com>,
	cvs-all@freebsd.org, Attilio Rao <attilio@freebsd.org>,
	Bruce Evans <brde@optusnet.com.au>, cvs-src@freebsd.org,
	Kostik Belousov <kostikbel@gmail.com>,
	Jeff Roberson <jroberson@chesapeake.net>
Subject: Re: cvs commit: src/sys/kern kern_mutex.c
X-BeenThere: cvs-all@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: CVS commit messages for the entire tree <cvs-all.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-all>,
	<mailto:cvs-all-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/cvs-all>
List-Post: <mailto:cvs-all@freebsd.org>
List-Help: <mailto:cvs-all-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-all>,
	<mailto:cvs-all-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Jun 2007 14:50:26 -0000

On Thu, 7 Jun 2007, John Baldwin wrote:

> On Thursday 07 June 2007 04:15:13 am Bruce Evans wrote:
>>> The next run will have pagezero resetting its priority when this priority
>>> gets clobbered.
>>
>> That gave only mainly more voluntary context switches (13.5+ million instead
>> of the best observed value of 1.3+ million or the value of 2.9+ million
>> without priority resetting.  It reduced the pagezero time from 30 seconds to
>> 24.  It didn't change the real time significantly.
>
> Hmm, one problem with the non-preemption page zero is that it doesn't yield
> the lock when it voluntarily yields.  I wonder if something like this patch
> would help things for the non-preemption case:

This works well.  It fixed the extra 1.4 million voluntary context switches
and even reduced the number by 10-100k.  The real runtime is still up a bit,
but user+sys+pgzero time is down a bit.

> Index: vm_zeroidle.c
> ===================================================================
> RCS file: /usr/cvs/src/sys/vm/vm_zeroidle.c,v
> retrieving revision 1.45
> diff -u -r1.45 vm_zeroidle.c
> --- vm_zeroidle.c       18 May 2007 07:10:50 -0000      1.45
> +++ vm_zeroidle.c       7 Jun 2007 14:56:02 -0000
> @@ -147,8 +147,10 @@
> #ifndef PREEMPTION
>                        if (sched_runnable()) {
>                                mtx_lock_spin(&sched_lock);
> +                               mtx_unlock(&vm_page_queue_free_mtx);
>                                mi_switch(SW_VOL, NULL);
>                                mtx_unlock_spin(&sched_lock);
> +                               mtx_lock(&vm_page_queue_free_mtx);
>                        }
> #endif
>                } else {

The sched_locks are now of course thread_locks.  I put the vm unlock
before the thread lock since the above order seems to risk a LOR.  That
may have been a mistake -- we would prefer not to be switched after
deciding to do it ourself.

> We could simulate this behavior some by using a critical section to control
> when preemptions happen so that we wait until we drop the lock perhaps to
> allow preemptions.  Something like this:

Simulate?  Do you mean simulate the revious pbehaviour?

I think the voluntary context switch in the above only worked well
because it rarely happened.  Apparently, switches away from pagezero
normally happened due to the previous behaviour when the vm lock is
released.

>> Index: vm_zeroidle.c
> ===================================================================
> RCS file: /usr/cvs/src/sys/vm/vm_zeroidle.c,v
> retrieving revision 1.45
> diff -u -r1.45 vm_zeroidle.c
> --- vm_zeroidle.c       18 May 2007 07:10:50 -0000      1.45
> +++ vm_zeroidle.c       7 Jun 2007 14:58:39 -0000
> @@ -110,8 +110,10 @@
>        if (m != NULL && (m->flags & PG_ZERO) == 0) {
>                vm_pageq_remove_nowakeup(m);
>                mtx_unlock(&vm_page_queue_free_mtx);
> +               critical_exit();
>                pmap_zero_page_idle(m);
>                mtx_lock(&vm_page_queue_free_mtx);
> +               critical_enter();
>                m->flags |= PG_ZERO;
>                vm_pageq_enqueue(PQ_FREE + m->pc, m);
>                ++vm_page_zero_count;

Next I will try this.  I put the critical_exit() before the vm unlock.
mtx_unlock() should be allowed to switch if it wants.  However, we
would prefer to keep context switches disabled in the above -- just drop
the lock so that other CPUs can proceed.

> @@ -141,20 +143,25 @@
>        idlezero_enable = idlezero_enable_default;
>
>        mtx_lock(&vm_page_queue_free_mtx);
> +       critical_enter();
>        for (;;) {
>                if (vm_page_zero_check()) {
>                        vm_page_zero_idle();
> #ifndef PREEMPTION
>                        if (sched_runnable()) {
>                                mtx_lock_spin(&sched_lock);
> +                               mtx_unlock(&vm_page_queue_free_mtx);
>                                mi_switch(SW_VOL, NULL);
>                                mtx_unlock_spin(&sched_lock);
> +                               mtx_lock(&vm_page_queue_free_mtx);
>                        }
> #endif

I added the necessary critical_exit/enter() calls here.

>                } else {
> +                       critical_exit();
>                        wakeup_needed = TRUE;
>                        msleep(&zero_state, &vm_page_queue_free_mtx, 0,
>                            "pgzero", hz * 300);
> +                       critical_enter();
>                }
>        }
> }

Bruce