From owner-freebsd-current@FreeBSD.ORG Tue Nov  6 01:28:32 2012
Message-ID: <509867C5.6030109@freebsd.org>
Date: Tue, 06 Nov 2012 09:28:37 +0800
From: David Xu <davidxu@freebsd.org>
To: Andriy Gapon
Cc: Jeff Roberson, current@freebsd.org
Subject: Re: ULE patch, call for testers
In-Reply-To: <50978353.7090204@FreeBSD.org>
List-Id: Discussions about the use of FreeBSD-current

On 2012/11/05 17:13, Andriy Gapon wrote:
> on 05/11/2012 04:41 David Xu said the following:
>> Another problem I remember is that a thread on the runqueue may be
>> starved, because ULE treats a sleeping thread and a thread waiting on
>> the runqueue differently. If a thread has slept for a while, its
>> priority is boosted after it is woken up; but a thread on the runqueue
>> never has its priority boosted. In essence, they should be treated the
>> same, because both are waiting for the CPU.
>> If I were a thread, I would rather wait on the sleep queue than on the
>> runqueue: in the former case I get a bonus, while in the latter I get
>> nothing. Under heavy load there are many runnable threads, and this
>> unfairness can cause a very low priority thread on the runqueue to be
>> starved. 4BSD does not seem to suffer from this problem, because it
>> also decays the CPU time of threads on the runqueue. I think ULE needs
>> some anti-starvation code to give a thread a shot if it has been
>> waiting on the runqueue for too long.
>
> I also noticed this issue and I've been playing with the following patch.
> Two points:
> o I am not sure if it is ideologically correct
> o it didn't improve the behavior of my workloads much
> In any case, here it is:
>
> - extend accounted interactive sleep time to the point where a thread
>   runs (as opposed to when it is added to the runq)
>
> --- a/sys/kern/sched_ule.c
> +++ b/sys/kern/sched_ule.c
> @@ -1898,8 +1899,21 @@ sched_switch(struct thread *td, struct thread *newtd, int flags)
>  	SDT_PROBE2(sched, , , off_cpu, td, td->td_proc);
>  	lock_profile_release_lock(&TDQ_LOCKPTR(tdq)->lock_object);
>  	TDQ_LOCKPTR(tdq)->mtx_lock = (uintptr_t)newtd;
> +#if 1
> +	/*
> +	 * If we slept for more than a tick update our interactivity and
> +	 * priority.
> +	 */
> +	int slptick;
> +	slptick = newtd->td_slptick;
> +	newtd->td_slptick = 0;
> +	if (slptick && slptick != ticks) {
> +		newtd->td_sched->ts_slptime +=
> +		    (ticks - slptick) << SCHED_TICK_SHIFT;
> +		sched_interact_update(newtd);
> +	}
> +#endif
>  	sched_pctcpu_update(newtd->td_sched, 0);
> -
>  #ifdef KDTRACE_HOOKS
>  	/*
>  	 * If DTrace has set the active vtime enum to anything
> @@ -1990,6 +2004,7 @@ sched_wakeup(struct thread *td)
>  	THREAD_LOCK_ASSERT(td, MA_OWNED);
>  	ts = td->td_sched;
>  	td->td_flags &= ~TDF_CANSWAP;
> +#if 0
>  	/*
>  	 * If we slept for more than a tick update our interactivity and
>  	 * priority.
> @@ -2001,6 +2016,7 @@ sched_wakeup(struct thread *td)
>  		sched_interact_update(td);
>  		sched_pctcpu_update(ts, 0);
>  	}
> +#endif
>  	/* Reset the slice value after we sleep. */
>  	ts->ts_slice = sched_slice;
>  	sched_add(td, SRQ_BORING);
>

What I want is fairness between waiting on the runqueue and waiting on the
sleep queue. Suppose you have N threads on the runqueue, T1, T2, ..., Tn,
and a thread T(n+1) on the sleep queue. If the CPU runs T1 ... Tn in
round-robin fashion, then by the time Tn gets to run, n-1 time slices have
passed. If T(n+1) is woken up at that moment, sched_interact_score() will
give it a higher priority than Tn. This is unfair, because both threads
have spent the same total time waiting for the CPU. Does your patch fix
this problem?

Regards,
David Xu