From owner-svn-src-head@FreeBSD.ORG  Thu Aug  9 19:26:13 2012
From: Alexander Motin <mav@FreeBSD.org>
Date: Thu, 9 Aug 2012 19:26:13 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r239157 - head/sys/kern

Author: mav
Date: Thu Aug  9 19:26:13 2012
New Revision: 239157
URL: http://svn.freebsd.org/changeset/base/239157

Log:
  Rework the r220198 change (by fabient).  I believe it solves the
  problem from the wrong direction.  Before it, if preemption and the
  end of the time slice happened at the same time, the thread was put
  at the head of the run queue, just as for a plain preemption.  That
  could cause a single thread to run for an indefinitely long time.
  r220198 handles this by not clearing TDF_NEEDRESCHED in case of
  preemption, but that causes a delayed context switch every time
  preemption happens, even when it is not needed.

  Solve the problem instead by introducing a scheduler-specific thread
  flag, TDF_SLICEEND, set when the thread's time slice is over and it
  should be put at the tail of the queue.  Using the SW_PREEMPT flag
  for that purpose, as was done before, is simply not informative
  enough to work correctly.

  In my tests this reduces run time deviation (i.e. improves fairness)
  by 2-3 times in cases where several threads share one CPU.

  Reviewed by:	fabient
  MFC after:	2 months
  Sponsored by:	iXsystems, Inc.

Modified:
  head/sys/kern/sched_4bsd.c
  head/sys/kern/sched_ule.c
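As an illustration of why the head-versus-tail distinction matters,
here is a minimal userspace toy model.  It is a sketch under invented
assumptions, not kernel code: the three-entry queue, the tick count,
and the "slice ends every tick" rule are made up to model the worst
pre-change case, where a preemption coincides with every slice end.
Requeueing the expired thread at the head lets it monopolize the CPU;
requeueing it at the tail, as TDF_SLICEEND now forces, rotates the
queue fairly.

/*
 * Toy model: three "threads" share one "CPU"; each tick the scheduler
 * runs the queue head, and the slice ends every tick.  Requeueing the
 * expired thread at the head (the old behaviour when preemption and
 * slice end coincided) starves the others; the tail is fair.
 */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define NTHR    3
#define TICKS   12

static void
simulate(bool slice_end_goes_to_tail)
{
        int q[NTHR] = { 0, 1, 2 };      /* run queue, q[0] is the head */
        int runs[NTHR] = { 0 };
        int cur, t;

        for (t = 0; t < TICKS; t++) {
                /* Run the head and pop it from the queue. */
                cur = q[0];
                memmove(&q[0], &q[1], (NTHR - 1) * sizeof(q[0]));
                runs[cur]++;
                if (slice_end_goes_to_tail) {
                        /* TDF_SLICEEND semantics: go to the tail. */
                        q[NTHR - 1] = cur;
                } else {
                        /* SRQ_PREEMPTED semantics: back to the head. */
                        memmove(&q[1], &q[0], (NTHR - 1) * sizeof(q[0]));
                        q[0] = cur;
                }
        }
        printf("%s: runs = %d %d %d\n",
            slice_end_goes_to_tail ? "tail (fair)  " : "head (unfair)",
            runs[0], runs[1], runs[2]);
}

int
main(void)
{
        simulate(false);        /* prints "head (unfair): runs = 12 0 0" */
        simulate(true);         /* prints "tail (fair)  : runs = 4 4 4" */
        return (0);
}

The real kernel change is of course richer (locking, idle threads, the
SRQ_* flags), but the head/tail decision above is exactly what the new
preempted test in sched_switch() controls.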
Modified: head/sys/kern/sched_4bsd.c
==============================================================================
--- head/sys/kern/sched_4bsd.c  Thu Aug  9 19:22:54 2012  (r239156)
+++ head/sys/kern/sched_4bsd.c  Thu Aug  9 19:26:13 2012  (r239157)
@@ -105,6 +105,7 @@ struct td_sched {
 /* flags kept in td_flags */
 #define TDF_DIDRUN      TDF_SCHED0      /* thread actually ran. */
 #define TDF_BOUND       TDF_SCHED1      /* Bound to one CPU. */
+#define TDF_SLICEEND    TDF_SCHED2      /* Thread time slice is over. */
 
 /* flags kept in ts_flags */
 #define TSF_AFFINITY    0x0001          /* Has a non-"full" CPU set. */
@@ -727,7 +728,7 @@ sched_clock(struct thread *td)
          */
         if (!TD_IS_IDLETHREAD(td) && (--ts->ts_slice <= 0)) {
                 ts->ts_slice = sched_slice;
-                td->td_flags |= TDF_NEEDRESCHED;
+                td->td_flags |= TDF_NEEDRESCHED | TDF_SLICEEND;
         }
 
         stat = DPCPU_PTR(idlestat);
@@ -943,6 +944,7 @@ sched_switch(struct thread *td, struct t
         struct mtx *tmtx;
         struct td_sched *ts;
         struct proc *p;
+        int preempted;
 
         tmtx = NULL;
         ts = td->td_sched;
@@ -964,8 +966,8 @@ sched_switch(struct thread *td, struct t
                 sched_load_rem();
 
         td->td_lastcpu = td->td_oncpu;
-        if (!(flags & SW_PREEMPT))
-                td->td_flags &= ~TDF_NEEDRESCHED;
+        preempted = !(td->td_flags & TDF_SLICEEND);
+        td->td_flags &= ~(TDF_NEEDRESCHED | TDF_SLICEEND);
         td->td_owepreempt = 0;
         td->td_oncpu = NOCPU;
 
@@ -983,7 +985,7 @@ sched_switch(struct thread *td, struct t
         } else {
                 if (TD_IS_RUNNING(td)) {
                         /* Put us back on the run queue. */
-                        sched_add(td, (flags & SW_PREEMPT) ?
+                        sched_add(td, preempted ?
                             SRQ_OURSELF|SRQ_YIELDING|SRQ_PREEMPTED :
                             SRQ_OURSELF|SRQ_YIELDING);
                 }

Modified: head/sys/kern/sched_ule.c
==============================================================================
--- head/sys/kern/sched_ule.c  Thu Aug  9 19:22:54 2012  (r239156)
+++ head/sys/kern/sched_ule.c  Thu Aug  9 19:26:13 2012  (r239157)
@@ -189,6 +189,9 @@ static struct td_sched td_sched0;
 #define SCHED_INTERACT_HALF     (SCHED_INTERACT_MAX / 2)
 #define SCHED_INTERACT_THRESH   (30)
 
+/* Flags kept in td_flags. */
+#define TDF_SLICEEND    TDF_SCHED2      /* Thread time slice is over. */
+
 /*
  * tickincr:            Converts a stathz tick into a hz domain scaled by
  *                      the shift factor.  Without the shift the error rate
@@ -1841,7 +1844,7 @@ sched_switch(struct thread *td, struct t
         struct td_sched *ts;
         struct mtx *mtx;
         int srqflag;
-        int cpuid;
+        int cpuid, preempted;
 
         THREAD_LOCK_ASSERT(td, MA_OWNED);
         KASSERT(newtd == NULL, ("sched_switch: Unsupported newtd argument"));
@@ -1854,8 +1857,8 @@ sched_switch(struct thread *td, struct t
         ts->ts_rltick = ticks;
         td->td_lastcpu = td->td_oncpu;
         td->td_oncpu = NOCPU;
-        if (!(flags & SW_PREEMPT))
-                td->td_flags &= ~TDF_NEEDRESCHED;
+        preempted = !(td->td_flags & TDF_SLICEEND);
+        td->td_flags &= ~(TDF_NEEDRESCHED | TDF_SLICEEND);
         td->td_owepreempt = 0;
         tdq->tdq_switchcnt++;
         /*
@@ -1867,7 +1870,7 @@ sched_switch(struct thread *td, struct t
                 TD_SET_CAN_RUN(td);
         } else if (TD_IS_RUNNING(td)) {
                 MPASS(td->td_lock == TDQ_LOCKPTR(tdq));
-                srqflag = (flags & SW_PREEMPT) ?
+                srqflag = preempted ?
                     SRQ_OURSELF|SRQ_YIELDING|SRQ_PREEMPTED :
                     SRQ_OURSELF|SRQ_YIELDING;
 #ifdef SMP
@@ -2237,7 +2240,7 @@ sched_clock(struct thread *td)
                  * We're out of time, force a requeue at userret().
                  */
                 ts->ts_slice = sched_slice;
-                td->td_flags |= TDF_NEEDRESCHED;
+                td->td_flags |= TDF_NEEDRESCHED | TDF_SLICEEND;
         }
 
         /*
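For anyone who wants to reproduce the fairness measurement, here is
one possible sketch.  It is not the actual test behind the 2-3x
numbers above; the thread count, the duration, and pinning via
cpuset_setaffinity(2) are all arbitrary choices of my own.  The idea
is to pin several CPU-bound threads to a single CPU and compare how
much CPU time each received over a fixed wall-clock interval; the
spread between the best- and worst-served threads is the run time
deviation in question.

/*
 * Sketch: NTHREADS busy-looping threads pinned to CPU 0.  After
 * RUNTIME seconds of wall time, report min/max per-thread CPU time.
 * Build with:  cc -o fairness fairness.c -lpthread
 */
#include <sys/param.h>
#include <sys/cpuset.h>

#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define NTHREADS        4
#define RUNTIME         10      /* seconds of wall time */

static volatile int stop;       /* simple flag; good enough for a test */
static double cputime[NTHREADS];

static void *
worker(void *arg)
{
        int idx = (int)(intptr_t)arg;
        struct timespec ts;

        /* Burn CPU until the main thread raises the stop flag. */
        while (!stop)
                ;

        /* Record how much CPU time this thread actually received. */
        clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts);
        cputime[idx] = ts.tv_sec + ts.tv_nsec / 1e9;
        return (NULL);
}

int
main(void)
{
        pthread_t tid[NTHREADS];
        cpuset_t set;
        double min, max;
        int i;

        /* Pin the whole process to CPU 0 so the threads compete for it. */
        CPU_ZERO(&set);
        CPU_SET(0, &set);
        if (cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_PID, -1,
            sizeof(set), &set) != 0) {
                perror("cpuset_setaffinity");
                return (1);
        }

        for (i = 0; i < NTHREADS; i++)
                pthread_create(&tid[i], NULL, worker, (void *)(intptr_t)i);

        sleep(RUNTIME);
        stop = 1;
        for (i = 0; i < NTHREADS; i++)
                pthread_join(tid[i], NULL);

        min = max = cputime[0];
        for (i = 1; i < NTHREADS; i++) {
                if (cputime[i] < min)
                        min = cputime[i];
                if (cputime[i] > max)
                        max = cputime[i];
        }
        printf("min %.2fs  max %.2fs  spread %.2fs\n", min, max, max - min);
        return (0);
}

With a fair scheduler the spread should stay small relative to
RUNTIME / NTHREADS; a large spread indicates some threads were
repeatedly requeued behind others.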