From: Zachary Loafman
Date: Wed, 30 Sep 2009 19:40:51 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-7@freebsd.org
Subject: svn commit: r197652 - stable/7/sys/kern
Message-Id: <200909301940.n8UJep9X024249@svn.freebsd.org>
List-Id: SVN commit messages for only the 7-stable src tree

Author: zml
Date: Wed Sep 30 19:40:51 2009
New Revision: 197652
URL: http://svn.freebsd.org/changeset/base/197652

Log:
sched_ule in stable/7 has a bug (introduced in r180607) where a thread that is running often will appear to not be running much at all.

sched_ule has a much less accurate mechanism for determining how much various threads are running. Every tick, hardclock_cpu() calls sched_tick(), and the currently running thread gets its ts_ticks incremented.
Whenever an event of interest happens to a thread, the ts_ticks value may be decayed; it is supposed to be a rough running average over the last 10 seconds. ts_ltick therefore records the last tick at which we looked at decaying ts_ticks.

The increment in sched_tick() was slightly buggy on SMP, because a thread could be incremented on two different CPUs in the same tick, in the rare case where it was moved from a CPU that had already run sched_tick() this tick to one that had not. The fix that was used relied on ts_ltick and only incremented ts_ticks if ts_ltick was not from the current tick.

That fix is itself buggy: any time the thread began running on a CPU in the current tick, ts_ltick would already have been set to ticks, so if the thread was still running at sched_tick() we would not increment. An otherwise idle system with a single process hogging the CPU would therefore look like all threads were at 0%. The threads not running really are at 0%, and the hog is not getting its ts_ticks incremented because it went through some other runq stats that set ts_ltick. On a 2-way SMP the thread used to get shuffled regularly between CPUs (I think fallout from this bug), so it would appear a little over 50% busy.

The fix is to use a separate variable to record when the last sched_tick() increment happened.
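
To make the failure mode above concrete, here is a minimal user-space sketch of the two guards. It is not the kernel code: struct toy_sched, toy_touch(), toy_sched_tick() and toy_run_hog() are hypothetical names invented for illustration, the value of SCHED_TICK_SHIFT is assumed, and the model assumes the worst case in which some other bookkeeping path stamps ts_ltick on every tick before the clock tick is processed.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value for illustration; the real one is defined in sched_ule.c. */
#define SCHED_TICK_SHIFT 10

/* Toy stand-in for the pctcpu fields of struct td_sched. */
struct toy_sched {
	int ts_ltick;    /* last tick any pctcpu bookkeeping looked at us */
	int ts_incrtick; /* last tick sched_tick() incremented us */
	int ts_ticks;    /* accumulated tick count, shifted */
};

/*
 * Hypothetical helper: models any non-sched_tick() path that stamps
 * ts_ltick with the current tick (e.g. the thread starting to run).
 */
static void
toy_touch(struct toy_sched *ts, int ticks)
{
	ts->ts_ltick = ticks;
}

/*
 * Per-tick increment. With use_incrtick false this uses the old guard
 * (ts_ltick); with true it uses the separate ts_incrtick variable.
 */
static void
toy_sched_tick(struct toy_sched *ts, int ticks, bool use_incrtick)
{
	int last = use_incrtick ? ts->ts_incrtick : ts->ts_ltick;

	/* Avoid incrementing more than once in a single tick. */
	if (last == ticks)
		return;
	ts->ts_ticks += 1 << SCHED_TICK_SHIFT;
	ts->ts_incrtick = ticks;
	ts->ts_ltick = ticks;
}

/*
 * Run a CPU hog for nticks ticks, touching it before every clock tick,
 * and return how many whole ticks it was credited with.
 */
static int
toy_run_hog(int nticks, bool use_incrtick)
{
	struct toy_sched ts = { 0, 0, 0 };

	assert(nticks >= 0);
	for (int ticks = 1; ticks <= nticks; ticks++) {
		toy_touch(&ts, ticks);
		toy_sched_tick(&ts, ticks, use_incrtick);
	}
	return (ts.ts_ticks >> SCHED_TICK_SHIFT);
}
```

In this model, toy_run_hog(100, false) credits the hog with 0 ticks, while toy_run_hog(100, true) credits all 100. Calling toy_sched_tick() twice with the same tick value still increments only once under the new guard, which is the SMP double-count protection the original check was meant to provide.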

Submitted by:	Matthew Fleming (matthew.fleming at isilon.com)
Reviewed by:	zml, dfr
Approved by:	dfr (mentor)

Modified:
  stable/7/sys/kern/sched_ule.c

Modified: stable/7/sys/kern/sched_ule.c
==============================================================================
--- stable/7/sys/kern/sched_ule.c	Wed Sep 30 19:19:53 2009	(r197651)
+++ stable/7/sys/kern/sched_ule.c	Wed Sep 30 19:40:51 2009	(r197652)
@@ -101,6 +101,7 @@ struct td_sched {
 	u_int		ts_runtime;	/* Number of ticks we were running */
 	/* The following variables are only used for pctcpu calculation */
 	int		ts_ltick;	/* Last tick that we were running on */
+	int		ts_incrtick;	/* Last tick that we incremented on */
 	int		ts_ftick;	/* First tick that we were running on */
 	int		ts_ticks;	/* Tick count */
 #ifdef SMP
@@ -2075,6 +2076,7 @@ sched_fork_thread(struct thread *td, str
 	 */
 	ts2->ts_ticks = ts->ts_ticks;
 	ts2->ts_ltick = ts->ts_ltick;
+	ts2->ts_incrtick = ts->ts_incrtick;
 	ts2->ts_ftick = ts->ts_ftick;
 	child->td_user_pri = td->td_user_pri;
 	child->td_base_user_pri = td->td_base_user_pri;
@@ -2266,10 +2268,11 @@ sched_tick(void)
 	 * Ticks is updated asynchronously on a single cpu.  Check here to
 	 * avoid incrementing ts_ticks multiple times in a single tick.
 	 */
-	if (ts->ts_ltick == ticks)
+	if (ts->ts_incrtick == ticks)
 		return;
 	/* Adjust ticks for pctcpu */
 	ts->ts_ticks += 1 << SCHED_TICK_SHIFT;
+	ts->ts_incrtick = ticks;
 	ts->ts_ltick = ticks;
 	/*
 	 * Update if we've exceeded our desired tick threshhold by over one