From owner-svn-src-all@freebsd.org Sat Nov 12 00:14:14 2016 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 347F2C3A69E; Sat, 12 Nov 2016 00:14:14 +0000 (UTC) (envelope-from jhb@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 116C8C14; Sat, 12 Nov 2016 00:14:14 +0000 (UTC) (envelope-from jhb@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id uAC0EDAt014780; Sat, 12 Nov 2016 00:14:13 GMT (envelope-from jhb@FreeBSD.org) Received: (from jhb@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id uAC0EDQs014779; Sat, 12 Nov 2016 00:14:13 GMT (envelope-from jhb@FreeBSD.org) Message-Id: <201611120014.uAC0EDQs014779@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: jhb set sender to jhb@FreeBSD.org using -f From: John Baldwin Date: Sat, 12 Nov 2016 00:14:13 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r308564 - head/sys/kern X-SVN-Group: head MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Nov 2016 00:14:14 -0000 Author: jhb Date: Sat Nov 12 00:14:13 2016 New Revision: 308564 URL: https://svnweb.freebsd.org/changeset/base/308564 Log: Don't place threads on the run queue after waking up other CPUs. The other CPU might resume and see a still-empty runq and go back to sleep before sched_add() adds the thread to the runq. This results in a lost wakeup and a potential hang if the system is otherwise completely idle. The race originated due to a micro-optimization (my fault) in 4BSD in that it avoided putting a thread on the run queue if the scheduler was going to preempt to the new thread. To avoid complexity while fixing this race, just drop this optimization. 4BSD now always sets the "owepreempt" flag when a preemption is warranted and defers the actual preemption to the thread_unlock of the caller the same as ULE. MFC after: 2 weeks Sponsored by: Netflix Modified: head/sys/kern/sched_4bsd.c Modified: head/sys/kern/sched_4bsd.c ============================================================================== --- head/sys/kern/sched_4bsd.c Fri Nov 11 23:28:07 2016 (r308563) +++ head/sys/kern/sched_4bsd.c Sat Nov 12 00:14:13 2016 (r308564) @@ -308,9 +308,8 @@ maybe_resched(struct thread *td) /* * This function is called when a thread is about to be put on run queue * because it has been made runnable or its priority has been adjusted. It - * determines if the new thread should be immediately preempted to. If so, - * it switches to it and eventually returns true. If not, it returns false - * so that the caller may place the thread on an appropriate run queue. + * determines if the new thread should preempt the current thread. If so, + * it sets td_owepreempt to request a preemption. */ int maybe_preempt(struct thread *td) @@ -356,29 +355,8 @@ maybe_preempt(struct thread *td) return (0); #endif - if (ctd->td_critnest > 1) { - CTR1(KTR_PROC, "maybe_preempt: in critical section %d", - ctd->td_critnest); - ctd->td_owepreempt = 1; - return (0); - } - /* - * Thread is runnable but not yet put on system run queue. - */ - MPASS(ctd->td_lock == td->td_lock); - MPASS(TD_ON_RUNQ(td)); - TD_SET_RUNNING(td); - CTR3(KTR_PROC, "preempting to thread %p (pid %d, %s)\n", td, - td->td_proc->p_pid, td->td_name); - mi_switch(SW_INVOL | SW_PREEMPT | SWT_PREEMPT, td); - /* - * td's lock pointer may have changed. We have to return with it - * locked. - */ - spinlock_enter(); - thread_unlock(ctd); - thread_lock(td); - spinlock_exit(); + CTR0(KTR_PROC, "maybe_preempt: scheduling preemption"); + ctd->td_owepreempt = 1; return (1); #else return (0); @@ -1332,6 +1310,12 @@ sched_add(struct thread *td, int flags) ts->ts_runq = &runq; } + if ((td->td_flags & TDF_NOLOAD) == 0) + sched_load_add(); + runq_add(ts->ts_runq, td, flags); + if (cpu != NOCPU) + runq_length[cpu]++; + cpuid = PCPU_GET(cpuid); if (single_cpu && cpu != cpuid) { kick_other_cpu(td->td_priority, cpu); @@ -1348,18 +1332,10 @@ sched_add(struct thread *td, int flags) } if (!forwarded) { - if ((flags & SRQ_YIELDING) == 0 && maybe_preempt(td)) - return; - else + if (!maybe_preempt(td)) maybe_resched(td); } } - - if ((td->td_flags & TDF_NOLOAD) == 0) - sched_load_add(); - runq_add(ts->ts_runq, td, flags); - if (cpu != NOCPU) - runq_length[cpu]++; } #else /* SMP */ { @@ -1393,23 +1369,11 @@ sched_add(struct thread *td, int flags) CTR2(KTR_RUNQ, "sched_add: adding td_sched:%p (td:%p) to runq", ts, td); ts->ts_runq = &runq; - /* - * If we are yielding (on the way out anyhow) or the thread - * being saved is US, then don't try be smart about preemption - * or kicking off another CPU as it won't help and may hinder. - * In the YIEDLING case, we are about to run whoever is being - * put in the queue anyhow, and in the OURSELF case, we are - * putting ourself on the run queue which also only happens - * when we are about to yield. - */ - if ((flags & SRQ_YIELDING) == 0) { - if (maybe_preempt(td)) - return; - } if ((td->td_flags & TDF_NOLOAD) == 0) sched_load_add(); runq_add(ts->ts_runq, td, flags); - maybe_resched(td); + if (!maybe_preempt(td)) + maybe_resched(td); } #endif /* SMP */