From: Chris Torek
To: freebsd-hackers@freebsd.org
Subject: sched_ule vs un-spun-up AP
Date: Sun, 29 Jun 2014 03:25:53 -0600
Message-Id: <201406290925.s5T9PrBL035430@elf.torek.net>

This patch is possibly a bit of unnecessary fluff, but it fixes a
panic we see with some rather odd code (said odd code needs to be
redone) that tries to bind a thread to a CPU during startup. On
real hardware, the APs are up by this point and have idle threads.
In a bhyve emulation, however, the APs spin up much later,
relatively speaking, so that when we reach the tdq_notify() code
in sched_ule.c, the target CPU on which the bound thread is
supposed to run is not up yet. pcpu_find(cpu) works fine but its
pc_curthread is NULL. The test to see if the new thread should
run instead causes this:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x356
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff805aaaf2
stack pointer           = 0x28:0xfffff800002a98c0
frame pointer           = 0x28:0xfffff800002a98f0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 3 (evil-bound-thread)
[ thread pid 3 tid 100022 ]
Stopped at      tdq_notify+0x32:        movb    0x356(%rax),%cl
db>

A one-line patch works around it. A better fix would probably be
to make sure the APs are up and running earlier, but I stuck the
below in so that our guys could make progress on their code while
we try to fix this other sort-of-evil early bound thread.

Chris

sched_ule: in tdq_notify, handle un-spun-up CPU

If a thread tries to bind to a CPU that is present on an SMP
system, but has yet to spin up, don't try to look at the thread
priority on that CPU, as there is no thread (not even the idle
thread) running.

diff --git a/sys/kern/sched_ule.c b/sys/kern/sched_ule.c
--- a/sys/kern/sched_ule.c
+++ b/sys/kern/sched_ule.c
@@ -1004,7 +1004,12 @@ tdq_notify(struct tdq *tdq, struct threa
 	cpu = td->td_sched->ts_cpu;
 	pri = td->td_priority;
 	ctd = pcpu_find(cpu)->pc_curthread;
-	if (!sched_shouldpreempt(pri, ctd->td_priority, 1))
+	/*
+	 * Note: ctd==NULL occurs only when binding to a cpu
+	 * that has not yet finished coming up (so it has no
+	 * idle thread).
+	 */
+	if (ctd == NULL || !sched_shouldpreempt(pri, ctd->td_priority, 1))
 		return;
 	if (TD_IS_IDLETHREAD(ctd)) {
 		/*