From: John Baldwin
Date: Sun, 18 May 2014 11:48:42 -0400
To: Chris Torek, freebsd-hackers@freebsd.org
Subject: Re: bad call to sched_bind() => hang, even with INVARIANTS

On 5/15/14, 1:20 PM, Chris Torek wrote:
> I was poking around with a bhyve emulation, where the emulation
> has only one CPU but the real systems have more.
>
> In our real-system code we had a sched_bind() that just assumed there
> were 2 or more CPUs, instead of just the 1. This caused the entire
> system to hang. (Note: using SCHED_ULE.)
>
> It's not immediately obvious to me what went wrong "underneath" to
> cause the whole-system hang, but clearly it is wrong to attempt to
> pin a thread to a CPU that does not exist. Should sched_bind()
> have a KASSERT in it to make sure that the cpu argument is
> sensible? (Or maybe even something a little more aggressive?)

A KASSERT() is probably a good idea.

--
John Baldwin
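
For illustration only, the kind of check discussed above can be modeled in user space roughly as follows. This is a sketch, not kernel code and not the actual sched_bind() implementation: cpu_is_sensible(), model_sched_bind(), the max_id parameter, and the present_mask bitmask are all hypothetical names, and the standard assert() stands in for KASSERT().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: a CPU id is "sensible" if it lies in
 * [0, max_id] and its bit is set in present_mask, a bitmask of
 * the CPUs that actually exist.
 */
bool
cpu_is_sensible(int cpu, int max_id, unsigned long present_mask)
{
	if (cpu < 0 || cpu > max_id)
		return (false);
	/* Avoid shifting by more bits than the mask holds. */
	if ((size_t)cpu >= sizeof(present_mask) * CHAR_BIT)
		return (false);
	return (((present_mask >> cpu) & 1UL) != 0);
}

/*
 * Hypothetical stand-in for a bind routine: refuse, loudly, to
 * bind to a CPU that does not exist. assert() models KASSERT()
 * here; like KASSERT() without INVARIANTS, it compiles away when
 * NDEBUG is defined.
 */
void
model_sched_bind(int cpu, int max_id, unsigned long present_mask)
{
	assert(cpu_is_sensible(cpu, max_id, present_mask) &&
	    "model_sched_bind: invalid cpu");
	/* The real binding logic would go here. */
}
```

With a single-CPU system modeled as max_id 0 and present_mask 0x1, model_sched_bind(0, 0, 0x1UL) passes, while model_sched_bind(1, 0, 0x1UL) aborts at the assertion instead of proceeding with a nonexistent CPU.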