From owner-freebsd-current@FreeBSD.ORG Thu Aug 26 20:49:31 2010 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5B55D10656A4; Thu, 26 Aug 2010 20:49:31 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 135988FC18; Thu, 26 Aug 2010 20:49:31 +0000 (UTC) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id 9949C46B91; Thu, 26 Aug 2010 16:49:30 -0400 (EDT) Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 65BFE8A03C; Thu, 26 Aug 2010 16:49:29 -0400 (EDT) From: John Baldwin To: freebsd-current@freebsd.org Date: Thu, 26 Aug 2010 16:49:20 -0400 User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100819; KDE/4.4.5; amd64; ; ) References: In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201008261649.20543.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1 (bigwig.baldwin.cx); Thu, 26 Aug 2010 16:49:29 -0400 (EDT) X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-2.6 required=4.2 tests=AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx Cc: mdf@freebsd.org, Jeff Roberson Subject: Re: sched_pin() bug in SCHED_ULE X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Aug 2010 20:49:31 -0000 On Thursday, August 26, 2010 4:03:38 pm mdf@freebsd.org wrote: > Back at the beginning of August I posted about issues with sched_pin() > and witness_warn(): > > http://lists.freebsd.org/pipermail/freebsd-hackers/2010-August/032553.html > > After a lot of debugging I think I've basically found the issue. I > found this bug on stable/7, but looking at the commit log for CURRENT > I don't see any reason the bug doesn't exist there. This analysis is > specific to amd64/i386 but the problem is likely to exist on most > arches. > > The basic problem is that in both tdq_move() and sched_setcpu() can > manipulate the ts->ts_cpu variable of another process, which may be > running. This means that a process running on CPU N may be set to run > on CPU M the next context switch. > > Then, that process may call sched_pin(), then grab a PCPU variable. > An IPI_PREEMPT can come in, causing the thread to call mi_switch() in > ipi_bitmap_handler(). sched_switch() will then notice that ts->ts_cpu > is not PCPU_GET(cpuid), and call sched_switch_migrate(), migrating the > thread to the intended CPU. Thus after sched_pin() and potentially > before using any PCPU variable the process can get migrated to another > CPU, and now it is using a PCPU variable for the wrong processor. > > Given that ts_cpu can be set by other threads, it doesn't seem worth > checking at sched_pin time whether ts_cpu is not the same as td_oncpu, > because to make the check would require taking the thread_lock. > Instead, I propose adding a check for !THREAD_CAN_MIGRATE() before > calling sched_switch_migrate(). However, sched_pin() is also used by > sched_bind(9) to force the thread to stay on the intended cpu. I > would handle this by adding a TSF_BINDING state that is different from > TSF_BOUND. > > The thing that has me wondering whether this is all correct is that I > don't see any checks in tdq_move() or sched_setcpu() for either > TSF_BOUND or THREAD_CAN_MIGRATE(). Perhaps that logic is managed in > the various calling functions; in that case I would propose adding > asserts that the thread is THREAD_CAN_MIGRATE() unless it's marked > TSF_BINDING. > > Does this analysis seem correct? Calling code does check THREAD_CAN_MIGRATE(), but the problem is that it is not safe to call that on anything but curthread as td_pinned is not locked. It is assumed that only curthread would ever check td_pinned, not other threads. I suspect what is happening is that another CPU is seeing a stale value of td_pinned. You could fix this by grabbing the thread lock in sched_pin()/unpin() for ULE, but that is a bit expensive perhaps. You could test your theory by disabling thread stealing btw. There are a few sysctl knobs to turn it off. -- John Baldwin