From owner-freebsd-current@FreeBSD.ORG Tue Aug 31 18:53:14 2010 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0E7FB10656AD for ; Tue, 31 Aug 2010 18:53:14 +0000 (UTC) (envelope-from mdf356@gmail.com) Received: from mail-ey0-f182.google.com (mail-ey0-f182.google.com [209.85.215.182]) by mx1.freebsd.org (Postfix) with ESMTP id 91BFC8FC18 for ; Tue, 31 Aug 2010 18:53:13 +0000 (UTC) Received: by eyx24 with SMTP id 24so4438160eyx.13 for ; Tue, 31 Aug 2010 11:53:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:sender:received :in-reply-to:references:date:x-google-sender-auth:message-id:subject :from:to:cc:content-type:content-transfer-encoding; bh=MFWdhAO8WUELWA/vZKKXgkgfk6GoNeBuStRP2u0f6W0=; b=angDhSSYx6nxa1v5OR0fN/YPGRzEbiEpYNhJm8zEZP0OIi16bmbropHPL2k24heXy4 2InMXe0u3EtWfX8LoBaF4ekiEkq1x+NzXmv9BUn3uHQwi61j34HQT4kqAdaEz62OCGM9 tKJ9D3ohkf+Z4nWZx/qskW2Lf8rWE9zWeleKI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=umHw2Wt6ao6GOUkt9vju9Jv7syAQzkW7bcA6RgEfO9+x2V7b4127579xqlyClHbfVF mGwuoed4TM1KcAdc7wKo/2xVdjJyTj4hGbLJhDpNEl+GFw7NpjbTZiCPK432Ec98NE1Z w/2rU0FExVV/rdV/B/76NsiG6e3CTOQsnfvWk= MIME-Version: 1.0 Received: by 10.213.20.132 with SMTP id f4mr10350885ebb.19.1283280792258; Tue, 31 Aug 2010 11:53:12 -0700 (PDT) Sender: mdf356@gmail.com Received: by 10.213.20.144 with HTTP; Tue, 31 Aug 2010 11:53:12 -0700 (PDT) In-Reply-To: References: <201008261649.20543.jhb@freebsd.org> Date: Tue, 31 Aug 2010 11:53:12 -0700 X-Google-Sender-Auth: S6lYZN4sov8GuzvpV_ZG3JCguBY Message-ID: From: mdf@FreeBSD.org To: freebsd-current@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: Jeff Roberson Subject: Re: sched_pin() bug in SCHED_ULE X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 31 Aug 2010 18:53:14 -0000 On Tue, Aug 31, 2010 at 10:16 AM, wrote: > I recorded the stack any time ts->ts_cpu was set and when a thread was > migrated by sched_switch() I printed out the recorded info. =A0Here's > what I found: > > > XXX bug 67957: moving 0xffffff003ff9b800 from 3 to 1 > [1]: pin 0 state 4 move 3 -> 1 done by 0xffffff000cc44000: > #0 0xffffffff802b36b4 at bug67957+0x84 > #1 0xffffffff802b5dd4 at sched_affinity+0xd4 > #2 0xffffffff8024a707 at cpuset_setthread+0x137 > #3 0xffffffff8024aeae at cpuset_setaffinity+0x21e > #4 0xffffffff804a82df at freebsd32_cpuset_setaffinity+0x4f > #5 0xffffffff80295f49 at isi_syscall+0x99 > #6 0xffffffff804a630e at ia32_syscall+0x1ce > #7 0xffffffff8046dc60 at Xint0x80_syscall+0x60 > [0]: pin 0 state 2 move 0 -> 3 done by 0xffffff000cc44000: > #0 0xffffffff802b36b4 at bug67957+0x84 > #1 0xffffffff802b4ad8 at sched_add+0xe8 > #2 0xffffffff8029b96a at create_thread+0x34a > #3 0xffffffff8029badc at kern_thr_new+0x8c > #4 0xffffffff804a8912 at freebsd32_thr_new+0x122 > #5 0xffffffff80295f49 at isi_syscall+0x99 > #6 0xffffffff804a630e at ia32_syscall+0x1ce > #7 0xffffffff8046dc60 at Xint0x80_syscall+0x60 > > So one thread in the process called cpuset_setaffinity(2), and another > thread in the process was forcibly migrated by the IPI while returning > from a syscall, while it had td_pinned set. > > Given this path, it seems reasonable to me to skip the migrate if we > notice THREAD_CAN_MIGRATE is false. > > Opinions? =A0My debug code is below. =A0I'll try to write a short testcas= e > that exhibits this bug. Just a few more thoughts on this. The check in sched_affinity for THREAD_CAN_MIGRATE is racy. Since witness uses sched_pin, it's not simple to take the THREAD lock around an increment of td_pinned. So I'm looking for suggestions on the best way to fix this issue. My thoughts: 1) add a check in sched_switch() for THREAD_CAN_MIGRATE 2) have WITNESS not use sched_pin, and take the THREAD lock when modifying td_pinned 3) have the IPI_PREEMPT handler notice the thread is pinned (and not trying to bind) and postpone the mi_switch, just like it postpones when a thread is in a critical section. Except for the potential complexity of implementation, I think (2) is the best solution. For those who want to play at home, I have a small test program that exhibits this behavior at http://people.freebsd.org/~mdf/cpu_affinity_test.c. It seems to require 4 or more CPUs to hit the assert. You'll also need to patch the kernel to assert when migrating a pinned thread: Index: kern/sched_ule.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- kern/sched_ule.c (revision 158580) +++ kern/sched_ule.c (working copy) @@ -1888,11 +1889,26 @@ sched_switch(struct thread *td, struct t srqflag =3D (flags & SW_PREEMPT) ? SRQ_OURSELF|SRQ_YIELDING|SRQ_PREEMPTED : SRQ_OURSELF|SRQ_YIELDING; if (ts->ts_cpu =3D=3D cpuid) tdq_add(tdq, td, srqflag); - else + else { + KASSERT(THREAD_CAN_MIGRATE(td) || + (ts->ts_flags & TSF_BOUND) !=3D 0, + ("Thread %p shouldn't migrate!", td)); mtx =3D sched_switch_migrate(tdq, td, srqflag); + } } else { /* This thread must be going to sleep. */ TDQ_LOCK(tdq); mtx =3D thread_lock_block(td); Thanks, matthew