From owner-freebsd-current@FreeBSD.ORG Tue Aug 31 17:16:01 2010 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D380F10656BE for ; Tue, 31 Aug 2010 17:16:01 +0000 (UTC) (envelope-from mdf356@gmail.com) Received: from mail-ew0-f54.google.com (mail-ew0-f54.google.com [209.85.215.54]) by mx1.freebsd.org (Postfix) with ESMTP id 647FE8FC18 for ; Tue, 31 Aug 2010 17:16:00 +0000 (UTC) Received: by ewy4 with SMTP id 4so4210149ewy.13 for ; Tue, 31 Aug 2010 10:16:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:sender:received :in-reply-to:references:date:x-google-sender-auth:message-id:subject :from:to:cc:content-type; bh=8LBsqPe+v34wWXwopeaLRcZGcIaU4F6D1+6hb/P9dVM=; b=kVI8PGlPlv5WODd3+3cKHxirg1zfIpgr2gkWUeD2tCf/CLDU8pNlQ6WklSGQa2omHm NwpyW1ssYRxMvLGvgwGHGrUiPjhLJa9qFinGvIsl/tjHW2USbo5t94nCq6rmmE8N04iI rf1NH/0I3M2e/YBrSHYHibnIdptB7ABBaKfBA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; b=Tb8C1Rc5dekggRpdEE3fGO2U/DNrtTj9KhffSy/2SfS29UJYIzCmXaxbJb3xv0Z8pa A0opxcCPQMr45UsL4JfoyIK6tg6baytT0WF6vs4gYgRRIA2AeSPUB/rzar0KeE//8C6A 7ANbh9DPV2zGwAZu9bRzaANq6riJSPhBSVql8= MIME-Version: 1.0 Received: by 10.213.7.7 with SMTP id b7mr9915435ebb.30.1283274960203; Tue, 31 Aug 2010 10:16:00 -0700 (PDT) Sender: mdf356@gmail.com Received: by 10.213.20.144 with HTTP; Tue, 31 Aug 2010 10:16:00 -0700 (PDT) In-Reply-To: <201008261649.20543.jhb@freebsd.org> References: <201008261649.20543.jhb@freebsd.org> Date: Tue, 31 Aug 2010 10:16:00 -0700 X-Google-Sender-Auth: 0mwaHw1E8MOtxkGuyPsJnETmWno Message-ID: From: mdf@FreeBSD.org To: freebsd-current@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Cc: Jeff Roberson Subject: Re: sched_pin() bug in SCHED_ULE X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 31 Aug 2010 17:16:01 -0000 I recorded the stack any time ts->ts_cpu was set and when a thread was migrated by sched_switch() I printed out the recorded info. Here's what I found: XXX bug 67957: moving 0xffffff003ff9b800 from 3 to 1 [1]: pin 0 state 4 move 3 -> 1 done by 0xffffff000cc44000: #0 0xffffffff802b36b4 at bug67957+0x84 #1 0xffffffff802b5dd4 at sched_affinity+0xd4 #2 0xffffffff8024a707 at cpuset_setthread+0x137 #3 0xffffffff8024aeae at cpuset_setaffinity+0x21e #4 0xffffffff804a82df at freebsd32_cpuset_setaffinity+0x4f #5 0xffffffff80295f49 at isi_syscall+0x99 #6 0xffffffff804a630e at ia32_syscall+0x1ce #7 0xffffffff8046dc60 at Xint0x80_syscall+0x60 [0]: pin 0 state 2 move 0 -> 3 done by 0xffffff000cc44000: #0 0xffffffff802b36b4 at bug67957+0x84 #1 0xffffffff802b4ad8 at sched_add+0xe8 #2 0xffffffff8029b96a at create_thread+0x34a #3 0xffffffff8029badc at kern_thr_new+0x8c #4 0xffffffff804a8912 at freebsd32_thr_new+0x122 #5 0xffffffff80295f49 at isi_syscall+0x99 #6 0xffffffff804a630e at ia32_syscall+0x1ce #7 0xffffffff8046dc60 at Xint0x80_syscall+0x60 So one thread in the process called cpuset_setaffinity(2), and another thread in the process was forcibly migrated by the IPI while returning from a syscall, while it had td_pinned set. Given this path, it seems reasonable to me to skip the migrate if we notice THREAD_CAN_MIGRATE is false. Opinions? My debug code is below. I'll try to write a short testcase that exhibits this bug. Thanks, matthew Index: kern/sched_ule.c =================================================================== --- kern/sched_ule.c (revision 158580) +++ kern/sched_ule.c (working copy) @@ -697,6 +697,41 @@ return; } +static void +bug67957(struct thread *td) +{ + int idx; + + THREAD_LOCK_ASSERT(td, MA_OWNED); + idx = (td->xxx_idx++ % 5); + stack_save(&td->xxx[idx].td_preempt); + td->xxx[idx].td_moveto = td->td_sched->ts_cpu; + td->xxx[idx].td_movefrom = (td->td_oncpu == NOCPU) ? td->td_lastcpu : td->td_oncpu; + td->xxx[idx].td_statewas = td->td_state; + td->xxx[idx].td_pinned = td->td_pinned; + td->xxx[idx].td_by = curthread; +} + +static void +pr_bug67957(struct thread *td, int idx) +{ + int idx, i; + + printf("XXX bug 67957: moving %p from %d to %d\n", + td, td->td_lastcpu, td->td_sched->ts_cpu); + for (i = 0, idx = td->xxx_idx - 1; + i < 5 && idx >= 0; + i++, idx--) { + printf("[%d]: pin %d state %d move %d -> %d done by %p:\n", + idx, td->xxx[idx % 5].td_pinned, + td->xxx[idx % 5].td_statewas, + td->xxx[idx % 5].td_movefrom, + td->xxx[idx % 5].td_moveto, + td->xxx[idx % 5].td_by); + stack_print_ddb(&td->xxx[idx % 5].td_preempt); + } +} + /* * Move a thread from one thread queue to another. */ @@ -739,6 +774,7 @@ TDQ_UNLOCK(from); sched_rem(td); ts->ts_cpu = cpu; + bug67957(td); td->td_lock = TDQ_LOCKPTR(to); tdq_add(to, td, SRQ_YIELDING); } @@ -971,6 +1007,7 @@ tdq = TDQ_CPU(cpu); td = ts->ts_thread; ts->ts_cpu = cpu; + bug67957(td); /* If the lock matches just return the queue. */ if (td->td_lock == TDQ_LOCKPTR(tdq)) @@ -1890,8 +1964,15 @@ SRQ_OURSELF|SRQ_YIELDING; if (ts->ts_cpu == cpuid) tdq_add(tdq, td, srqflag); - else + else { + if (!THREAD_CAN_MIGRATE(td) && + (ts->ts_flags & TSF_BOUND) == 0) { + pr_bug67957(td, idx); + panic("XXX"); + } mtx = sched_switch_migrate(tdq, td, srqflag); + } + td->xxx_idx = 0; } else { /* This thread must be going to sleep. */ TDQ_LOCK(tdq); @@ -2479,8 +2560,10 @@ * target cpu. */ if (td->td_priority <= PRI_MAX_ITHD && THREAD_CAN_MIGRATE(td) && - curthread->td_intr_nesting_level) + curthread->td_intr_nesting_level) { ts->ts_cpu = cpuid; + bug67957(td); + } if (!THREAD_CAN_MIGRATE(td)) cpu = ts->ts_cpu; else @@ -2590,6 +2673,7 @@ */ cpu = ts->ts_cpu; ts->ts_cpu = sched_pickcpu(td, 0); + bug67957(td); if (cpu != PCPU_GET(cpuid)) ipi_selected(1 << cpu, IPI_PREEMPT); #endif @@ -2613,6 +2697,7 @@ if (PCPU_GET(cpuid) == cpu) return; ts->ts_cpu = cpu; + bug67957(td); /* When we return from mi_switch we'll be on the correct cpu. */ mi_switch(SW_VOL, NULL); #endif Index: sys/proc.h =================================================================== --- sys/proc.h (revision 158580) +++ sys/proc.h (working copy) @@ -68,6 +68,8 @@ #include #include +#include /* XXX bug 67957 */ + /* * One structure allocated per session. * @@ -356,6 +358,16 @@ int td_errno; /* Error returned by last syscall. */ uint64_t td_nfs_root_lin;/* (k) Root lin for vis .snapshot*/ struct osd td_osd; /* (k) Object specific data. */ + + struct { + struct stack td_preempt;/* XXX bug 67957 */ + u_char td_movefrom; + u_char td_moveto;/* XXX bug 67957 */ + u_char td_statewas;/* XXX bug 67957 */ + u_char td_pinned; + struct thread *td_by; + } xxx[5]; + int xxx_idx;/* XXX bug 67957 */ }; struct mtx *thread_lock_block(struct thread *);