Date: Thu, 5 Aug 2010 13:14:37 -0700
From: mdf@FreeBSD.org
To: John Baldwin <jhb@freebsd.org>
Cc: freebsd-hackers@freebsd.org
Subject: Re: sched_pin() versus PCPU_GET
Message-ID: <AANLkTinfTd1z%2B-zs4vOpqB7gDv9p5EzAx1rx%2BuRcVKC3@mail.gmail.com>
In-Reply-To: <201008051312.25854.jhb@freebsd.org>
References: <AANLkTikY20TxyeyqO5zP3zC-azb748kV-MdevPfm%2B8cq@mail.gmail.com>
 <201008041455.26066.jhb@freebsd.org>
 <AANLkTikvx9c=CjMcE7WsAZrxAxfqcDQEYOa0rWRBBXA5@mail.gmail.com>
 <201008051312.25854.jhb@freebsd.org>
>> (gdb) p panic_cpu
>> $9 = 2
>> (gdb) p dumptid
>> $12 = 100751
>> (gdb) p cpuhead.slh_first->pc_allcpu.sle_next->pc_curthread->td_tid
>> $14 = 100751
>>
>> (gdb) p *cpuhead.slh_first->pc_allcpu.sle_next
>> $6 = {
>> pc_curthread = 0xffffff00716d6960,
>> pc_cpuid = 2,
>> pc_spinlocks = 0xffffffff80803198,
>>
>> (gdb) p lock_list
>> $2 = (struct lock_list_entry *) 0xffffffff80803fb0
>>
>> (gdb) p *cpuhead.slh_first->pc_allcpu.sle_next->pc_allcpu.sle_next->pc_allcpu.sle_next
>> $8 = {
>> pc_curthread = 0xffffff0005479960,
>> pc_cpuid = 0,
>> pc_spinlocks = 0xffffffff80803fb0,
>>
>> I.e. we're dumping on CPU 2, but the lock_list pointer that was saved
>> in the dump matches that of CPU 0.
>
> Can you print out the tid's for the two curthreads? It's not impossible that
> the thread migrated after calling panic. In fact we force threads to CPU 0
> during shutdown.

dumptid matches the td_tid of pc_curthread for CPU 2; both are printed
above (100751). The lock_list local variable matches pc_spinlocks in the
PCPU for CPU 0, whose pc_curthread has tid:

(gdb) p cpuhead.slh_first->pc_allcpu.sle_next->pc_allcpu.sle_next->pc_allcpu.sle_next->pc_curthread->td_tid
$2 = 100005
(gdb) p cpuhead.slh_first->pc_allcpu.sle_next->pc_allcpu.sle_next->pc_allcpu.sle_next->pc_curthread->td_proc->p_comm
$3 = "idle: cpu0\000\000\000\000\000\000\000\000\000"

Note that lock_list->ll_count is now 0, but of course it wasn't when we
panicked. Also, the panic message showed "exclusive spin mutex sched
lock 0 (sched lock) r = 0 (0xffffffff806cf640) locked @
/build/mnt/src/sys/kern/sys_generic.c:826"; i.e. the lock was the one
for CPU 0 as well. If we truly were returning to user space with that
lock held, it would still be held and we'd still be on CPU 0.
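
The cross-check done by hand in gdb above can be sketched as a small C
program. This is only an illustration under stated assumptions: struct
pcpu_sketch, its next field, and the three functions are hypothetical
stand-ins for struct pcpu, pc_allcpu.sle_next, and the manual list walk;
they are not kernel code. Only the two PCPUs shown in the dump (CPU 2 and
CPU 0) are modeled, with the tids printed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct lock_list_entry {
    int ll_count;
};

struct pcpu_sketch {
    unsigned int pc_cpuid;
    int curthread_tid;                    /* stands in for pc_curthread->td_tid */
    struct lock_list_entry *pc_spinlocks;
    struct pcpu_sketch *next;             /* stands in for pc_allcpu.sle_next */
};

/* Return the id of the CPU whose pc_spinlocks equals ptr, or -1. */
static int
cpu_owning_lock_list(const struct pcpu_sketch *head,
    const struct lock_list_entry *ptr)
{
    for (const struct pcpu_sketch *pc = head; pc != NULL; pc = pc->next)
        if (pc->pc_spinlocks == ptr)
            return ((int)pc->pc_cpuid);
    return (-1);
}

/* Return the id of the CPU whose current thread has the given tid, or -1. */
static int
cpu_running_tid(const struct pcpu_sketch *head, int tid)
{
    for (const struct pcpu_sketch *pc = head; pc != NULL; pc = pc->next)
        if (pc->curthread_tid == tid)
            return ((int)pc->pc_cpuid);
    return (-1);
}

/*
 * Rebuild the situation from the dump: the panicking thread (dumptid
 * 100751) is curthread on CPU 2, but the saved lock_list pointer is
 * CPU 0's pc_spinlocks. Returns 1 when the two CPUs differ.
 */
static int
demo_mismatch(void)
{
    struct lock_list_entry ll_cpu0 = { 0 };
    struct lock_list_entry ll_cpu2 = { 0 };
    struct pcpu_sketch cpu0 = { 0, 100005, &ll_cpu0, NULL };
    struct pcpu_sketch cpu2 = { 2, 100751, &ll_cpu2, &cpu0 };
    const struct lock_list_entry *lock_list = &ll_cpu0;
    int dumptid = 100751;

    int running = cpu_running_tid(&cpu2, dumptid);
    int owner = cpu_owning_lock_list(&cpu2, lock_list);

    /* Both lookups must find an entry for the comparison to mean anything. */
    assert(running != -1 && owner != -1);
    return (running != owner);
}
```

Calling demo_mismatch() returns 1 here, i.e. the thread that panicked is
current on one CPU while the lock list it examined belongs to another.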
Cheers,
matthew
