Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 01 Apr 2014 09:40:37 +0300
From:      Andriy Gapon <avg@FreeBSD.org>
To:        Idwer Vollering <vidwer@gmail.com>, freebsd-fs@FreeBSD.org
Subject:   Re: ZFS panic: spin lock held too long
Message-ID:  <533A5F65.7020800@FreeBSD.org>
In-Reply-To: <CAPp9Orm0V91pbCS=Yygd1zuFkcq6pAWbxosWSAz8jUWGX3bxEg@mail.gmail.com>
References:  <CAPp9OrnFL-bNbocA%2B%2BUqSb%2BW79QHr5=q_ZFQdod3N8SZ8=iOFQ@mail.gmail.com> <53391F6C.9070208@FreeBSD.org> <CAPp9Orm0V91pbCS=Yygd1zuFkcq6pAWbxosWSAz8jUWGX3bxEg@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
on 31/03/2014 14:56 Idwer Vollering said the following:
> 2014-03-31 9:55 GMT+02:00 Andriy Gapon <avg@freebsd.org>:
>> on 30/03/2014 21:43 Idwer Vollering said the following:
>>> Unread portion of the kernel message buffer:
>>> spin lock 0xffffffff814fa030 (smp rendezvous) held by
>>> 0xfffff8000fb42920 (tid 100428) too long
>>
>> Please note the tid and obtain a stack trace for that thread.
>> You can switch to the thread using 'tid' command in kgdb.
> 
> Like this?

Yes.

> ==== vmcore.0 ====
> 
> (kgdb) tid 100428
> [Switching to thread 266 (Thread 100428)]#0  0xffffffff80c7f478 in
> cpustop_handler ()
>     at /usr/src/sys/amd64/amd64/mp_machdep.c:1432
> 1432            savectx(&stoppcbs[cpu]);
> (kgdb) bt
> #0  0xffffffff80c7f478 in cpustop_handler () at
> /usr/src/sys/amd64/amd64/mp_machdep.c:1432
> #1  0xffffffff80c7f43f in ipi_nmi_handler () at
> /usr/src/sys/amd64/amd64/mp_machdep.c:1417
> #2  0xffffffff80c8db52 in trap (frame=0xfffffe020b2fdf30) at
> /usr/src/sys/amd64/amd64/trap.c:211
> #3  0xffffffff80c757d3 in nmi_calltrap () at
> /usr/src/sys/amd64/amd64/exception.S:505
> #4  0xffffffff80c7ed42 in smp_tlb_shootdown (vector=<value optimized
> out>, pmap=<value optimized out>,
>     addr1=<value optimized out>, addr2=18446741875183812608) at cpufunc.h:309

So, this thread is stuck waiting on some CPU(s) doing TLB shootdown.
This must mean that that CPU is stuck doing something.
I can not provide exact instructions on how to find out which CPU is stuck and
what it is doing, but you could try to start with examining output of 'thread
apply all bt'.

> #5  0xffffffff80c8066c in pmap_invalidate_range (pmap=<value optimized
> out>, sva=<value optimized out>,
>     eva=<value optimized out>) at /usr/src/sys/amd64/amd64/pmap.c:1441
> #6  0xffffffff80c81d73 in pmap_remove (pmap=0xffffffff81520958,
> sva=18446741875183812608, eva=18446741875183812608)
>     at /usr/src/sys/amd64/amd64/pmap.c:3698
> #7  0xffffffff80b0fc53 in kmem_unback (object=0xffffffff814fecf8,
> addr=18446741875183804416, size=8192)
>     at /usr/src/sys/vm/vm_kern.c:401
> #8  0xffffffff80b0fd44 in kmem_free (vmem=0xffffffff8147a300,
> addr=18446741875183804416, size=8192)
>     at /usr/src/sys/vm/vm_kern.c:421
> #9  0xffffffff80b08d6c in uma_large_free (slab=0xfffff801a62e7640) at
> /usr/src/sys/vm/uma_core.c:1097
> #10 0xffffffff80898d17 in free (addr=<value optimized out>,
> mtp=0xffffffff81397420)
>     at /usr/src/sys/kern/kern_malloc.c:599
> #11 0xffffffff808f06b6 in sbuf_delete (s=0xfffffe0215613718) at
> /usr/src/sys/kern/subr_sbuf.c:761
> #12 0xffffffff808735cb in sysctl_kern_proc_filedesc (oidp=<value
> optimized out>, arg1=<value optimized out>,
>     arg2=-8791816929280, req=<value optimized out>) at
> /usr/src/sys/kern/kern_descrip.c:3540
> #13 0xffffffff808bae0f in sysctl_root (arg1=<value optimized out>,
> arg2=<value optimized out>)
>     at /usr/src/sys/kern/kern_sysctl.c:1497
> #14 0xffffffff808bb3c8 in userland_sysctl (td=<value optimized out>,
> name=0xfffffe02156138b0,
>     namelen=<value optimized out>, old=<value optimized out>,
> oldlenp=<value optimized out>,
>     inkernel=<value optimized out>, new=<value optimized out>,
> retval=<value optimized out>, flags=0)
>     at /usr/src/sys/kern/kern_sysctl.c:1607
> #15 0xffffffff808bb1b4 in sys___sysctl (td=0xfffff8000fb42920,
> uap=0xfffffe02156139c0)
>     at /usr/src/sys/kern/kern_sysctl.c:1533
> #16 0xffffffff80c8ef87 in amd64_syscall (td=0xfffff8000fb42920,
> traced=0) at subr_syscall.c:134
> #17 0xffffffff80c7567b in Xfast_syscall () at
> /usr/src/sys/amd64/amd64/exception.S:391
> #18 0x000000080102b73a in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> 
> ==== vmcore.1 ====
> 
> (kgdb) tid 100396
> [Switching to thread 263 (Thread 100396)]#0  0xffffffff80c7f478 in
> cpustop_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1432
> 1432            savectx(&stoppcbs[cpu]);
> (kgdb) bt
> #0  0xffffffff80c7f478 in cpustop_handler () at
> /usr/src/sys/amd64/amd64/mp_machdep.c:1432
> #1  0xffffffff80c7f43f in ipi_nmi_handler () at
> /usr/src/sys/amd64/amd64/mp_machdep.c:1417
> #2  0xffffffff80c8db52 in trap (frame=0xfffffe020b305f30) at
> /usr/src/sys/amd64/amd64/trap.c:211
> #3  0xffffffff80c757d3 in nmi_calltrap () at
> /usr/src/sys/amd64/amd64/exception.S:505
> #4  0xffffffff80c7ed42 in smp_tlb_shootdown (vector=<value optimized
> out>, pmap=<value optimized out>, addr1=<value optimized out>,
>     addr2=18446741874812944384) at cpufunc.h:309
> #5  0xffffffff80c8066c in pmap_invalidate_range (pmap=<value optimized
> out>, sva=<value optimized out>, eva=<value optimized out>)
>     at /usr/src/sys/amd64/amd64/pmap.c:1441
> #6  0xffffffff80c81d73 in pmap_remove (pmap=0xffffffff81520958,
> sva=18446741874812944384, eva=18446741874812944384) at
> /usr/src/sys/amd64/amd64/pmap.c:3698
> #7  0xffffffff80b0fc53 in kmem_unback (object=0xffffffff814fecf8,
> addr=18446741874812936192, size=8192) at /usr/src/sys/vm/vm_kern.c:401
> #8  0xffffffff80b0fd44 in kmem_free (vmem=0xffffffff8147a300,
> addr=18446741874812936192, size=8192) at /usr/src/sys/vm/vm_kern.c:421
> #9  0xffffffff80b08d6c in uma_large_free (slab=0xfffff800acffcf00) at
> /usr/src/sys/vm/uma_core.c:1097
> #10 0xffffffff80898d17 in free (addr=<value optimized out>,
> mtp=0xffffffff81a2cc40) at /usr/src/sys/kern/kern_malloc.c:599
> #11 0xffffffff81890bb9 in zfsdev_ioctl () from /boot/kernel/zfs.ko
> #12 0xffffffff807ac16f in devfs_ioctl_f (fp=0xfffff800ac0f18a0, com=0,
> data=0x0, cred=<value optimized out>, td=0xfffffe00078c6000)
>     at /usr/src/sys/fs/devfs/devfs_vnops.c:757
> #13 0xffffffff808fdeee in kern_ioctl (td=0xfffff8000e525920, fd=<value
> optimized out>, com=786678) at file.h:319
> #14 0xffffffff808fdc6f in sys_ioctl (td=0xfffff8000e525920,
> uap=0xfffffe02155e39c0) at /usr/src/sys/kern/sys_generic.c:702
> #15 0xffffffff80c8ef87 in amd64_syscall (td=0xfffff8000e525920,
> traced=0) at subr_syscall.c:134
> #16 0xffffffff80c7567b in Xfast_syscall () at
> /usr/src/sys/amd64/amd64/exception.S:391
> #17 0x00000008019e308a in ?? ()


-- 
Andriy Gapon



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?533A5F65.7020800>