From owner-freebsd-stable@freebsd.org Mon Nov 14 09:08:48 2016
Subject: Re: Freebsd 11.0 RELEASE - ZFS deadlock
From: Andriy Gapon <avg@FreeBSD.org>
To: Henri Hennebert, freebsd-stable@FreeBSD.org
Cc: Konstantin Belousov
Date: Mon, 14 Nov 2016 11:07:48 +0200

On 13/11/2016 15:28, Henri Hennebert wrote:
> On 11/13/2016 11:06, Andriy Gapon wrote:
>> On 12/11/2016 14:40, Henri Hennebert wrote:
>>> I attach it
>>
>> Thank you!
>>
>> So, these two threads are trying to get the lock in the exclusive mode:
>>
>> Thread 687 (Thread 101243):
>> #0  sched_switch (td=0xfffff800b642b500, newtd=0xfffff8000285ea00, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
>> #1  0xffffffff80561ae2 in mi_switch (flags=<value optimized out>, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:455
>> #2  0xffffffff805ae8da in sleepq_wait (wchan=0x0, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:646
>> #3  0xffffffff8052f854 in sleeplk (lk=<value optimized out>, flags=<value optimized out>, ilk=<value optimized out>, wmesg=0xffffffff813be535 "zfs", pri=<value optimized out>, timo=51) at /usr/src/sys/kern/kern_lock.c:222
>> #4  0xffffffff8052f39d in __lockmgr_args (lk=<value optimized out>, flags=<value optimized out>, ilk=<value optimized out>, wmesg=<value optimized out>, pri=<value optimized out>, timo=<value optimized out>, file=<value optimized out>, line=<value optimized out>) at /usr/src/sys/kern/kern_lock.c:958
>> #5  0xffffffff80616a8c in vop_stdlock (ap=<value optimized out>) at lockmgr.h:98
>> #6  0xffffffff8093784d in VOP_LOCK1_APV (vop=<value optimized out>, a=<value optimized out>) at vnode_if.c:2087
>> #7  0xffffffff8063c5b3 in _vn_lock (vp=<value optimized out>, flags=548864, file=<value optimized out>, line=<value optimized out>) at vnode_if.h:859
>> #8  0xffffffff8062a5f7 in vget (vp=0xfffff80049c2c000, flags=548864, td=0xfffff800b642b500) at /usr/src/sys/kern/vfs_subr.c:2523
>> #9  0xffffffff806118b9 in cache_lookup (dvp=<value optimized out>, vpp=<value optimized out>, cnp=<value optimized out>, tsp=<value optimized out>, ticksp=<value optimized out>) at /usr/src/sys/kern/vfs_cache.c:686
>> #10 0xffffffff806133dc in vfs_cache_lookup (ap=<value optimized out>) at /usr/src/sys/kern/vfs_cache.c:1081
>> #11 0xffffffff80935777 in VOP_LOOKUP_APV (vop=<value optimized out>, a=<value optimized out>) at vnode_if.c:127
>> #12 0xffffffff8061cdf1 in lookup (ndp=<value optimized out>) at vnode_if.h:54
>> #13 0xffffffff8061c492 in namei (ndp=<value optimized out>) at /usr/src/sys/kern/vfs_lookup.c:306
>> #14 0xffffffff80509395 in kern_execve (td=<value optimized out>, args=<value optimized out>, mac_p=0x0) at /usr/src/sys/kern/kern_exec.c:443
>> #15 0xffffffff80508ccc in sys_execve (td=0xfffff800b642b500, uap=0xfffffe010182cb80) at /usr/src/sys/kern/kern_exec.c:218
>> #16 0xffffffff808d449e in amd64_syscall (td=<value optimized out>, traced=0) at subr_syscall.c:135
>> #17 0xffffffff808b7ddb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396
>>
>> Thread 681 (Thread 101147):
>> #0  sched_switch (td=0xfffff80065f4e500, newtd=0xfffff8000285f000, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
>> #1  0xffffffff80561ae2 in mi_switch (flags=<value optimized out>, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:455
>> #2  0xffffffff805ae8da in sleepq_wait (wchan=0x0, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:646
>> #3  0xffffffff8052f854 in sleeplk (lk=<value optimized out>, flags=<value optimized out>, ilk=<value optimized out>, wmesg=0xffffffff813be535 "zfs", pri=<value optimized out>, timo=51) at /usr/src/sys/kern/kern_lock.c:222
>> #4  0xffffffff8052f39d in __lockmgr_args (lk=<value optimized out>, flags=<value optimized out>, ilk=<value optimized out>, wmesg=<value optimized out>, pri=<value optimized out>, timo=<value optimized out>, file=<value optimized out>, line=<value optimized out>) at /usr/src/sys/kern/kern_lock.c:958
>> #5  0xffffffff80616a8c in vop_stdlock (ap=<value optimized out>) at lockmgr.h:98
>> #6  0xffffffff8093784d in VOP_LOCK1_APV (vop=<value optimized out>, a=<value optimized out>) at vnode_if.c:2087
>> #7  0xffffffff8063c5b3 in _vn_lock (vp=<value optimized out>, flags=548864, file=<value optimized out>, line=<value optimized out>) at vnode_if.h:859
>> #8  0xffffffff8062a5f7 in vget (vp=0xfffff80049c2c000, flags=548864, td=0xfffff80065f4e500) at /usr/src/sys/kern/vfs_subr.c:2523
>> #9  0xffffffff806118b9 in cache_lookup (dvp=<value optimized out>, vpp=<value optimized out>, cnp=<value optimized out>, tsp=<value optimized out>, ticksp=<value optimized out>) at /usr/src/sys/kern/vfs_cache.c:686
>> #10 0xffffffff806133dc in vfs_cache_lookup (ap=<value optimized out>) at /usr/src/sys/kern/vfs_cache.c:1081
>> #11 0xffffffff80935777 in VOP_LOOKUP_APV (vop=<value optimized out>, a=<value optimized out>) at vnode_if.c:127
>> #12 0xffffffff8061cdf1 in lookup (ndp=<value optimized out>) at vnode_if.h:54
>> #13 0xffffffff8061c492 in namei (ndp=<value optimized out>) at /usr/src/sys/kern/vfs_lookup.c:306
>> #14 0xffffffff80509395 in kern_execve (td=<value optimized out>, args=<value optimized out>, mac_p=0x0) at /usr/src/sys/kern/kern_exec.c:443
>> #15 0xffffffff80508ccc in sys_execve (td=0xfffff80065f4e500, uap=0xfffffe01016b8b80) at /usr/src/sys/kern/kern_exec.c:218
>> #16 0xffffffff808d449e in amd64_syscall (td=<value optimized out>, traced=0) at subr_syscall.c:135
>> #17 0xffffffff808b7ddb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396
>
> These two threads are innd processes. In core.txt.4:
>
> 8 14789 29165  0 24 4 40040 6612 zfs    DN - 0:00.00 [innd]
> 8 29165     1  0 20 0 42496 6888 select Ds - 0:01.33 [innd]
> 8 49778 29165  0 24 4 40040 6900 zfs    DN - 0:00.00 [innd]
> 8 82034 29165  0 24 4   132    0 zfs    DN - 0:00.00 [innd]
>
> The corresponding 'info threads' entries are:
>
> 687 Thread 101243 (PID=49778: innd) sched_switch (td=0xfffff800b642b500, newtd=0xfffff8000285ea00, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
> 681 Thread 101147 (PID=14789: innd) sched_switch (td=0xfffff80065f4e500, newtd=0xfffff8000285f000, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
> 669 Thread 101250 (PID=82034: innd) sched_switch (td=0xfffff800b6429000, newtd=0xfffff8000285ea00, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
> 665 Thread 101262 (PID=29165: innd) sched_switch (td=0xfffff800b6b54a00, newtd=0xfffff8000285ea00, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
>
> So your missing thread must be 101250:

Oh, I missed this thread... But it looks like another LK_EXCLUSIVE contender rather than the lock owner.

> (kgdb) tid 101250
> [Switching to thread 669 (Thread 101250)]#0  sched_switch (td=0xfffff800b6429000, newtd=0xfffff8000285ea00, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
> 1973            cpuid = PCPU_GET(cpuid);
> Current language:  auto; currently minimal
> (kgdb) bt
> #0  sched_switch (td=0xfffff800b6429000, newtd=0xfffff8000285ea00, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
> #1  0xffffffff80561ae2 in mi_switch (flags=<value optimized out>, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:455
> #2  0xffffffff805ae8da in sleepq_wait (wchan=0x0, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:646
> #3  0xffffffff8052f854 in sleeplk (lk=<value optimized out>, flags=<value optimized out>, ilk=<value optimized out>, wmesg=0xffffffff813be535 "zfs", pri=<value optimized out>, timo=51) at /usr/src/sys/kern/kern_lock.c:222
> #4  0xffffffff8052f39d in __lockmgr_args (lk=<value optimized out>, flags=<value optimized out>, ilk=<value optimized out>, wmesg=<value optimized out>, pri=<value optimized out>, timo=<value optimized out>, file=<value optimized out>, line=<value optimized out>) at /usr/src/sys/kern/kern_lock.c:958
> #5  0xffffffff80616a8c in vop_stdlock (ap=<value optimized out>) at lockmgr.h:98
> #6  0xffffffff8093784d in VOP_LOCK1_APV (vop=<value optimized out>, a=<value optimized out>) at vnode_if.c:2087
> #7  0xffffffff8063c5b3 in _vn_lock (vp=<value optimized out>, flags=525312, file=<value optimized out>, line=<value optimized out>) at vnode_if.h:859
> #8  0xffffffff804e16cf in exec_elf64_imgact (imgp=<value optimized out>) at /usr/src/sys/kern/imgact_elf.c:899
> #9  0xffffffff8050983d in kern_execve (td=<value optimized out>, args=<value optimized out>, mac_p=0x0) at /usr/src/sys/kern/kern_exec.c:602
> #10 0xffffffff80508ccc in sys_execve (td=0xfffff800b6429000, uap=0xfffffe010184fb80) at /usr/src/sys/kern/kern_exec.c:218
> #11 0xffffffff808d449e in amd64_syscall (td=<value optimized out>, traced=0) at subr_syscall.c:135
> #12 0xffffffff808b7ddb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396
> #13 0x000000080217edaa in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb)
>
>>
>> And the original stuck thread wants to get the lock in the shared mode.
>> And there should be another thread that already holds the lock in the shared mode.
>> But I am not able to identify it. I wonder if the original thread could be trying to get the lock recursively...
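
One way to double-check who actually holds that vnode lock might be to decode the lockmgr state of the contended vnode directly in kgdb. This is only a minimal sketch, assuming the vnode address 0xfffff80049c2c000 seen in the vget() frames above; please check the LK_SHARE and related LK_* macros in sys/sys/lockmgr.h of your source tree before trusting the decoding:

(kgdb) set $vp = (struct vnode *)0xfffff80049c2c000
(kgdb) print $vp->v_vnlock->lock_object.lo_name
(kgdb) print/x $vp->v_vnlock->lk_lock
(kgdb) print $vp->v_usecount
(kgdb) print $vp->v_holdcnt

If the LK_SHARE bit is set in lk_lock, the rest of the word encodes the number of shared holders; otherwise the word is essentially the owning thread pointer, which can be matched against the td addresses shown by 'info threads'.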
>>
>> It would be interesting to get more details from thread 101112.
>> You can switch to it using the tid command, you can use 'fr' to select frames, 'info local' and 'info args' to see what variables are available (not optimized out), and then you can print any that look interesting. It would be nice to get a file path and a directory vnode where the lookup is called.
>>
>> Thank you.
>>
> I find nothing interesting:
>
> (kgdb) fr 15
[snip]

Could you please show 'info local' in frame 14? I expected that the 'nd' variable would be defined there and that it may contain some useful information.

> I also tried to get information from the execve of the other threads:
>
> for tid 101250:
> (kgdb) fr 10
> #10 0xffffffff80508ccc in sys_execve (td=0xfffff800b6429000, uap=0xfffffe010184fb80) at /usr/src/sys/kern/kern_exec.c:218
> 218             error = kern_execve(td, &args, NULL);
> (kgdb) print *uap
> $4 = {fname_l_ = 0xfffffe010184fb80 "`\220\217\002\b", fname = 0x8028f9060 <Address 0x8028f9060 out of bounds>,
>     fname_r_ = 0xfffffe010184fb88 "`¶ÿÿÿ\177", argv_l_ = 0xfffffe010184fb88 "`¶ÿÿÿ\177", argv = 0x7fffffffb660,
>     argv_r_ = 0xfffffe010184fb90 "\bÜÿÿÿ\177", envv_l_ = 0xfffffe010184fb90 "\bÜÿÿÿ\177", envv = 0x7fffffffdc08,
>     envv_r_ = 0xfffffe010184fb98 ""}
> (kgdb)
>
> for tid 101243:
>
> (kgdb) f 15
> #15 0xffffffff80508ccc in sys_execve (td=0xfffff800b642b500, uap=0xfffffe010182cb80) at /usr/src/sys/kern/kern_exec.c:218
> 218             error = kern_execve(td, &args, NULL);
> (kgdb) print *uap
> $5 = {fname_l_ = 0xfffffe010182cb80 "ÀÏ\205\002\b", fname = 0x80285cfc0 <Address 0x80285cfc0 out of bounds>,
>     fname_r_ = 0xfffffe010182cb88 "`¶ÿÿÿ\177", argv_l_ = 0xfffffe010182cb88 "`¶ÿÿÿ\177", argv = 0x7fffffffb660,
>     argv_r_ = 0xfffffe010182cb90 "\bÜÿÿÿ\177", envv_l_ = 0xfffffe010182cb90 "\bÜÿÿÿ\177", envv = 0x7fffffffdc08,
>     envv_r_ = 0xfffffe010182cb98 ""}
> (kgdb)

I think that you see garbage in those structures because they contain pointers to userland data.

Hmm, I've just noticed another interesting thread:

Thread 668 (Thread 101245):
#0  sched_switch (td=0xfffff800b642aa00, newtd=0xfffff8000285f000, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
#1  0xffffffff80561ae2 in mi_switch (flags=<value optimized out>, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:455
#2  0xffffffff805ae8da in sleepq_wait (wchan=0x0, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:646
#3  0xffffffff805614b1 in _sleep (ident=<value optimized out>, lock=<value optimized out>, priority=<value optimized out>, wmesg=0xffffffff809c51bc "vmpfw", sbt=0, pr=<value optimized out>, flags=<value optimized out>) at /usr/src/sys/kern/kern_synch.c:229
#4  0xffffffff8089d1c1 in vm_page_busy_sleep (m=0xfffff800df68cd40, wmesg=<value optimized out>) at /usr/src/sys/vm/vm_page.c:753
#5  0xffffffff8089dd4d in vm_page_sleep_if_busy (m=0xfffff800df68cd40, msg=0xffffffff809c51bc "vmpfw") at /usr/src/sys/vm/vm_page.c:1086
#6  0xffffffff80886be9 in vm_fault_hold (map=<value optimized out>, vaddr=<value optimized out>, fault_type=4 '\004', fault_flags=0, m_hold=0x0) at /usr/src/sys/vm/vm_fault.c:495
#7  0xffffffff80885448 in vm_fault (map=0xfffff80011d66000, vaddr=<value optimized out>, fault_type=4 '\004', fault_flags=<value optimized out>) at /usr/src/sys/vm/vm_fault.c:273
#8  0xffffffff808d3c49 in trap_pfault (frame=0xfffffe0101836c00, usermode=1) at /usr/src/sys/amd64/amd64/trap.c:741
#9  0xffffffff808d3386 in trap (frame=0xfffffe0101836c00) at /usr/src/sys/amd64/amd64/trap.c:333
#10 0xffffffff808b7af1 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236

I strongly suspect that this is the thread that we were looking for. I think that it has the vnode lock in the shared mode while trying to fault in a page. Could you please check that by going to frame 6 and printing 'fs' and '*fs.vp'?

It'd be interesting to understand why this thread is waiting here. So, please also print '*fs.m' and '*fs.object'.

Thanks!

-- 
Andriy Gapon
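
For reference, the inspection requested above would look roughly like the following kgdb session. This is only a sketch: 'fs' is the local struct faultstate in vm_fault_hold(), and the commands assume it has not been optimized out in this kernel build:

(kgdb) tid 101245
(kgdb) fr 6
(kgdb) info local
(kgdb) print fs
(kgdb) print *fs.vp
(kgdb) print *fs.m
(kgdb) print *fs.object

Comparing *fs.vp against the vnode address 0xfffff80049c2c000 from the blocked vget() calls, and looking at the busy and paging state in *fs.m and *fs.object, should show whether this thread is the shared holder and what it is waiting for.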