From owner-freebsd-stable@freebsd.org Mon Nov 14 11:46:57 2016
From: Andriy Gapon <avg@FreeBSD.org>
To: Henri Hennebert, freebsd-stable@FreeBSD.org
Cc: Konstantin Belousov
Subject: Re: Freebsd 11.0 RELEASE - ZFS deadlock
Date: Mon, 14 Nov 2016 13:45:58 +0200
On 14/11/2016 11:35, Henri Hennebert wrote:
>
> On 11/14/2016 10:07, Andriy Gapon wrote:
>> Hmm, I've just noticed another interesting thread:
>> Thread 668 (Thread 101245):
>> #0  sched_switch (td=0xfffff800b642aa00, newtd=0xfffff8000285f000,
>>     flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
>> #1  0xffffffff80561ae2 in mi_switch (flags=<optimized out>, newtd=0x0)
>>     at /usr/src/sys/kern/kern_synch.c:455
>> #2  0xffffffff805ae8da in sleepq_wait (wchan=0x0, pri=0)
>>     at /usr/src/sys/kern/subr_sleepqueue.c:646
>> #3  0xffffffff805614b1 in _sleep (ident=<optimized out>,
>>     lock=<optimized out>, priority=<optimized out>,
>>     wmesg=0xffffffff809c51bc "vmpfw", sbt=0, pr=<optimized out>,
>>     flags=<optimized out>) at /usr/src/sys/kern/kern_synch.c:229
>> #4  0xffffffff8089d1c1 in vm_page_busy_sleep (m=0xfffff800df68cd40,
>>     wmesg=<optimized out>) at /usr/src/sys/vm/vm_page.c:753
>> #5  0xffffffff8089dd4d in vm_page_sleep_if_busy (m=0xfffff800df68cd40,
>>     msg=0xffffffff809c51bc "vmpfw") at /usr/src/sys/vm/vm_page.c:1086
>> #6  0xffffffff80886be9 in vm_fault_hold (map=<optimized out>,
>>     vaddr=<optimized out>, fault_type=4 '\004', fault_flags=0,
>>     m_hold=0x0) at /usr/src/sys/vm/vm_fault.c:495
>> #7  0xffffffff80885448 in vm_fault (map=0xfffff80011d66000,
>>     vaddr=<optimized out>, fault_type=4 '\004',
>>     fault_flags=<optimized out>) at /usr/src/sys/vm/vm_fault.c:273
>> #8  0xffffffff808d3c49 in trap_pfault (frame=0xfffffe0101836c00,
>>     usermode=1) at /usr/src/sys/amd64/amd64/trap.c:741
>> #9  0xffffffff808d3386 in trap (frame=0xfffffe0101836c00)
>>     at /usr/src/sys/amd64/amd64/trap.c:333
>> #10 0xffffffff808b7af1 in calltrap ()
>>     at /usr/src/sys/amd64/amd64/exception.S:236
>
> This thread is another program from the news system:
> 668 Thread 101245 (PID=49124: innfeed) sched_switch
> (td=0xfffff800b642aa00, newtd=0xfffff8000285f000,
> flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
>
>> I strongly suspect that this is the thread that we were looking for.
>> I think that it has the vnode lock in the shared mode while trying to
>> fault in a page.
>>
>> Could you please check that by going to frame 6 and printing 'fs' and
>> '*fs.vp'?  It'd be interesting to understand why this thread is
>> waiting here.  So, please also print '*fs.m' and '*fs.object'.
>
> No luck :-(
> (kgdb) fr 6
> #6  0xffffffff80886be9 in vm_fault_hold (map=<optimized out>,
>     vaddr=<optimized out>, fault_type=4 '\004', fault_flags=0,
>     m_hold=0x0) at /usr/src/sys/vm/vm_fault.c:495
> 495             vm_page_sleep_if_busy(fs.m, "vmpfw");
> (kgdb) print fs
> Cannot access memory at address 0xffff00001fa0
> (kgdb)

Okay.  Luckily for us, it seems that 'm' is available in frame 5.  It
also happens to be the first field of 'struct faultstate'.  So, could
you please go to frame 5 and print '*m' and '*(struct faultstate *)m' ?

-- 
Andriy Gapon