From owner-freebsd-fs@FreeBSD.ORG Thu Oct 16 22:57:11 2014
Message-ID: <544047B4.8000402@caida.org>
Date: Thu, 16 Oct 2014 15:33:24 -0700
From: Daniel Andersen
To: freebsd-fs@freebsd.org
Subject: Re: Process enters unkillable state and somewhat wedges zfs
References: <53F25402.1020907@caida.org> <201408271639.09352.jhb@freebsd.org> <53FE4C9F.7030406@caida.org> <5842681.mjgMD2kESs@ralph.baldwin.cx> <5425DF0E.4040406@caida.org>
In-Reply-To: <5425DF0E.4040406@caida.org>

A further update: we almost went 3 weeks without a hang after moving /home
back off ZFS. Then someone used sudo in a nullfs <-> zfs directory and we
got the same thing. Interesting that it's almost always sudo; the only other
hangs have been when a program core dumps.

(By "nullfs <-> zfs directory" I mean we have a ZFS directory /tank/foo and
a nullfs mount /data/foo that people use to access it; a minimal sketch of
this layout appears at the end of this message.)

Dan

On 09/26/2014 02:47 PM, Daniel Andersen wrote:
> Okay, we were finally able to collect a trace on this. I took two, just in
> case: basically did a tr , continued, went back into ddb, and did another.
>
> Tracing pid 89483 tid 101168 td 0xfffff8048e301920
> cpustop_handler() at cpustop_handler+0x28/frame 0xfffffe2f395e1ce0
> ipi_nmi_handler() at ipi_nmi_handler+0x3f/frame 0xfffffe2f395e1d00
> trap() at trap+0x42/frame 0xfffffe2f395e1f20
> nmi_calltrap() at nmi_calltrap+0x8/frame 0xfffffe2f395e1f20
> --- trap 0x13, rip = 0xffffffff808aada0, rsp = 0xfffffe2f395e1fe0, rbp = 0xfffffe2ffca2bf60 ---
> __lockmgr_args() at __lockmgr_args+0x30/frame 0xfffffe2ffca2bf60
> vop_stdlock() at vop_stdlock+0x3c/frame 0xfffffe2ffca2bf80
> VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9d/frame 0xfffffe2ffca2bfb0
> _vn_lock() at _vn_lock+0x43/frame 0xfffffe2ffca2c010
> zfs_lookup() at zfs_lookup+0x3a1/frame 0xfffffe2ffca2c0a0
> zfs_freebsd_lookup() at zfs_freebsd_lookup+0x6d/frame 0xfffffe2ffca2c1e0
> VOP_CACHEDLOOKUP_APV() at VOP_CACHEDLOOKUP_APV+0x92/frame 0xfffffe2ffca2c210
> vfs_cache_lookup() at vfs_cache_lookup+0xcf/frame 0xfffffe2ffca2c260
> VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x92/frame 0xfffffe2ffca2c290
> null_lookup() at null_lookup+0x8b/frame 0xfffffe2ffca2c300
> VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x92/frame 0xfffffe2ffca2c330
> lookup() at lookup+0x58b/frame 0xfffffe2ffca2c3b0
> namei() at namei+0x504/frame 0xfffffe2ffca2c480
> vn_open_cred() at vn_open_cred+0x232/frame 0xfffffe2ffca2c5d0
> vop_stdvptocnp() at vop_stdvptocnp+0x17d/frame 0xfffffe2ffca2c910
> null_vptocnp() at null_vptocnp+0x2b/frame 0xfffffe2ffca2c970
> VOP_VPTOCNP_APV() at VOP_VPTOCNP_APV+0x98/frame 0xfffffe2ffca2c9a0
> vn_vptocnp_locked() at vn_vptocnp_locked+0x113/frame 0xfffffe2ffca2ca20
> vn_fullpath1() at vn_fullpath1+0x1b2/frame 0xfffffe2ffca2ca80
> kern___getcwd() at kern___getcwd+0x115/frame 0xfffffe2ffca2cae0
> amd64_syscall() at amd64_syscall+0x357/frame 0xfffffe2ffca2cbf0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe2ffca2cbf0
> --- syscall (326, FreeBSD ELF64, sys___getcwd), rip = 0x2000cd8cda, rsp = 0x7fffffffd358, rbp = 0x7fffffffd440 ---
>
> Tracing pid 89483 tid 101168 td 0xfffff8048e301920
> cpustop_handler() at cpustop_handler+0x28/frame 0xfffffe2f395e1ce0
> ipi_nmi_handler() at ipi_nmi_handler+0x3f/frame 0xfffffe2f395e1d00
> trap() at trap+0x42/frame 0xfffffe2f395e1f20
> nmi_calltrap() at nmi_calltrap+0x8/frame 0xfffffe2f395e1f20
> --- trap 0x13, rip = 0xffffffff80e6110e, rsp = 0xfffffe2f395e1fe0, rbp = 0xfffffe2ffca2bfb0 ---
> VOP_LOCK1_APV() at VOP_LOCK1_APV+0xae/frame 0xfffffe2ffca2bfb0
> _vn_lock() at _vn_lock+0x43/frame 0xfffffe2ffca2c010
> zfs_lookup() at zfs_lookup+0x3a1/frame 0xfffffe2ffca2c0a0
> zfs_freebsd_lookup() at zfs_freebsd_lookup+0x6d/frame 0xfffffe2ffca2c1e0
> VOP_CACHEDLOOKUP_APV() at VOP_CACHEDLOOKUP_APV+0x92/frame 0xfffffe2ffca2c210
> vfs_cache_lookup() at vfs_cache_lookup+0xcf/frame 0xfffffe2ffca2c260
> VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x92/frame 0xfffffe2ffca2c290
> null_lookup() at null_lookup+0x8b/frame 0xfffffe2ffca2c300
> VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x92/frame 0xfffffe2ffca2c330
> lookup() at lookup+0x58b/frame 0xfffffe2ffca2c3b0
> namei() at namei+0x504/frame 0xfffffe2ffca2c480
> vn_open_cred() at vn_open_cred+0x232/frame 0xfffffe2ffca2c5d0
> vop_stdvptocnp() at vop_stdvptocnp+0x17d/frame 0xfffffe2ffca2c910
> null_vptocnp() at null_vptocnp+0x2b/frame 0xfffffe2ffca2c970
> VOP_VPTOCNP_APV() at VOP_VPTOCNP_APV+0x98/frame 0xfffffe2ffca2c9a0
> vn_vptocnp_locked() at vn_vptocnp_locked+0x113/frame 0xfffffe2ffca2ca20
> vn_fullpath1() at vn_fullpath1+0x1b2/frame 0xfffffe2ffca2ca80
> kern___getcwd() at kern___getcwd+0x115/frame 0xfffffe2ffca2cae0
> amd64_syscall() at amd64_syscall+0x357/frame 0xfffffe2ffca2cbf0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe2ffca2cbf0
> --- syscall (326, FreeBSD ELF64, sys___getcwd), rip = 0x2000cd8cda, rsp = 0x7fffffffd358, rbp = 0x7fffffffd440 ---
>
> On 08/29/2014 11:24 AM, John Baldwin wrote:
>> On Wednesday, August 27, 2014 02:24:47 PM Daniel Andersen wrote:
>>> On 08/27/2014 01:39 PM, John Baldwin wrote:
>>>> These are all blocked in "zfs" then. (For future reference, the 'mwchan'
>>>> field that you see as 'STATE' in top or via 'ps O mwchan' is more
>>>> detailed than the 'D' state.)
>>>>
>>>> To diagnose this further, you would need to see which thread holds the
>>>> ZFS vnode lock these threads need. I have some gdb scripts you can use
>>>> to do that at www.freebsd.org/~jhb/gdb/. You would want to download the
>>>> 'gdb6*' files from there and then do this as root:
>>>>
>>>> # cd /path/to/gdb/files
>>>> # kgdb
>>>> (kgdb) source gdb6
>>>> (kgdb) sleepchain 42335
>>>>
>>>> where '42335' is the pid of some process stuck in "zfs".
>>>
>>> I will keep this in mind the next time the machine wedges. Another data
>>> point: the second procstat output I sent was the most recent, so all the
>>> processes listed there were from after the fact. The process that
>>> started the entire problem (this time) was sudo, and it only has this
>>> one entry in procstat:
>>>
>>> 38003 102797 sudo -
>>>
>>> Of note, this does not appear to be blocked on zfs in any way. 'ps'
>>> showed it in the 'R' state instead of 'D' (I will be sure to use mwchan
>>> in the future). It appeared to be pegging an entire CPU core at 100%
>>> usage as well, and was only single-threaded.
>>
>> Well, if it is spinning in some sort of loop in the kernel while holding
>> a ZFS vnode lock, that could be blocking all the other threads. In that
>> case, you don't need to do what I asked for above. Instead, we need to
>> find out what that thread is doing. There are two ways of doing this.
>> One is to force a panic via 'sysctl debug.kdb.panic=1' and then use kgdb
>> on the crashdump to determine what the running thread is doing. Another
>> option is to break into the DDB debugger on the console (note that you
>> will need to build a custom kernel with DDB if you are on stable) and
>> request a stack trace of the running process via 'tr '. Ideally you can
>> do this over a serial console so you can just cut and paste the output of
>> the trace into a mail. Over a video console you can either transcribe it
>> by hand or take photos.
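
For next time, a minimal sketch of the crashdump route John describes above
(the vmcore path and the '42' thread number are illustrative assumptions;
the sysctl, savecore(8) behavior, and the kgdb commands themselves are
standard):

  # sysctl debug.kdb.panic=1
  (the box panics; after reboot, savecore(8) normally leaves the dump in /var/crash)
  # kgdb /boot/kernel/kernel /var/crash/vmcore.0
  (kgdb) info threads
  (kgdb) thread 42
  (kgdb) bt

where 'info threads' is used to spot the thread belonging to the stuck pid,
and 'bt' then shows what that thread was doing. The DDB route is what
produced the traces above, i.e. from the console debugger:

  db> tr 89483

with 89483 being the pid of the stuck process.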
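
And for concreteness about the layout where this bites us: the nullfs <-> zfs
setup mentioned at the top is just a nullfs mount over a ZFS dataset, roughly
like this (the zfs create step is an assumption about how /tank/foo was
originally made; the paths are the real ones):

  # zfs create tank/foo
  # mount -t nullfs /tank/foo /data/foo

or made persistent with an /etc/fstab entry:

  /tank/foo  /data/foo  nullfs  rw  0  0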