From: Andriy Gapon
Date: Fri, 08 Oct 2010 19:24:08 +0300
To: Kai Gallasch
Cc: freebsd-fs@freebsd.org
Subject: Re: Locked up processes after upgrade to ZFS v15
Message-ID: <4CAF45A8.3020401@icyb.net.ua>
References: <39F05641-4E46-4BE0-81CA-4DEB175A5FBE@free.de> <201010061732.o96HW2Vi005945@higson.cam.lispworks.com>

on 06/10/2010 21:51 Kai Gallasch said the following:
>
> On 06.10.2010 at 19:32, Martin Simmons wrote:
>
>>>>>>> On Wed, 6 Oct 2010 14:28:31 +0200, Kai Gallasch said:
>>>
>>> How can I debug this and get further information?
>>
>> procstat -k -k $pid will generate a backtrace (or replace $pid by -a for all
>> processes).
>
> procstat for process 12111 (state: zfs)
> sonnenkraft:~ # procstat -k -k 12111
>   PID    TID COMM             TDNAME           KSTACK
> 12111 102385 httpd            -                mi_switch+0x21b sleepq_switch+0x123 sleepq_wait+0x4d __lockmgr_args+0x7ae vop_stdlock+0x39 VOP_LOCK1_APV+0x9b _vn_lock+0x57 vget+0x7b cache_lookup+0x4e0 vfs_cache_lookup+0xc0 VOP_LOOKUP_APV+0xb7 lookup+0x3d3 namei+0x457 vn_open_cred+0x1e3 kern_openat+0x181 syscall+0x102 Xfast_syscall+0xe2
>
> procstat for process 24731 (state: zfsmrb)
> # procstat -k -k 24731
>   PID    TID COMM             TDNAME           KSTACK
> 24731 102273 httpd            -                mi_switch+0x21b sleepq_switch+0x123 sleepq_wait+0x4d _sleep+0x369 zfs_freebsd_read+0x2a6 VOP_READ_APV+0xaf vnode_pager_generic_getpages+0x3ea VOP_GETPAGES_APV+0xb5 vnode_pager_getpages+0x8c vm_fault+0x685 trap_pfault+0x128 trap+0x52c calltrap+0x8
>
> In my original post I wrote that only apache httpd processes would lock up..
> This is wrong. Several other non-httpd processes also got stuck in state zfs or zfsmrb.

Interesting.
It's possible that TID 102385 might be waiting on a vnode lock held by TID 102273.
But TID 102273 seems to be waiting on a vnode's page lock.
It would be very interesting to learn what process has that page busy, for how
long and why.
Perhaps there is a code path that busies a page, but never un-busies it...

-- 
Andriy Gapon
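
When many processes are stuck, it may help to sort the `procstat -k -k -a` output by where each thread is sleeping. The following is only a rough sketch of that idea: it groups threads by the first kernel frame above the generic sleep machinery. The function names, the SLEEP_FRAMES list, and the assumption that COMM and TDNAME contain no spaces are illustrative choices, not anything procstat itself provides.

```python
import re
from collections import defaultdict

# Frames that belong to the generic sleep path rather than to the code that
# decided to sleep. This list is a heuristic chosen for illustration.
SLEEP_FRAMES = {
    "mi_switch", "sleepq_switch", "sleepq_wait", "sleepq_wait_sig",
    "sleepq_timedwait", "sleepq_timedwait_sig", "sleepq_catch_signals",
    "_sleep", "_cv_wait",
}

# A KSTACK entry looks like "zfs_freebsd_read+0x2a6".
FRAME_RE = re.compile(r"^(\S+?)\+0x[0-9a-f]+$")


def parse_procstat_kk(text):
    """Return a list of (pid, tid, comm, frames) parsed from procstat -k -k output."""
    threads = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header row and anything not starting with numeric PID and TID.
        if len(fields) < 5 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        pid, tid, comm = int(fields[0]), int(fields[1]), fields[2]
        # fields[3] is TDNAME; everything after it is treated as the stack.
        frames = [m.group(1) for m in map(FRAME_RE.match, fields[4:]) if m]
        threads.append((pid, tid, comm, frames))
    return threads


def wait_site(frames):
    """Return the first frame that is not part of the sleep machinery, or None."""
    for name in frames:
        if name not in SLEEP_FRAMES:
            return name
    return None


def group_by_wait_site(text):
    """Map each wait site to the list of (pid, tid, comm) sleeping there."""
    groups = defaultdict(list)
    for pid, tid, comm, frames in parse_procstat_kk(text):
        groups[wait_site(frames)].append((pid, tid, comm))
    return dict(groups)
```

For the two stacks quoted above, this would put TID 102385 under __lockmgr_args and TID 102273 under zfs_freebsd_read. Note that it only shows where threads are waiting; it does not reveal which thread holds the vnode lock or has the page busy.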