From: olivier olivier
To: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Date: Mon, 3 Dec 2012 10:41:33 -0800
Subject: NFS/ZFS hangs after upgrading from 9.0-RELEASE to -STABLE

Hi all,

After upgrading from 9.0-RELEASE to 9.1-PRERELEASE #0 r243679, I'm having severe problems with NFS sharing of a ZFS volume.
nfsd appears to hang at random times (anywhere from once every couple of hours to once every two days) while accessing a ZFS volume, and the only way I have found of resolving the problem is to reboot. The server console is sometimes still responsive during the nfsd hang, and I can read and write files on the same ZFS volume while nfsd is hung.

I am pasting below the output of procstat -kk on nfsd, and details of my pool. (nfsstat on the server also hangs once the problem has started, and does not produce any output.)

The pool is v28 and was created from a bunch of volumes attached over Fibre Channel using the mpt driver. My system has a Supermicro board and 4 AMD Opteron 6274 CPUs.

I did not experience any nfsd hangs with 9.0-RELEASE (same machine, essentially the same configuration, same usage pattern).

I would greatly appreciate any help to resolve this problem!

Thank you,
Olivier

  PID    TID COMM             TDNAME           KSTACK
 1511 102751 nfsd             nfsd: master     mi_switch+0x186 sleepq_wait+0x42 __lockmgr_args+0x5ae vop_stdlock+0x39 VOP_LOCK1_APV+0x46 _vn_lock+0x47 zfs_fhtovp+0x338 nfsvno_fhtovp+0x87 nfsd_fhtovp+0x7a nfsrvd_dorpc+0x9cf nfssvc_program+0x447 svc_run_internal+0x687 svc_run+0x8f nfsrvd_nfsd+0x193 nfssvc_nfsd+0x9b sys_nfssvc+0x90 amd64_syscall+0x540 Xfast_syscall+0xf7
 1511 102752 nfsd             nfsd: service    mi_switch+0x186 sleepq_wait+0x42 __lockmgr_args+0x5ae vop_stdlock+0x39 VOP_LOCK1_APV+0x46 _vn_lock+0x47 zfs_fhtovp+0x338 nfsvno_fhtovp+0x87 nfsd_fhtovp+0x7a nfsrvd_dorpc+0x9cf nfssvc_program+0x447 svc_run_internal+0x687 svc_thread_start+0xb fork_exit+0x11f fork_trampoline+0xe
 1511 102753 nfsd             nfsd: service    mi_switch+0x186 sleepq_wait+0x42 _cv_wait+0x112 zio_wait+0x61 zil_commit+0x764 zfs_freebsd_write+0xba0 VOP_WRITE_APV+0xb2 nfsvno_write+0x14d nfsrvd_write+0x362 nfsrvd_dorpc+0x3c0 nfssvc_program+0x447 svc_run_internal+0x687 svc_thread_start+0xb fork_exit+0x11f fork_trampoline+0xe
 1511 102754 nfsd             nfsd: service    mi_switch+0x186 sleepq_wait+0x42 _cv_wait+0x112 zio_wait+0x61 zil_commit+0x3cf zfs_freebsd_fsync+0xdc nfsvno_fsync+0x2f2 nfsrvd_commit+0xe7 nfsrvd_dorpc+0x3c0 nfssvc_program+0x447 svc_run_internal+0x687 svc_thread_start+0xb fork_exit+0x11f fork_trampoline+0xe
 1511 102755 nfsd             nfsd: service    mi_switch+0x186 sleepq_wait+0x42 __lockmgr_args+0x5ae vop_stdlock+0x39 VOP_LOCK1_APV+0x46 _vn_lock+0x47 zfs_fhtovp+0x338 nfsvno_fhtovp+0x87 nfsd_fhtovp+0x7a nfsrvd_dorpc+0x9cf nfssvc_program+0x447 svc_run_internal+0x687 svc_thread_start+0xb fork_exit+0x11f fork_trampoline+0xe
 1511 102756 nfsd             nfsd: service    mi_switch+0x186 sleepq_wait+0x42 _cv_wait+0x112 zil_commit+0x6d zfs_freebsd_write+0xba0 VOP_WRITE_APV+0xb2 nfsvno_write+0x14d nfsrvd_write+0x362 nfsrvd_dorpc+0x3c0 nfssvc_program+0x447 svc_run_internal+0x687 svc_thread_start+0xb fork_exit+0x11f fork_trampoline+0xe

  PID    TID COMM             TDNAME           KSTACK
 1507 102750 nfsd             -                mi_switch+0x186 sleepq_catch_signals+0x2e1 sleepq_wait_sig+0x16 _cv_wait_sig+0x12a seltdwait+0xf6 kern_select+0x6ef sys_select+0x5d amd64_syscall+0x540 Xfast_syscall+0xf7

  pool: tank
 state: ONLINE
status: The pool is formatted using a legacy on-disk format. The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
        pool will no longer be accessible on software that does not
        support feature flags.
  scan: scrub repaired 0 in 45h37m with 0 errors on Mon Dec 3 03:07:11 2012
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            da19    ONLINE       0     0     0
            da31    ONLINE       0     0     0
            da32    ONLINE       0     0     0
            da33    ONLINE       0     0     0
            da34    ONLINE       0     0     0
          raidz1-1  ONLINE       0     0     0
            da20    ONLINE       0     0     0
            da36    ONLINE       0     0     0
            da37    ONLINE       0     0     0
            da38    ONLINE       0     0     0
            da39    ONLINE       0     0     0
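
In case it helps with triage, here is a rough sketch of how the procstat -kk stacks above could be grouped by where each nfsd thread is blocked. It is only an illustration, not something running on the server: the helper names (frames, wait_point, summarize) and the choice of which frames count as "generic sleep path" are assumptions made for this example, and the stacks are abbreviated to their leading frames.

```python
from collections import Counter

# Frames that belong to the generic sleep path rather than to the reason
# for sleeping. This set is an assumption chosen for the traces in this post.
SLEEP_FRAMES = {
    "mi_switch", "sleepq_wait", "sleepq_catch_signals",
    "sleepq_wait_sig", "_cv_wait", "_cv_wait_sig",
}


def frames(kstack):
    """Split a KSTACK string into function names, dropping '+0x...' offsets."""
    return [frame.split("+", 1)[0] for frame in kstack.split()]


def wait_point(kstack):
    """Return the first frame outside the generic sleep path, or None."""
    for name in frames(kstack):
        if name not in SLEEP_FRAMES:
            return name
    return None


def summarize(stacks):
    """Count threads per wait point; `stacks` maps TID -> KSTACK string."""
    return Counter(wait_point(kstack) for kstack in stacks.values())


# Leading frames of the six threads of PID 1511, copied from the traces above.
STACKS = {
    102751: "mi_switch+0x186 sleepq_wait+0x42 __lockmgr_args+0x5ae vop_stdlock+0x39",
    102752: "mi_switch+0x186 sleepq_wait+0x42 __lockmgr_args+0x5ae vop_stdlock+0x39",
    102753: "mi_switch+0x186 sleepq_wait+0x42 _cv_wait+0x112 zio_wait+0x61 zil_commit+0x764",
    102754: "mi_switch+0x186 sleepq_wait+0x42 _cv_wait+0x112 zio_wait+0x61 zil_commit+0x3cf",
    102755: "mi_switch+0x186 sleepq_wait+0x42 __lockmgr_args+0x5ae vop_stdlock+0x39",
    102756: "mi_switch+0x186 sleepq_wait+0x42 _cv_wait+0x112 zil_commit+0x6d",
}

if __name__ == "__main__":
    for name, count in summarize(STACKS).most_common():
        print(f"{count} thread(s) blocked in {name}")
```

Grouped this way, three threads are waiting in __lockmgr_args (reached through zfs_fhtovp), two in zio_wait and one in zil_commit, all three called from the ZFS write/fsync or file-handle lookup paths shown in the full traces.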