Date: Wed, 21 Dec 2016 01:24:16 -0500
From: Garrett Wollman <wollman@hergotha.csail.mit.edu>
To: freebsd-fs@freebsd.org
Cc: freebsd-stable@freebsd.org, rmacklem@freebsd.org
Subject: NFS server hang with backing store on ZFS and quota nearly exhausted
Message-ID: <22618.8208.273426.702678@hergotha.csail.mit.edu>

I've opened a bug about this before, which I can't cite by number
because bugzilla appears to be down at the moment.  But I had this
problem recur tonight under otherwise idle conditions, so I was able
to get a set of kernel stacks without any confounding RPC activity
going on.  This is on 10.2; we're not scheduled to take these servers
to 10.3 until next week.

Here's the "procstat -kk" output:

  PID    TID COMM          TDNAME         KSTACK
 1055 101965 nfsd          -              mi_switch+0xe1 sleepq_catch_signals+0xab sleepq_wait_sig+0xf _cv_wait_sig+0x16a seltdwait+0xae kern_select+0x8fa sys_select+0x54 amd64_syscall+0x357 Xfast_syscall+0xfb
 1058 101012 nfsd          nfsd: service  mi_switch+0xe1 sleepq_catch_signals+0xab sleepq_wait_sig+0xf _cv_wait_sig+0x16a svc_run_internal+0x8be svc_thread_start+0xb fork_exit+0x9a fork_trampoline+0xe

[Threads with the stack trace above are simply idle and waiting for
incoming requests, and I've deleted the other 5 of them.]

 1058 101688 nfsd          nfsd: service  mi_switch+0xe1 sleepq_catch_signals+0xab sleepq_timedwait_sig+0x10 _cv_timedwait_sig_sbt+0x18b svc_run_internal+0x4bd svc_thread_start+0xb fork_exit+0x9a fork_trampoline+0xe

[Not sure what these threads are doing: obviously they are waiting for
a condvar, but at a different spot in svc_run_internal().  I've
deleted the other 7 of them.]
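As an aside for anyone reading along: the threads in the first group
appear to be parked on the pool's condition variable until a request
is queued, and my guess is that the second group is doing much the
same thing but with a timeout.  A crude userland analogue, purely for
illustration (the names below are invented; this is not the actual
sys/rpc/svc.c code):

	#include <pthread.h>
	#include <stdbool.h>
	#include <time.h>

	struct pool {
		pthread_mutex_t	lock;
		pthread_cond_t	cond;		/* the condvar in the stacks */
		bool		work_pending;
	};

	/* First group: sleep on the condvar until a request shows up
	 * (roughly the _cv_wait_sig frames). */
	static void
	idle_wait(struct pool *p)
	{
		pthread_mutex_lock(&p->lock);
		while (!p->work_pending)
			pthread_cond_wait(&p->cond, &p->lock);
		p->work_pending = false;
		pthread_mutex_unlock(&p->lock);
		/* ...would go on to dequeue and service the RPC... */
	}

	/* Second group: same idea, but wake up periodically even if no
	 * request arrives (roughly the _cv_timedwait_sig_sbt frames). */
	static void
	idle_timedwait(struct pool *p)
	{
		struct timespec ts;

		pthread_mutex_lock(&p->lock);
		while (!p->work_pending) {
			clock_gettime(CLOCK_REALTIME, &ts);
			ts.tv_sec += 5;		/* arbitrary timeout */
			if (pthread_cond_timedwait(&p->cond, &p->lock, &ts) != 0)
				break;		/* timed out; recheck or exit */
		}
		p->work_pending = false;
		pthread_mutex_unlock(&p->lock);
	}

And then there are the two interesting threads: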
 1058 101720 nfsd          nfsd: service  mi_switch+0xe1 sleepq_wait+0x3a _cv_wait+0x16d txg_wait_open+0x85 dmu_tx_wait+0x2ac dmu_tx_assign+0x48 zfs_freebsd_write+0x544 VOP_WRITE_APV+0x149 nfsvno_write+0x13e nfsrvd_write+0x496 nfsrvd_dorpc+0x6f1 nfssvc_program+0x54e svc_run_internal+0xd7b svc_thread_start+0xb fork_exit+0x9a fork_trampoline+0xe
 1058 102015 nfsd          nfsd: master   mi_switch+0xe1 sleepq_wait+0x3a _cv_wait+0x16d txg_wait_open+0x85 dmu_tx_wait+0x2ac dmu_tx_assign+0x48 zfs_freebsd_write+0x544 VOP_WRITE_APV+0x149 nfsvno_write+0x13e nfsrvd_write+0x496 nfsrvd_dorpc+0x6f1 nfssvc_program+0x54e svc_run_internal+0xd7b svc_run+0x1de nfsrvd_nfsd+0x242 nfssvc_nfsd+0x107 sys_nfssvc+0x9c amd64_syscall+0x357

These two threads are both servicing WRITE RPCs, and both are sleeping
deep inside the ZFS code.  Note that one of them is the "master" krpc
thread in this service pool; I don't know whether that accounts for
the fact that requests are not getting processed even though plenty of
idle threads exist.  (Note that zfs_write() does not appear in the
stack due to tail-call optimization.)

I don't know the ZFS code well enough to understand what running out
of quota has to do with this situation (you'd think it would just
return immediately with [EDQUOT]), but perhaps it matters that the
clients are not well-behaved and that the filesystem is often almost
at quota but not quite there yet.

-GAWollman
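P.S.  Going from memory of the ZFS sources rather than our exact 10.2
tree (so take the details with a grain of salt), the write path these
threads are stuck in is a retry loop of roughly the following shape:
when dmu_tx_assign() can't satisfy the space accounting for the write
it returns ERESTART, and the thread then sits in dmu_tx_wait(), hence
txg_wait_open(), for the currently open txg before trying again, which
looks like exactly where our two threads are parked.  This is a
paraphrase, not the literal code:

	/* Paraphrased sketch of the zfs_write() transaction loop. */
	again:
		tx = dmu_tx_create(zfsvfs->z_os);
		dmu_tx_hold_write(tx, zp->z_id, woff, nbytes);
		error = dmu_tx_assign(tx, TXG_NOWAIT);
		if (error != 0) {
			if (error == ERESTART) {
				/*
				 * The DMU couldn't commit to the space for
				 * this write right now; wait for the open
				 * txg and retry.  (The dmu_tx_wait ->
				 * txg_wait_open path in the stacks above.)
				 */
				dmu_tx_wait(tx);
				dmu_tx_abort(tx);
				goto again;
			}
			dmu_tx_abort(tx);	/* hard failure, e.g. EDQUOT */
			return (error);
		}
		/* ...copy the uio into the DMU, then dmu_tx_commit(tx)... */

If that reading is right, a dataset hovering just under its quota
might keep hitting the ERESTART-and-wait path rather than failing
cleanly with EDQUOT, which would at least be consistent with what
we're seeing.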