From: Garrett Wollman <wollman@csail.mit.edu>
To: freebsd-fs@freebsd.org
Subject: Still having NFS issues with at-quota ZFS datasets
Date: Thu, 10 Jan 2019 16:56:52 -0500
Message-ID: <23607.49060.41947.512101@khavrinen.csail.mit.edu>

I thought that this issue was resolved by r326070 (r326427 in
11-STABLE) but it seems not.
When NFS clients write into a ZFS dataset which is at quota, all NFS
server processing stops: existing client socket buffers fill, no new
client TCP connections are accepted, and UDP RPCs are dropped.  At the
time this is happening, numerous NFS server threads are idle and no
auto-scaling is taking place.  Here's what the nfsd kernel stacks look
like:

# procstat -kk 1022
  PID    TID COMM                TDNAME              KSTACK
 1022 101105 nfsd                nfsd: service       mi_switch+0xe6 sleepq_catch_signals+0x40c sleepq_wait_sig+0xf _cv_wait_sig+0x167 svc_run_internal+0x586 svc_thread_start+0xb fork_exit+0x83 fork_trampoline+0xe
 1022 101116 nfsd                nfsd: service       mi_switch+0xe6 sleepq_wait+0x2c _cv_wait+0x16e txg_wait_open+0xa5 dmu_tx_assign+0x48 zfs_freebsd_write+0x57a VOP_WRITE_APV+0x103 nfsvno_write+0x12b nfsrvd_write+0x4a5 nfsrvd_dorpc+0x621 nfssvc_program+0x557 svc_run_internal+0xe09 svc_thread_start+0xb fork_exit+0x83 fork_trampoline+0xe
 1022 101505 nfsd                nfsd: service       mi_switch+0xe6 sleepq_catch_signals+0x40c sleepq_wait_sig+0xf _cv_wait_sig+0x167 svc_run_internal+0x586 svc_thread_start+0xb fork_exit+0x83 fork_trampoline+0xe
 1022 101551 nfsd                nfsd: service       mi_switch+0xe6 sleepq_catch_signals+0x40c sleepq_wait_sig+0xf _cv_wait_sig+0x167 svc_run_internal+0x586 svc_thread_start+0xb fork_exit+0x83 fork_trampoline+0xe
 1022 101556 nfsd                nfsd: service       mi_switch+0xe6 sleepq_catch_signals+0x40c sleepq_wait_sig+0xf _cv_wait_sig+0x167 svc_run_internal+0x586 svc_thread_start+0xb fork_exit+0x83 fork_trampoline+0xe
 1022 101642 nfsd                nfsd: service       mi_switch+0xe6 sleepq_catch_signals+0x40c sleepq_wait_sig+0xf _cv_wait_sig+0x167 svc_run_internal+0x586 svc_thread_start+0xb fork_exit+0x83 fork_trampoline+0xe
 1022 101769 nfsd                nfsd: master        mi_switch+0xe6 sleepq_catch_signals+0x40c sleepq_timedwait_sig+0x14 _cv_timedwait_sig_sbt+0x18c svc_run_internal+0x3e1 svc_run+0x176 nfsrvd_nfsd+0x22b nfssvc_nfsd+0x1d6 sys_nfssvc+0xba amd64_syscall+0xa38 fast_syscall_common+0x101
 1022 101882 nfsd                nfsd: service       mi_switch+0xe6 sleepq_wait+0x2c _cv_wait+0x16e txg_wait_open+0xa5 dmu_tx_assign+0x48 zfs_freebsd_write+0x57a VOP_WRITE_APV+0x103 nfsvno_write+0x12b nfsrvd_write+0x4a5 nfsrvd_dorpc+0x621 nfssvc_program+0x557 svc_run_internal+0xe09 svc_thread_start+0xb fork_exit+0x83 fork_trampoline+0xe
 1022 102010 nfsd                nfsd: service       mi_switch+0xe6 sleepq_catch_signals+0x40c sleepq_wait_sig+0xf _cv_wait_sig+0x167 svc_run_internal+0x586 svc_thread_start+0xb fork_exit+0x83 fork_trampoline+0xe
 1022 102058 nfsd                nfsd: service       mi_switch+0xe6 sleepq_wait+0x2c _cv_wait+0x16e txg_wait_open+0xa5 dmu_tx_assign+0x48 zfs_freebsd_write+0x57a VOP_WRITE_APV+0x103 nfsvno_write+0x12b nfsrvd_write+0x4a5 nfsrvd_dorpc+0x621 nfssvc_program+0x557 svc_run_internal+0xe09 svc_thread_start+0xb fork_exit+0x83 fork_trampoline+0xe
 1022 102073 nfsd                nfsd: service       mi_switch+0xe6 sleepq_wait+0x2c _cv_wait+0x16e txg_wait_open+0xa5 dmu_tx_assign+0x48 zfs_freebsd_write+0x57a VOP_WRITE_APV+0x103 nfsvno_write+0x12b nfsrvd_write+0x4a5 nfsrvd_dorpc+0x621 nfssvc_program+0x557 svc_run_internal+0xe09 svc_thread_start+0xb fork_exit+0x83 fork_trampoline+0xe
 1022 102080 nfsd                nfsd: service       mi_switch+0xe6 sleepq_wait+0x2c _cv_wait+0x16e txg_wait_open+0xa5 dmu_tx_assign+0x48 zfs_freebsd_write+0x57a VOP_WRITE_APV+0x103 nfsvno_write+0x12b nfsrvd_write+0x4a5 nfsrvd_dorpc+0x621 nfssvc_program+0x557 svc_run_internal+0xe09 svc_thread_start+0xb fork_exit+0x83 fork_trampoline+0xe
 1022 102092 nfsd                nfsd: service       mi_switch+0xe6 sleepq_wait+0x2c _cv_wait+0x16e txg_wait_open+0xa5 dmu_tx_assign+0x48 zfs_freebsd_write+0x57a VOP_WRITE_APV+0x103 nfsvno_write+0x12b nfsrvd_write+0x4a5 nfsrvd_dorpc+0x621 nfssvc_program+0x557 svc_run_internal+0xe09 svc_thread_start+0xb fork_exit+0x83 fork_trampoline+0xe
 1022 102353 nfsd                nfsd: service       mi_switch+0xe6 sleepq_wait+0x2c _cv_wait+0x16e txg_wait_open+0xa5 dmu_tx_assign+0x48 zfs_freebsd_write+0x57a VOP_WRITE_APV+0x103 nfsvno_write+0x12b nfsrvd_write+0x4a5 nfsrvd_dorpc+0x621 nfssvc_program+0x557 svc_run_internal+0xe09 svc_thread_start+0xb fork_exit+0x83 fork_trampoline+0xe
 1022 102388 nfsd                nfsd: service       mi_switch+0xe6 sleepq_catch_signals+0x40c sleepq_wait_sig+0xf _cv_wait_sig+0x167 svc_run_internal+0x586 svc_thread_start+0xb fork_exit+0x83 fork_trampoline+0xe
 1022 102439 nfsd                nfsd: service       mi_switch+0xe6 sleepq_wait+0x2c _cv_wait+0x16e txg_wait_open+0xa5 dmu_tx_assign+0x48 zfs_freebsd_write+0x57a VOP_WRITE_APV+0x103 nfsvno_write+0x12b nfsrvd_write+0x4a5 nfsrvd_dorpc+0x621 nfssvc_program+0x557 svc_run_internal+0xe09 svc_thread_start+0xb fork_exit+0x83 fork_trampoline+0xe
 1022 102570 nfsd                nfsd: service       mi_switch+0xe6 sleepq_wait+0x2c _cv_wait+0x16e txg_wait_open+0xa5 dmu_tx_assign+0x48 zfs_freebsd_write+0x57a VOP_WRITE_APV+0x103 nfsvno_write+0x12b nfsrvd_write+0x4a5 nfsrvd_dorpc+0x621 nfssvc_program+0x557 svc_run_internal+0xe09 svc_thread_start+0xb fork_exit+0x83 fork_trampoline+0xe
 1022 102606 nfsd                nfsd: service       mi_switch+0xe6 sleepq_wait+0x2c _cv_wait+0x16e txg_wait_open+0xa5 dmu_tx_assign+0x48 zfs_freebsd_write+0x57a VOP_WRITE_APV+0x103 nfsvno_write+0x12b nfsrvd_write+0x4a5 nfsrvd_dorpc+0x621 nfssvc_program+0x557 svc_run_internal+0xe09 svc_thread_start+0xb fork_exit+0x83 fork_trampoline+0xe
 1022 102613 nfsd                nfsd: service       mi_switch+0xe6 sleepq_catch_signals+0x40c sleepq_wait_sig+0xf _cv_wait_sig+0x167 svc_run_internal+0x586 svc_thread_start+0xb fork_exit+0x83 fork_trampoline+0xe
 1022 102683 nfsd                nfsd: service       mi_switch+0xe6 sleepq_wait+0x2c _cv_wait+0x16e zfs_range_lock+0xbc zfs_freebsd_write+0x30d VOP_WRITE_APV+0x103 nfsvno_write+0x12b nfsrvd_write+0x4a5 nfsrvd_dorpc+0x621 nfssvc_program+0x557 svc_run_internal+0xe09 svc_thread_start+0xb fork_exit+0x83 fork_trampoline+0xe

You can see that most of the threads are sleeping in txg_wait_open()
but one is in zfs_range_lock() and the rest are just waiting to service
incoming RPCs (which are not getting dispatched).  The situation
immediately clears up when the quota on the dataset is raised
sufficiently to permit the pending writes to complete.  (Obviously this
is only possible on a pool that has sufficient free space to do so, as
this one did.)

-GAWollman
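
P.S. For anyone who wants to reproduce this, a rough sketch (the pool
and dataset names below are made up, and the dataset is assumed to
already be exported over NFS):

# zfs create -o quota=1G tank/test

Then, from an NFS client, keep writing into the mounted dataset until
the quota is reached, e.g.:

% dd if=/dev/zero of=/mnt/test/fill bs=1M

While clients are still trying to write, the nfsd write threads should
pile up in txg_wait_open(), which you can watch with "procstat -kk" on
the nfsd server process as above.  Raising the quota enough for the
pending writes to complete should clear the backlog, provided the pool
has the free space:

# zfs set quota=2G tank/test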