From: Garrett Wollman
To: freebsd-fs@freebsd.org
Subject: Something holding z_teardown_inactive_lock way too long
Date: Tue, 2 Jan 2018 01:18:59 -0500
Message-ID: <23115.9299.190414.482629@hergotha.csail.mit.edu>

I recently upgraded all of our NFS servers to 11.1 after a year of
successful operation on 10.3 and (apparently) successful testing of a
couple of new servers. But right now I'm seeing that something on one
of my servers (maybe more) is holding z_teardown_inactive_lock for
far, far, far too long, causing alerts to be raised. This server was
first booted in 11.1 last Wednesday, and when it's sad, I see this:

load: 0.18 cmd: ls 10408 [zfsvfs->z_teardown_inactive_lock] 28.89r 0.00u 0.00s 0% 2576k
load: 0.10 cmd: ls 10408 [zfsvfs->z_teardown_inactive_lock] 67.43r 0.00u 0.00s 0% 2576k
load: 0.06 cmd: ls 10408 [zfsvfs->z_teardown_inactive_lock] 89.75r 0.00u 0.00s 0% 2576k
load: 0.13 cmd: ls 10408 [zfsvfs->z_teardown_inactive_lock] 153.10r 0.00u 0.00s 0% 2576k
load: 0.06 cmd: ls 10408 [zfsvfs->z_teardown_inactive_lock] 196.55r 0.00u 0.00s 0% 2576k
load: 0.14 cmd: ls 10408 [zfsvfs->z_teardown_inactive_lock] 243.63r 0.00u 0.00s 0% 2576k

On the console, it seems like I can still run procstat -kk:

10408 101158 ls - mi_switch+0xe5 sleepq_wait+0x3a _sx_slock_hard+0x334 zfs_freebsd_reclaim+0x3c VOP_RECLAIM_APV+0x89 vgonel+0x21c vnlru_free_locked+0x22c getnewvnode_reserve+0x77 zfs_zget+0x24 zfs_dirent_lookup+0x162 zfs_dirlook+0x77 zfs_lookup+0x44a zfs_freebsd_lookup+0x6d VOP_CACHEDLOOKUP_APV+0x83 vfs_cache_lookup+0xd6 VOP_LOOKUP_APV+0x83 lookup+0x701 namei+0x486

although it did previously hang when I was logging in.

Does this sound familiar to anyone?

-GAWollman