From owner-freebsd-fs@FreeBSD.ORG  Fri Mar 20 00:16:25 2015
From: Garrett Wollman <wollman@khavrinen.csail.mit.edu>
To: freebsd-fs@freebsd.org
Date: Thu, 19 Mar 2015 20:16:23 -0400 (EDT)
Subject: Low-vnode deadlock
Message-ID: <21771.26327.65535.250135@khavrinen.csail.mit.edu>

As I've previously posted, I've been doing some testing with the SPEC
SFS 2014 benchmark.  One of the workloads, SWBUILD, is intended to be
"metadata intensive".  While watching it in operation the other day, I
noticed that the vnlru kthread ends up taking a large amount of CPU,
indicating that the system is recycling vnodes at a very high rate.
In previous benchmark runs, I've also found that this workload tends
to deadlock the machine, although I haven't identified exactly how.
Usually this deadlock occurs around a load value ("business metric")
of 40 to 50 in the benchmark, and even when there is no deadlock, the
benchmark run is counted as a failure because the system can't
maintain the required op rate.

As a test, I increased kern.maxvnodes to 20 million.  While vnlru
still gets substantial CPU, and the system is thrashing like crazy,
it's still able to complete benchmark runs without either deadlocking
or missing the iops target, at least up to a load value of 65.  I'm
still trying to find the point at which it falls over under this
configuration.  The system's 5-minute load average peaks over 100
while the benchmark is running.  (There are 5 benchmark processes for
each unit of load, but they sleep to maintain the desired operation
rate.)

I will be interested to see how much of an effect this has when I
move from benchmarking the server itself to running benchmarks over
NFS, and the benchmark processes are no longer competing with the
rest of the system for main memory.

Ultimately these results will be published in some forum, but I
haven't figured out exactly where yet.

-GAWollman
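
P.S.  For anyone who wants to poke at the same tunable from a test
harness rather than with sysctl(8), here is a minimal sketch using the
standard sysctlbyname(3) interface.  It is only an illustration, not
part of the benchmark setup: the 20-million figure is simply the value
I used above, the width handling is a little-endian shortcut, and the
set step needs root.

	#include <sys/types.h>
	#include <sys/sysctl.h>
	#include <err.h>
	#include <stdio.h>

	int
	main(void)
	{
		unsigned long cur = 0, want = 20000000UL;
		size_t len = sizeof(cur);

		/*
		 * Read the current value; the kernel updates len to the
		 * tunable's actual width, which has differed across releases.
		 */
		if (sysctlbyname("kern.maxvnodes", &cur, &len, NULL, 0) == -1)
			err(1, "sysctlbyname(kern.maxvnodes) read");
		printf("kern.maxvnodes is currently %lu\n", cur);

		/*
		 * Write the new value back using the same width the kernel
		 * reported.  Passing the low bytes of a wider variable like
		 * this only works on little-endian machines (e.g. amd64);
		 * it keeps the sketch short.
		 */
		if (sysctlbyname("kern.maxvnodes", NULL, NULL, &want, len) == -1)
			err(1, "sysctlbyname(kern.maxvnodes) write");
		printf("kern.maxvnodes raised to %lu\n", want);

		return (0);
	}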