Date: Thu, 19 Mar 2015 20:16:23 -0400
From: Garrett Wollman <wollman@csail.mit.edu>
To: freebsd-fs@freebsd.org
Subject: Low-vnode deadlock
Message-ID: <21771.26327.65535.250135@khavrinen.csail.mit.edu>
As I've previously posted, I've been doing some testing with the SPEC SFS 2014 benchmark. One of the workloads, SWBUILD, is intended to be "metadata intensive". While watching it in operation the other day, I noticed that the vnlru kthread ends up taking a large amount of CPU, indicating that the system is recycling vnodes at a very high rate.

In previous benchmark runs, I've also found that this workload tends to deadlock the machine, although I haven't identified exactly how. Usually the deadlock occurs around a load value ("business metric") of 40 to 50 in the benchmark, and even when there is no deadlock, the benchmark run is counted as a failure because the system can't maintain the required op rate.

As a test, I increased kern.maxvnodes to 20 million. While vnlru still gets substantial CPU, and the system is thrashing like crazy, it's now able to complete benchmark runs without either deadlocking or missing the ops target, at least up to a load value of 65. I'm still trying to find the point at which it falls over under this configuration.

The system's 5-minute load average peaks over 100 while the benchmark is running. (There are 5 benchmark processes for each unit of load, but they sleep to maintain the desired operation rate.)

I will be interested to see how much of an effect this has when I move from benchmarking the server itself to running benchmarks over NFS, and the benchmark processes are no longer competing with the rest of the system for main memory. Ultimately these results will be published in some forum, but I haven't figured out exactly where yet.

-GAWollman
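[For anyone wanting to try the same tuning, a minimal sketch of the relevant sysctl knobs on a stock FreeBSD system; the 20 million figure is the value quoted above, and the right cap for another machine depends on its RAM and workload:]

```shell
# Current cap on the vnode table, and how many vnodes are in use:
sysctl kern.maxvnodes vfs.numvnodes

# Raise the cap to 20 million for this boot (as in the test above);
# each vnode costs kernel memory, so size this to available RAM:
sysctl kern.maxvnodes=20000000

# To make the setting persist across reboots, add to /etc/sysctl.conf:
#   kern.maxvnodes=20000000
```

[When vfs.numvnodes sits pinned at kern.maxvnodes, vnlru is having to recycle on every allocation, which matches the CPU burn described above.]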