From: Dave Duchscher
Subject: ZFS system lockup
Message-Id: <1BCFA515-BF3D-4E64-B826-BA475B13E770@nostrum.com>
Date: Mon, 6 Jul 2015 16:23:16 -0500
To: fs@freebsd.org

In the process of diagnosing an I/O performance problem in our virtual
environment, we ran into FreeBSD instances used in testing locking up
and needing to be reset. Moving to real hardware and running the same
tests, we are able to reproduce the lockup.

We are testing with fio, running a few read and write tests over and
over again. Watching via top, the system locks up, and the last update
from top reports that wired memory has taken all the memory (2G in the
system; top shows 1947M wired). ARC size at the time of the latest
lockup was around 437M. I can keep the system from locking up if I
reduce the maximum ARC size to 512M, in which case wired memory floats
around 1G. With the maximum ARC size set to 768M or higher, we get
consistent lockups after running for a few hours.

What is using this wired memory?

Is there a way to keep wired memory under control with ZFS besides
shrinking the ARC?

Is there any guidance on how much wired memory will be used for various
ARC sizes?

Is 2G just too little memory to run ZFS?

We understand that the maximum ARC size will need to be tuned in some
cases, but shrinking it down to 512M seems low.

This test hardware has a single 250G disk and 2G of RAM. The OS is
FreeBSD 10.1 Release. We upgraded the system to stable and saw similar
results. Currently, the system is running 10.1 Release, since that is
what is used elsewhere.

We have seen a lockup on one of our database nodes, which has 20G of
RAM, which we thought was caused by a SAN switch on our VM system. Now
we are not so sure.

--
Dave