From owner-freebsd-stable@FreeBSD.ORG Sat Aug 21 22:56:30 2010
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 8F27610656A8
    for ; Sat, 21 Aug 2010 22:56:30 +0000 (UTC)
    (envelope-from dan@dan.emsphone.com)
Received: from email1.allantgroup.com (email1.emsphone.com [199.67.51.115])
    by mx1.freebsd.org (Postfix) with ESMTP id 3C93F8FC0C
    for ; Sat, 21 Aug 2010 22:56:29 +0000 (UTC)
Received: from dan.emsphone.com (dan.emsphone.com [199.67.51.101])
    by email1.allantgroup.com (8.14.0/8.14.0) with ESMTP id o7LMOTHQ097456
    (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO)
    for ; Sat, 21 Aug 2010 17:24:29 -0500 (CDT)
    (envelope-from dan@dan.emsphone.com)
Received: from dan.emsphone.com (smmsp@localhost [127.0.0.1])
    by dan.emsphone.com (8.14.4/8.14.4) with ESMTP id o7LMOTHo019385
    (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO)
    for ; Sat, 21 Aug 2010 17:24:29 -0500 (CDT)
    (envelope-from dan@dan.emsphone.com)
Received: (from dan@localhost)
    by dan.emsphone.com (8.14.4/8.14.3/Submit) id o7LMOTh9019384;
    Sat, 21 Aug 2010 17:24:29 -0500 (CDT)
    (envelope-from dan)
Date: Sat, 21 Aug 2010 17:24:29 -0500
From: Dan Nelson
To: Tim Bishop
Message-ID: <20100821222429.GB73221@dan.emsphone.com>
References: <20100821220435.GA6208@carrick-users.bishnet.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20100821220435.GA6208@carrick-users.bishnet.net>
X-OS: FreeBSD 8.1-PRERELEASE
User-Agent: Mutt/1.5.20 (2009-06-14)
X-Virus-Scanned: clamav-milter 0.96 at email1.allantgroup.com
X-Virus-Status: Clean
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-2.0.2
    (email1.allantgroup.com [199.67.51.78]);
    Sat, 21 Aug 2010 17:24:30 -0500 (CDT)
Cc: freebsd-stable@freebsd.org
Subject: Re: 8.1R ZFS almost locking up system
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 21 Aug 2010 22:56:30 -0000

In the last episode (Aug 21), Tim Bishop said:
> I've had a problem on a FreeBSD 8.1R system for a few weeks. It seems
> that ZFS gets in to an almost unresponsive state. Last time it did it
> (two weeks ago) I couldn't even log in, although the system was up, this
> time I could manage a reboot but couldn't stop any applications (they
> were likely hanging on I/O).

Could your pool be very close to full? Zfs will throttle itself when it's
almost out of disk space. I know it's "saved" me from filling up my
filesystems a couple times :)

> A few items from top, including zfskern:
>
>   PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
>     5 root        4  -8    -     0K    60K zio->i  0  54:38  3.47% zfskern
> 91775 70          1  44    0 53040K 31144K tx->tx  1   2:11  0.00% postgres
> 39661 tdb         1  44    0 55776K 32968K tx->tx  0   0:39  0.00% mutt
> 14828 root        1  47    0 14636K  1572K tx->tx  1   0:03  0.00% zfs
> 11188 root        1  51    0 14636K  1572K tx->tx  0   0:03  0.00% zfs
>
> At some point during this process my zfs snapshots have been failing to
> complete:
>
> root     5  0.8  0.0     0    60  ??  DL    7Aug10   54:43.83 [zfskern]
> root  8265  0.0  0.0 14636  1528  ??  D    10:00AM    0:03.12 zfs snapshot -r pool0@2010-08-21_10:00:01--1d
> root 11188  0.0  0.1 14636  1572  ??  D    11:00AM    0:02.93 zfs snapshot -r pool0@2010-08-21_11:00:01--1d
> root 14828  0.0  0.1 14636  1572  ??  D    12:00PM    0:03.04 zfs snapshot -r pool0@2010-08-21_12:00:00--1d
> root 17862  0.0  0.1 14636  1572  ??  D     1:00PM    0:01.96 zfs snapshot -r pool0@2010-08-21_13:00:01--1d
> root 20986  0.0  0.1 14636  1572  ??  D     2:00PM    0:02.07 zfs snapshot -r pool0@2010-08-21_14:00:01--1d

procstat -k on some of these processes might help to pinpoint what part of
the zfs code they're all waiting in.

-- 
Dan Nelson
dnelson@allantgroup.com
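
The two checks suggested in this reply (how full the pool is, and where the stuck zfs processes are waiting) could be run roughly as sketched below. This is only an illustration, not something from the original message: the helper name zfs_snapshot_pids is made up here, and it assumes the `ps aux` column layout shown in the quoted output, with the PID in column 2 and the command starting in column 11.

```shell
#!/bin/sh
# Hypothetical helper (not from the original mail): read `ps aux`-style
# lines on stdin and print the PID of every "zfs snapshot" process.
# Assumes the column layout quoted above: PID in field 2, command in
# field 11, first argument in field 12.
zfs_snapshot_pids() {
    awk '$11 == "zfs" && $12 == "snapshot" { print $2 }'
}

# On the affected FreeBSD system one might then run, as root
# (left commented out because it needs a live ZFS pool):
#
#   zpool list pool0                          # CAP column: how full the pool is
#   ps aux | zfs_snapshot_pids | xargs procstat -k   # kernel stacks of the stuck PIDs
```

Matching on the command columns rather than on the D state keeps the kernel thread [zfskern] out of the list, so procstat -k is only pointed at the hung snapshot commands.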