From: Kostik Belousov
To: Jeremy Chadwick
Cc: Tim Bishop, freebsd-stable@freebsd.org
Date: Thu, 13 Nov 2008 12:26:42 +0200
Subject: Re: System deadlock when using mksnap_ffs
Message-ID: <20081113102642.GQ47073@deviant.kiev.zoral.com.ua>
In-Reply-To: <20081113044200.GA10419@icarus.home.lan>

On Wed, Nov 12, 2008 at 08:42:00PM -0800, Jeremy Chadwick wrote:
> On Thu, Nov 13, 2008 at 12:41:02AM +0000, Tim Bishop wrote:
> > On Wed, Nov 12, 2008 at 09:47:35PM +0200, Kostik Belousov wrote:
> > > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote:
> > > > I've been playing around with snapshots lately but I've got a problem on
> > > > one of my servers running 7-STABLE amd64:
> > > >
> > > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 tdb@paladin:/usr/obj/usr/src/sys/PALADIN amd64
> > > >
> > > > I run the mksnap_ffs command to take the snapshot and some time later
> > > > the system completely freezes up:
> > > >
> > > > paladin# cd /u2/.snap/
> > > > paladin# mksnap_ffs /u2 test.1
> > > >
> > > > It only happens on this one filesystem, though, which might be to do
> > > > with its size. It's not over the 2TB marker, but it's pretty close. It's
> > > > also backed by a hardware RAID system, although a smaller filesystem on
> > > > the same RAID has no issues.
> > > >
> > > > Filesystem   1K-blocks   Used      Avail     Capacity  Mounted on
> > > > /dev/da0s1a  2078881084  921821396 990749202    48%    /u2
> > > >
> > > > To clarify "completely freezes up": unresponsive to all services over
> > > > the network, except ping. On the console I can switch between the ttys,
> > > > but none of them respond. The only way out is to hit the reset button.
> > >
> > > You need to provide information described in the
> > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html
> > > and especially
> > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
> >
> > Ok, I've done that, and removed the patch that seemed to fix things.
> >
> > The first thing I notice after doing this on the console is that I can
> > still ctrl+t the process:
> >
> > load: 0.14 cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k
> >
> > But the top and ps I left running on other ttys have all stopped
> > responding.
>
> Then in my book, the patch didn't fix anything. :-) The system is
> still "deadlocking"; snapshot generation **should not** wedge the system
> hard like this.

You systematically mix two completely different issues:
- the first one is the _deadlock_ experienced by Tim;
- the second one is the slowdown during snapshot creation.

In fact, I may count a third, where dump itself hangs as a usermode
process, but the kernel still operates normally.

The patch posted should fix or paper over the first issue for practical
purposes. The third issue is most likely fixed by the subr_sleepqueue
race fix.
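
For anyone collecting several ctrl+t samples while reproducing this, the status line quoted above can be split into fields so the bracketed state (here "newbuf") and the system time are easy to compare between samples. This is only a minimal sketch and not part of the original message: the field layout is inferred from the single line quoted in this thread, the helper name parse_status is hypothetical, and reading the last field as resident memory in kilobytes is an assumption.

```python
import re

# Assumed layout of the ctrl+t (SIGINFO) status line, inferred only from the
# one sample quoted in this thread; other systems or versions may differ.
STATUS_RE = re.compile(
    r"load: (?P<load>[\d.]+)\s+cmd: (?P<cmd>\S+) (?P<pid>\d+) "
    r"\[(?P<wchan>[^\]]+)\] (?P<user>[\d.]+)u (?P<sys>[\d.]+)s "
    r"(?P<cpu>\d+)% (?P<rss>\d+)k"
)


def parse_status(line):
    """Return the status-line fields as a dict, or None if the line does not match."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    d = m.groupdict()
    return {
        "load": float(d["load"]),
        "cmd": d["cmd"],
        "pid": int(d["pid"]),
        "wchan": d["wchan"],         # bracketed state the process is shown in
        "user_s": float(d["user"]),  # user CPU time, seconds
        "sys_s": float(d["sys"]),    # system CPU time, seconds
        "cpu_pct": int(d["cpu"]),
        "rss_kb": int(d["rss"]),     # assumption: resident size in kilobytes
    }


sample = "load: 0.14 cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k"
info = parse_status(sample)
print(info["cmd"], info["wchan"], info["sys_s"])
```

If the system time keeps growing between samples while the state stays the same, the process is still doing work in the kernel, which is the kind of detail that helps tell the slowdown apart from a true deadlock.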