Date: Wed, 28 Oct 2015 13:58:21 +0100
From: Fabian Keil
To: freebsd-current@freebsd.org
Cc: "Steven Hartland", Xin Li, "Alexander Motin"
Subject: Re: ZFS-related panic: "possible" spa->spa_errlog_lock deadlock
Message-ID: <20151028135821.0d375ec5@fabiankeil.de>
In-Reply-To: <540C8039.7010309@delphij.net>
References: <492dbacb.5942cc9b@fabiankeil.de> <540C66AC.8070809@delphij.net>
 <4fa875ba.3cc970d7@fabiankeil.de> <540C8039.7010309@delphij.net>

Xin Li wrote:

> On 9/7/14 11:23 PM, Fabian Keil wrote:
> > Xin Li wrote:
> >
> >> On 9/7/14 9:02 PM, Fabian Keil wrote:
> >>> Using a kernel built from FreeBSD 11.0-CURRENT r271182 I got
> >>> the following panic yesterday:
> >>>
> >>> [...] Unread portion of the kernel message buffer: [6880]
> >>> panic: deadlkres: possible deadlock detected for
> >>> 0xfffff80015289490, blocked for 1800503 ticks
> >>
> >> Any chance to get all backtraces (e.g. thread apply all bt full
> >> 16)? I think a different thread that held the lock have been
> >> blocked, probably related to your disconnected vdev.
> >
> > Output of "thread apply all bt full 16" is available at:
> > http://www.fabiankeil.de/tmp/freebsd/kgdb-output-spa_errlog_lock-deadlock.txt
> >
> > A lot of the backtraces prematurely end with "Cannot access memory
> > at address", therefore I also added "thread apply all bt" output.
> >
> > Apparently there are at least two additional threads blocking below
> > spa_get_stats(): [...]
>
> Yes, thread 1182 owned the lock and is waiting for the zio be done.
> Other threads that wanted the lock would have to wait.
>
> I don't have much clue why the system entered this state, however, as
> the operations should have errored out (the GELI device is gone on
> 21:44:56 based on your log, which suggests all references were closed)
> instead of waiting.

Thanks for the responses.

I finally found the time to analyse the problem which seems to
be that spa_sync() requires at least one writeable vdev to complete,
but holds the lock(s) required to remove or bring back vdevs.

Letting spa_sync() drop the lock and wait for at least one vdev
to become writeable again seems to make the problem unreproducible
for me, but probably merely shrinks the race window and thus is not
a complete solution.

For details see:
https://www.fabiankeil.de/sourcecode/electrobsd/ZFS-Optionally-let-spa_sync-wait-for-writable-vdev.diff
(Experimental, only lightly tested)

Fabian
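
Illustration of the locking pattern described in the message. This is a
userland toy in Python, not ZFS code and not the linked diff. The names
Pool, config_lock, bring_back_vdev and sync are hypothetical stand-ins
for the pool state, the lock(s) mentioned above, a vdev re-attach and
spa_sync(). It only shows the general idea of waiting for a writable
device without keeping the lock that the re-attach path needs.

```python
import threading


class Pool:
    """Toy stand-in for a storage pool (hypothetical, not ZFS)."""

    def __init__(self):
        # Lock that both the sync path and the re-attach path need.
        self.config_lock = threading.Lock()
        # Condition variable bound to the same lock.
        self.vdev_writable = threading.Condition(self.config_lock)
        self.writable_vdevs = 0
        self.synced_txgs = 0

    def bring_back_vdev(self):
        """Re-attach path: needs config_lock to change the device count."""
        with self.config_lock:
            self.writable_vdevs += 1
            self.vdev_writable.notify_all()

    def sync(self, timeout=5.0):
        """Sync path: needs at least one writable vdev to complete."""
        with self.config_lock:
            # wait_for() releases config_lock while sleeping and
            # re-acquires it before returning, so bring_back_vdev()
            # can make progress in the meantime.
            ok = self.vdev_writable.wait_for(
                lambda: self.writable_vdevs > 0, timeout=timeout
            )
            if not ok:
                return False  # gave up: still no writable vdev
            self.synced_txgs += 1
            return True


def demo():
    """Start a sync with no writable vdev, then bring one back."""
    pool = Pool()
    result = []
    syncer = threading.Thread(target=lambda: result.append(pool.sync()))
    syncer.start()
    pool.bring_back_vdev()
    syncer.join(timeout=10.0)
    return result == [True] and pool.synced_txgs == 1
```

If sync() slept while still holding config_lock, bring_back_vdev() could
never run and both threads would block forever, which is the shape of the
hang reported in this thread. The toy does not model asynchronous I/O, so
it says nothing about the remaining race window mentioned in the message.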