From owner-freebsd-hackers@freebsd.org Mon Nov 28 16:49:31 2016
Date: Mon, 28 Nov 2016 17:41:40 +0100
From: Fabian Keil
To: Konstantin Belousov
Cc: freebsd-hackers@freebsd.org
Subject: Re: FreeBSD 11 i386 disk deadlock (I think) (now with reproduction steps!)
Message-ID: <20161128174140.6635a726@fabiankeil.de>
In-Reply-To: <20161128160311.GQ54029@kib.kiev.ua>
References: <20161128041847.GA65249@charmander>
 <20161128120046.GP54029@kib.kiev.ua>
 <20161128144135.10f93205@fabiankeil.de>
 <20161128160311.GQ54029@kib.kiev.ua>
List-Id: Technical Discussions relating to FreeBSD

Konstantin Belousov wrote:

> On Mon, Nov 28, 2016 at 02:43:30PM +0100, Fabian Keil wrote:
> > David Cross wrote:
> >
> > > This is certainly new behavior, or a new manifestation.
> >
> > Recently a couple of uma consumers were changed to share uma zones
> > instead of using a dedicated zone. As a result geli competes with
> > more uma consumers and is more likely to deadlock. The bug isn't
> > new, it's just triggered more often now.
> The problem happens on layer much lower than UMA, it is whole reusable
> page pool which is depleted and cannot be re-filled without allocating
> more memory. If you think about it, the deadlock is obviously trivial:
> pagedaemon is the main source of the free pages, but if producing free
> page requires allocating one, low memory condition is equal to deadlock.
>
> It was always there, in the sense that for all versions of freebsd, if
> file/disk write path requires memory allocation, there is the trouble.
>
> For geom, some special unique measures were taken so that bio allocations
> do not cause the issue in typical situations.
>
> > geli isn't the only uma consumer that is affected:
> > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209680

It's been a couple of months since I looked into this and apparently
I misremembered.

The commits I was thinking of didn't actually modify UMA consumers
to use shared zones instead of dedicated zones but removed the
UMA_ZONE_NOFREE flag which makes issues like the one reported
in the PR above more likely.

However, this should not negatively affect UMA consumers that use
different zones and should be unrelated to the geli deadlocks.

On my systems the patch from #209759 reliably prevents the geli
deadlocks when paging, but I do not remember why the issue became
more pressing recently.

Fabian