From: Kostik Belousov
To: Steven Hartland
Date: Thu, 23 Nov 2006 12:12:46 +0200
Message-ID: <20061123101246.GM1841@deviant.kiev.zoral.com.ua>
In-Reply-To: <027b01c70e6c$6c879470$b3db87d4@multiplay.co.uk>
References: <447366AD.30203@rogers.com> <20060701034922.GA37822@deviant.kiev.zoral.com.ua> <027b01c70e6c$6c879470$b3db87d4@multiplay.co.uk>
Cc: freebsd-fs@freebsd.org, Mike Jakubik, freebsd-stable@freebsd.org
Subject: Re: md deadlocks on wdrain. Was: [Re: quota and snapshots in 6.1-RELEASE]

On Wed, Nov 22, 2006 at 07:28:37PM -0000, Steven Hartland wrote:
> The patch below fixed this issue for us. We had a jail which,
> when upgrading (installworld) from 5.4 to 6.1, would constantly
> hang the machine with this error.
>
> After updating md.c to 1.164 from MAIN and applying the patch
> below, I've managed to run installworld 3 times now without error.
> Previously, even with md.c updated to v1.164, this would hang
> without fail.
>
> If this is the correct fix, it would be good to see it committed,
> as the bug has the capability to knock out any box running a
> vnode-backed jail and is very unpredictable.

This is not a fix; it is only a way to make the deadlock less frequent
(I would not even call it a workaround). I have received reports of
deadlocks with this change applied, and I think that I understand their
cause. I also have an idea of how to fix it, but have not gotten around
to even starting the coding.

> ? sys/dev/md/.arch-ids
> Index: sys/dev/md/md.c
> ===================================================================
> RCS file: /usr/local/arch/ncvs/src/sys/dev/md/md.c,v
> retrieving revision 1.164
> diff -u -r1.164 md.c
> --- sys/dev/md/md.c	28 Mar 2006 21:25:11 -0000	1.164
> +++ sys/dev/md/md.c	1 Jul 2006 03:48:41 -0000
> @@ -650,6 +650,8 @@
>  	mtx_lock_spin(&sched_lock);
>  	sched_prio(curthread, PRIBIO);
>  	mtx_unlock_spin(&sched_lock);
> +	if (sc->type == MD_VNODE)
> +		curthread->td_pflags |= TDP_NORUNNINGBUF;
>
>  	for (;;) {
>  		mtx_lock(&sc->queue_mtx);