Date: Wed, 28 Aug 2002 15:35:48 +0930
From: Greg 'groggy' Lehey
To: Doug Swarin
Cc: Peter Edwards, gallatin@cs.duke.edu, hackers@FreeBSD.ORG
Subject: Re: Vinum crash
Message-ID: <20020828060548.GA16973@wantadilla.lemis.com>
References: <20020823202017.0E2C043E3B@mx1.FreeBSD.org> <20020823155817.A82817@staff.texas.net>
In-Reply-To: <20020823155817.A82817@staff.texas.net>
Organization: The FreeBSD Project

On Friday, 23 August 2002 at 15:58:17 -0500, Doug Swarin wrote:
> On Fri, Aug 23, 2002 at 09:20:02PM +0100, Peter Edwards wrote:
>> "Peter Edwards" wrote:
>> Urgh. Forget it, I was seeing references to rq that weren't there.
>>
>>> Hi,
>>>
>>> Ok, I'm up to my neck in code I've never seen and making wild
>>> guesses, but:
>>>
>>> In vinumrequest.c:launch_requests(), isn't it possible that the
>>> final BUF_STRATEGY() from line 431 completes before we get back to
>>> the top of the outer "for" loop and that complete_rqe gets called
>>> for the last buffer (we don't have splbio()), bringing the
>>> refcount of the entire request down to zero, then freeing the
>>> request? You then get to the top of the loop, and rq will have
>>> been freed, but you're looking at its contents. Ok, maybe not likely
>>> but...
>>>
>>> I suppose you could just hold one more reference to the request
>>> while doing launch_requests() and check after all the BUF_STRATEGYs
>>> are done when you decrement the active count and find it's zero,
>>> then do the "request-finished" processing as done by complete_rqe.
>>> Just a thought...
>
> I've already got a patch for this; it's in PR kern/41740, along with
> another that allows you to safely hot-revive a striped plex.

I checked in the first patch a couple of hours ago.  It seems that it
only affected ATA drives, which is why I wasn't able to reproduce it.

The second patch is less obvious.  It doesn't take into account that
RAID plexes are also striped.  I'll discuss this with you in private
mail.

Greg
--
See complete headers for address and phone numbers
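
For anyone reading this thread in the archive, the "hold one more
reference" idea quoted above can be sketched in user space roughly as
follows.  This is only an illustration under stated assumptions, not
the patch from PR kern/41740 and not the real Vinum code: struct
request here is a cut-down hypothetical, and new_request(),
release_request(), finish_request() and
strategy_completes_immediately() are made-up names.  The last one
stands in for a BUF_STRATEGY() call whose I/O completes before it
returns, which is the worst case described in the quoted text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-in for a Vinum request. */
struct request {
    int active;      /* references: one per sub-request in flight, plus the launcher's */
    int nsub;        /* number of sub-requests to launch */
    bool *finished;  /* set when the "request-finished" processing has run */
};

/* "Request-finished" processing; after this, rq must not be touched. */
static void finish_request(struct request *rq)
{
    *rq->finished = true;
    free(rq);
}

/* Drop one reference; whoever drops the last one finishes the request. */
static void release_request(struct request *rq)
{
    if (--rq->active == 0)
        finish_request(rq);
}

/*
 * Stand-in for BUF_STRATEGY(): assume the I/O completes before the call
 * returns, so the completion path (the role complete_rqe plays in the
 * thread) runs immediately and drops the sub-request's reference.
 */
static void strategy_completes_immediately(struct request *rq)
{
    release_request(rq);
}

/* Launch all sub-requests while holding one extra reference. */
static void launch_requests(struct request *rq)
{
    rq->active = 1;                        /* the launcher's own reference */

    /* Reading rq->nsub on every pass is safe only because of that reference. */
    for (int i = 0; i < rq->nsub; i++) {
        rq->active++;                      /* reference for this sub-request */
        strategy_completes_immediately(rq);
    }

    /* Drop the launcher's reference; finish here if everything is already done. */
    release_request(rq);
}

/* Allocate a request with nsub sub-requests; returns NULL on failure. */
static struct request *new_request(int nsub, bool *finished)
{
    struct request *rq = malloc(sizeof(*rq));

    if (rq == NULL)
        return NULL;
    rq->active = 0;
    rq->nsub = nsub;
    rq->finished = finished;
    return rq;
}
```

Without the initial "rq->active = 1", the first completion in this
sketch would bring the count to zero and free rq while the loop is
still reading rq->nsub, which is the use-after-free the quoted text
worries about.  With it, the final processing runs exactly once, from
whichever side drops the last reference.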