From owner-svn-src-head@FreeBSD.ORG Thu Nov 3 08:34:42 2011 Return-Path: Delivered-To: svn-src-head@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4A8F1106566C; Thu, 3 Nov 2011 08:34:42 +0000 (UTC) (envelope-from mavbsd@gmail.com) Received: from mail-ey0-f182.google.com (mail-ey0-f182.google.com [209.85.215.182]) by mx1.freebsd.org (Postfix) with ESMTP id 0D8538FC17; Thu, 3 Nov 2011 08:34:40 +0000 (UTC) Received: by eyd10 with SMTP id 10so1241233eyd.13 for ; Thu, 03 Nov 2011 01:34:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=sender:message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:x-enigmail-version:content-type :content-transfer-encoding; bh=1oq0WsaZFI3pG1BdtJ7TeIk9E+5Xkx1x+cTrwBbEXgI=; b=rCD24DK9fQq+VFhYrJduZZCo0TiXNYXAfkz7OUVswNekfCNmPwb8Gmofs/qW2ykR0D jVn7LDZSmIGKVzFQ5PYQD7+mccSDtgi/XdvelB8djPn3brRhtnAUzii2BKZ6X7josfvX YOFDh/HU3JRLct7N265bvMpeWnljU2PiU1yWk= Received: by 10.14.3.232 with SMTP id 80mr724546eeh.117.1320309279874; Thu, 03 Nov 2011 01:34:39 -0700 (PDT) Received: from mavbook.mavhome.dp.ua (pc.mavhome.dp.ua. [212.86.226.226]) by mx.google.com with ESMTPS id q28sm13632627eea.6.2011.11.03.01.34.38 (version=SSLv3 cipher=OTHER); Thu, 03 Nov 2011 01:34:38 -0700 (PDT) Sender: Alexander Motin Message-ID: <4EB2521D.2090704@FreeBSD.org> Date: Thu, 03 Nov 2011 10:34:37 +0200 From: Alexander Motin User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:6.0.2) Gecko/20110910 Thunderbird/6.0.2 MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <201111020924.pA29OxUV009135@svn.freebsd.org> <20111102134226.GA1656@garage.freebsd.pl> <4EB15D36.9020409@FreeBSD.org> <20111103075132.GA1682@garage.freebsd.pl> In-Reply-To: <20111103075132.GA1682@garage.freebsd.pl> X-Enigmail-Version: undefined Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org Subject: Re: svn commit: r227015 - head/sys/geom X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 03 Nov 2011 08:34:42 -0000 On 03.11.2011 09:51, Pawel Jakub Dawidek wrote: > On Wed, Nov 02, 2011 at 05:09:42PM +0200, Alexander Motin wrote: >> On 11/02/11 15:42, Pawel Jakub Dawidek wrote: >>> On Wed, Nov 02, 2011 at 09:24:59AM +0000, Alexander Motin wrote: >>>> Author: mav >>>> Date: Wed Nov 2 09:24:59 2011 >>>> New Revision: 227015 >>>> URL: http://svn.freebsd.org/changeset/base/227015 >>>> >>>> Log: >>>> Add mutex and two flags to make orphan() call properly asynchronous: >>>> - delay consumer closing and detaching on orphan() until all I/Os complete; >>>> - prevent new I/Os submission after orphan() called. >>>> Previous implementation could destroy consumers still having active >>>> requests and worked only because of global workaround made on GEOM level. >>> >>> Alexander, I'm not sure I agree with your recent changes to address >>> this. The checks in GEOM were there to avoid the need for counting >>> outstanding I/O requests in every single GEOM class. >> >> Sorry, I've sent you letter last week asking for your opinion on this >> problem, but got no response. :( > > I did not receive it, sorry. > >>> Why do you think the checks in GEOM are not good enough? >> >> Mostly because nstart increment and request submission are not locked. >> There are race windows between start() call, request submission and >> consumer detach: start() method may get provider pointer that will not >> be valid in time of the request submission. >> >> According geom(4), that race should be closed by assumption that >> provider should not be closed until all active requests are completed. >> Kind of reference counting, done by top consumers, such as geom_dev or >> geom_vfs. That is what I am trying to fix with my changes. >> >> Also I don't very like idea of periodic polling, trying to catch moment >> when nstart == nend. As soon as at that time we haven't called orphan() >> method yet, requests may go infinitely, even though each will be aborted >> quickly. It looks dirty at least. >> >> Counting of outstanding I/O requests needed only for classes that for >> some reason can't follow that assumption. For example, geom_vfs releases >> provider as soon as it orphans, not waiting for close() call from file >> system. Another example is gmirror -- we want to drop single >> disconnected disk while upstream provider is still working and won't be >> closed. > > Ok, I can see your point now. You are right. The check that nstart is > equal to nend I added was not to fix races within one class. IIRC I was > seeing panics there with simple classes that only forward orphan events, > so even if provider's error was set and no new I/O requests were coming, > we could destroy orphaned provider even if there were still in-flight > requests. I was a bit confused, because you were refering to the check I > added and I don't think it is really related. Your check does good job to the original problem, reducing the race window from huge start() -- bio_done() to two small start() -- nstart++ and nend++ -- bio_done(). It protects geoms below the broken one (panics I see without check usually happen on random level below the real source), but it does not protect the broken geom itself. Proper orphan() methods protect both. -- Alexander Motin