Date: Wed, 02 Nov 2011 17:09:42 +0200 From: Alexander Motin <mav@FreeBSD.org> To: Pawel Jakub Dawidek <pjd@FreeBSD.org> Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org Subject: Re: svn commit: r227015 - head/sys/geom Message-ID: <4EB15D36.9020409@FreeBSD.org> In-Reply-To: <20111102134226.GA1656@garage.freebsd.pl> References: <201111020924.pA29OxUV009135@svn.freebsd.org> <20111102134226.GA1656@garage.freebsd.pl>
next in thread | previous in thread | raw e-mail | index | archive | help
On 11/02/11 15:42, Pawel Jakub Dawidek wrote: > On Wed, Nov 02, 2011 at 09:24:59AM +0000, Alexander Motin wrote: >> Author: mav >> Date: Wed Nov 2 09:24:59 2011 >> New Revision: 227015 >> URL: http://svn.freebsd.org/changeset/base/227015 >> >> Log: >> Add mutex and two flags to make orphan() call properly asynchronous: >> - delay consumer closing and detaching on orphan() until all I/Os complete; >> - prevent new I/Os submission after orphan() called. >> Previous implementation could destroy consumers still having active >> requests and worked only because of global workaround made on GEOM level. > > Alexander, I'm not sure I agree with your recent changes to address > this. The checks in GEOM were there to avoid the need for counting > outstanding I/O requests in every single GEOM class. Sorry, I've sent you letter last week asking for your opinion on this problem, but got no response. :( > Why do you think the checks in GEOM are not good enough? Mostly because nstart increment and request submission are not locked. There are race windows between start() call, request submission and consumer detach: start() method may get provider pointer that will not be valid in time of the request submission. According geom(4), that race should be closed by assumption that provider should not be closed until all active requests are completed. Kind of reference counting, done by top consumers, such as geom_dev or geom_vfs. That is what I am trying to fix with my changes. Also I don't very like idea of periodic polling, trying to catch moment when nstart == nend. As soon as at that time we haven't called orphan() method yet, requests may go infinitely, even though each will be aborted quickly. It looks dirty at least. Counting of outstanding I/O requests needed only for classes that for some reason can't follow that assumption. For example, geom_vfs releases provider as soon as it orphans, not waiting for close() call from file system. Another example is gmirror -- we want to drop single disconnected disk while upstream provider is still working and won't be closed. > Can we design solution that can be implemented in the framework itself, > so simple GEOM classes can stay simple? Simple classes without special needs around access() method and especially with one consumer should stay simple. All they should do on orphan() call is just forward orphan() call up, calling some g_wither_XXX() and not trying to forcefully close consumer, and let things happen naturally. The problem is that start() and bio_done() methods are not locked now, but same time they expect topology not to change under them. I am not sure how can it be simplified globally, unless we want to lock all of them with topology lock. -- Alexander Motin
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4EB15D36.9020409>