Date: Wed, 26 Sep 2012 13:58:20 -0600
From: "Kenneth D. Merry" <ken@FreeBSD.org>
To: Pawel Jakub Dawidek <pjd@FreeBSD.org>
Cc: svn-src-head@FreeBSD.org, jdp@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org, phk@FreeBSD.org
Subject: Re: svn commit: r240822 - head/sys/geom
Message-ID: <20120926195820.GA96844@nargothrond.kdm.org>
In-Reply-To: <20120926194541.GB1402@garage.freebsd.pl>
References: <201209221241.q8MCfnhJ067937@svn.freebsd.org> <20120925233712.GA26920@nargothrond.kdm.org> <20120926072005.GH1391@garage.freebsd.pl> <20120926172917.GA71268@nargothrond.kdm.org> <20120926185339.GA1402@garage.freebsd.pl> <20120926192117.GA89741@nargothrond.kdm.org> <20120926194541.GB1402@garage.freebsd.pl>

On Wed, Sep 26, 2012 at 21:45:41 +0200, Pawel Jakub Dawidek wrote:
> On Wed, Sep 26, 2012 at 01:21:17PM -0600, Kenneth D. Merry wrote:
> > On Wed, Sep 26, 2012 at 20:53:39 +0200, Pawel Jakub Dawidek wrote:
> > > On Wed, Sep 26, 2012 at 11:29:17AM -0600, Kenneth D. Merry wrote:
> > > > Here is what CAM needs at each step:
> > > >
> > > > 1. When a device goes away, we need a method to call from daoninvalidate()
> > > >    (or any other peripheral driver invalidate routine) with these
> > > >    properties:
> > > >    - It tells GEOM that the device has gone away, and starts the process
> > > >      of shutting down the device.  (i.e. withers/orphans the provider)
> > > >    - It is callable from an interrupt context, with the SIM (MTX_DEF) lock
> > > >      held, so it can't sleep.
> > >
> > > Neither g_wither_provider() nor g_orphan_provider() require the topology
> > > lock. They only acquire the event lock, but it is regular mutex, so this
> > > is fine. Traversing geom's providers list looks like something that does
> > > need the topology lock, but maybe traversing is not needed at all.

> > > The reason for this change was a panic in iSCSI initiator where
> > > disk_gone() was called and provider was destroyed before g_wither_geom()
> > > returned.
> >
> > Ahh.  How about using LIST_FOREACH_SAFE?  Would that address the problem at
> > hand?  Are there any other races in there?
>
> It depends. If one geom can hold more than one provider then it might be
> racy, but from what I see there is always only one provider - there has
> to be only one, because disk_destroy() destroys it and struct disk
> represents always only one disk. If that's true then I see no reason to
> have a loop in there. I'd change it to:
>
>     void
>     disk_gone(struct disk *dp)
>     {
>         struct g_geom *gp;
>         struct g_provider *pp;
>
>         gp = dp->d_geom;
>         if (gp != NULL) {
>             pp = LIST_FIRST(&gp->provider);
>             if (pp != NULL)
>                 g_wither_provider(pp, ENXIO);
>         }
>     }

I would suggest doing LIST_FOREACH_SAFE() (with a comment explaining why)
instead.  That way just in case someone adds another provider down the road
it will be handled properly.  Otherwise we need a comment or KASSERT
somewhere to explain that we depend on there only being one provider, and
things will break if there is more than one.

> > > So maybe disk_destroy() should first orphan provider, which in turn will
> > > set its error. If provider's error is set, all I/O requests will be
> > > denied by GEOM by returning provider's error, so strategy method within
> > > a driver won't be called.
> >
> > The current semantics of disk_destroy() are that the da(4) driver won't use
> > the disk structure after it is called.  We can guarantee that if it is
> > called from dacleanup(), but not if it is called from daoninvalidate().
> >
> > And if we combined the functionality of the current disk_gone() (which
> > orphans the provider) and disk_destroy() routines, we would have to call it
> > from daoninvalidate().  And that won't work, because the da(4) driver may
> > well access elements of the disk structure after daoninvalidate() is
> > called.
>
> And I assume this is not something that can be fixed/changed?

No, not really.  It would probably take quite a bit of work to go to a two
step process, and I'm not sure that it would even work in the end.

Ken
-- 
Kenneth Merry
ken@FreeBSD.ORG
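
The LIST_FOREACH_SAFE() suggestion in the message can be illustrated outside the kernel. What follows is a minimal userland sketch, not the GEOM code: struct toy_geom, struct toy_provider, toy_add(), toy_wither() and toy_gone() are hypothetical names invented for the illustration, and toy_wither() unlinks and frees its element at once, which is an assumption made to show the worst case and not a claim about what g_wither_provider() does. The sketch only shows that the _SAFE variant fetches the next pointer before the loop body runs, so the body may remove the current element. It does not address modification of the list by another thread, which would still need whatever locking the list requires.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/*
 * Some C libraries ship <sys/queue.h> without the _SAFE variants.
 * This fallback mirrors the usual BSD definition: the next pointer is
 * saved in tvar before the loop body gets a chance to free var.
 */
#ifndef LIST_FOREACH_SAFE
#define LIST_FOREACH_SAFE(var, head, field, tvar)                  \
    for ((var) = (head)->lh_first;                                 \
        (var) != NULL && ((tvar) = (var)->field.le_next, 1);       \
        (var) = (tvar))
#endif

/* Hypothetical stand-ins for a provider and the geom that owns it. */
struct toy_provider {
    int id;
    LIST_ENTRY(toy_provider) link;
};

LIST_HEAD(toy_provider_list, toy_provider);

struct toy_geom {
    struct toy_provider_list providers;
};

/* Attach a new provider to the geom (illustration only). */
void
toy_add(struct toy_geom *gp, int id)
{
    struct toy_provider *pp;

    pp = malloc(sizeof(*pp));
    assert(pp != NULL);
    pp->id = id;
    LIST_INSERT_HEAD(&gp->providers, pp, link);
}

/* Assumed worst case: the element is unlinked and freed immediately. */
void
toy_wither(struct toy_provider *pp)
{
    LIST_REMOVE(pp, link);
    free(pp);
}

/*
 * Wither every provider, however many there are.  The _SAFE iterator
 * is needed because toy_wither() destroys the element being visited.
 * Returns the number of providers handled.
 */
int
toy_gone(struct toy_geom *gp)
{
    struct toy_provider *pp, *tmp;
    int n = 0;

    LIST_FOREACH_SAFE(pp, &gp->providers, link, tmp) {
        toy_wither(pp);
        n++;
    }
    return (n);
}
```

With a list initialised by LIST_INIT() and three providers added through toy_add(), toy_gone() handles all three and leaves the list empty. The same loop written with plain LIST_FOREACH() would read the link field of an element that toy_wither() has already freed.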