Date: Wed, 26 Sep 2012 21:45:41 +0200
From: Pawel Jakub Dawidek <pjd@FreeBSD.org>
To: "Kenneth D. Merry" <ken@FreeBSD.org>
Cc: svn-src-head@FreeBSD.org, jdp@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org, phk@FreeBSD.org
Subject: Re: svn commit: r240822 - head/sys/geom
Message-ID: <20120926194541.GB1402@garage.freebsd.pl>
In-Reply-To: <20120926192117.GA89741@nargothrond.kdm.org>
References: <201209221241.q8MCfnhJ067937@svn.freebsd.org> <20120925233712.GA26920@nargothrond.kdm.org> <20120926072005.GH1391@garage.freebsd.pl> <20120926172917.GA71268@nargothrond.kdm.org> <20120926185339.GA1402@garage.freebsd.pl> <20120926192117.GA89741@nargothrond.kdm.org>

On Wed, Sep 26, 2012 at 01:21:17PM -0600, Kenneth D. Merry wrote:
> On Wed, Sep 26, 2012 at 20:53:39 +0200, Pawel Jakub Dawidek wrote:
> > On Wed, Sep 26, 2012 at 11:29:17AM -0600, Kenneth D. Merry wrote:
> > > Here is what CAM needs at each step:
> > >
> > > 1. When a device goes away, we need a method to call from
> > > daoninvalidate() (or any other peripheral driver invalidate routine)
> > > with these properties:
> > > - It tells GEOM that the device has gone away, and starts the
> > > process of shutting down the device. (i.e. withers/orphans the
> > > provider)
> > > - It is callable from an interrupt context, with the SIM (MTX_DEF)
> > > lock held, so it can't sleep.
> >
> > Neither g_wither_provider() nor g_orphan_provider() requires the topology
> > lock. They only acquire the event lock, but it is a regular mutex, so this
> > is fine. Traversing geom's providers list looks like something that does
> > need the topology lock, but maybe traversing is not needed at all.
> > The reason for this change was a panic in iSCSI initiator where
> > disk_gone() was called and provider was destroyed before g_wither_geom()
> > returned.
>
> Ahh. How about using LIST_FOREACH_SAFE? Would that address the problem at
> hand? Are there any other races in there?
It depends. If one geom can hold more than one provider then it might be
racy, but from what I see there is always only one provider. There has
to be only one, because disk_destroy() destroys it and struct disk
always represents only one disk. If that's true then I see no reason to
have a loop in there. I'd change it to:

void
disk_gone(struct disk *dp)
{
	struct g_geom *gp;
	struct g_provider *pp;

	gp = dp->d_geom;
	if (gp != NULL) {
		pp = LIST_FIRST(&gp->provider);
		if (pp != NULL)
			g_wither_provider(pp, ENXIO);
	}
}

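
To make the difference between the two approaches concrete, here is a small user-space sketch using <sys/queue.h>. It is only an illustration, not GEOM code: struct toy_provider, toy_fill(), toy_wither_all() and toy_wither_first() are hypothetical names, and freeing the entry stands in for whatever really happens to a withered provider. The fallback macro is there because some non-BSD <sys/queue.h> headers do not ship the _SAFE variants.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Fallback for headers without the _SAFE variants (mirrors the BSD macro). */
#ifndef LIST_FOREACH_SAFE
#define	LIST_FOREACH_SAFE(var, head, field, tvar)			\
	for ((var) = LIST_FIRST((head));				\
	    (var) && ((tvar) = LIST_NEXT((var), field), 1);		\
	    (var) = (tvar))
#endif

/* Hypothetical stand-in for struct g_provider. */
struct toy_provider {
	int id;
	LIST_ENTRY(toy_provider) link;
};

LIST_HEAD(toy_provider_list, toy_provider);

/* Insert 'count' freshly allocated entries at the head of the list. */
static void
toy_fill(struct toy_provider_list *head, int count)
{
	for (int i = 0; i < count; i++) {
		struct toy_provider *pp = malloc(sizeof(*pp));

		assert(pp != NULL);
		pp->id = i;
		LIST_INSERT_HEAD(head, pp, link);
	}
}

/*
 * Loop variant: the next pointer is saved before the body runs, so
 * unlinking and freeing the current entry inside the loop is fine.
 * Returns the number of entries removed.
 */
static int
toy_wither_all(struct toy_provider_list *head)
{
	struct toy_provider *pp, *tmp;
	int n = 0;

	LIST_FOREACH_SAFE(pp, head, link, tmp) {
		LIST_REMOVE(pp, link);
		free(pp);
		n++;
	}
	return (n);
}

/*
 * Single-provider variant: look only at the first entry, no loop.
 * Returns 1 if an entry was removed, 0 if the list was empty.
 */
static int
toy_wither_first(struct toy_provider_list *head)
{
	struct toy_provider *pp = LIST_FIRST(head);

	if (pp == NULL)
		return (0);
	LIST_REMOVE(pp, link);
	free(pp);
	return (1);
}
```

Note that LIST_FOREACH_SAFE only protects against the loop body removing the current element. It does nothing about another thread modifying the list concurrently, which is why the single-entry variant is attractive if there really is only ever one provider.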
> > So maybe disk_destroy() should first orphan provider, which in turn will
> > set its error. If provider's error is set, all I/O requests will be
> > denied by GEOM by returning provider's error, so strategy method within
> > a driver won't be called.
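
The gating described in the quoted paragraph above can be modelled in user space roughly like this. It is a toy model under stated assumptions, not the real GEOM request path: struct gate_provider, gate_orphan(), gate_strategy() and gate_io_request() are hypothetical names, and the call counter exists only so the effect can be observed.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in, not the real struct g_provider. */
struct gate_provider {
	int error;		/* 0 while the provider is usable */
	int strategy_calls;	/* how many requests reached the driver */
};

/* Model of orphaning: record the error that later requests will get. */
static void
gate_orphan(struct gate_provider *pp, int error)
{
	assert(error != 0);
	pp->error = error;
}

/* Model of the driver's strategy method. */
static void
gate_strategy(struct gate_provider *pp)
{
	pp->strategy_calls++;
}

/*
 * Model of request dispatch: if the provider's error is set, return it
 * right away so the strategy method is never entered.
 */
static int
gate_io_request(struct gate_provider *pp)
{
	if (pp->error != 0)
		return (pp->error);
	gate_strategy(pp);
	return (0);
}
```

In this model, a request issued after gate_orphan(&p, ENXIO) returns ENXIO and leaves strategy_calls unchanged, which is the property the paragraph relies on. Requests that were already inside the strategy method when the error was set are not covered by this sketch.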
>
> The current semantics of disk_destroy() are that the da(4) driver won't use
> the disk structure after it is called. We can guarantee that if it is
> called from dacleanup(), but not if it is called from daoninvalidate().
>
> And if we combined the functionality of the current disk_gone() (which
> orphans the provider) and disk_destroy() routines, we would have to call it
> from daoninvalidate(). And that won't work, because the da(4) driver may
> well access elements of the disk structure after daoninvalidate() is
> called.
And I assume this is not something that can be fixed/changed?
-- 
Pawel Jakub Dawidek http://www.wheelsystems.com
FreeBSD committer http://www.FreeBSD.org
Am I Evil? Yes, I Am! http://tupytaj.pl
