Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 9 Apr 2012 17:59:13 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Ian Lepore <freebsd@damnhippie.dyndns.org>
Cc:        current@freebsd.org, drivers@freebsd.org
Subject:   Re: device_attach(9) and driver initialization
Message-ID:  <20120409145913.GR2358@deviant.kiev.zoral.com.ua>
In-Reply-To: <1333982475.1082.61.camel@revolution.hippie.lan>
References:  <20120407125056.GS2358@deviant.kiev.zoral.com.ua> <1333810001.1082.36.camel@revolution.hippie.lan> <20120407145728.GV2358@deviant.kiev.zoral.com.ua> <CCBC36A4-CC13-465B-A151-A41DEDF7E55B@bsdimp.com> <20120408035838.GY2358@deviant.kiev.zoral.com.ua> <1333982475.1082.61.camel@revolution.hippie.lan>

next in thread | previous in thread | raw e-mail | index | archive | help

--aLNv5lbzIJDYDUXB
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Apr 09, 2012 at 08:41:15AM -0600, Ian Lepore wrote:
> On Sun, 2012-04-08 at 06:58 +0300, Konstantin Belousov wrote:
> > On Sat, Apr 07, 2012 at 09:10:55PM -0600, Warner Losh wrote:
> > >=20
> > > On Apr 7, 2012, at 8:57 AM, Konstantin Belousov wrote:
> > >=20
> > > > On Sat, Apr 07, 2012 at 08:46:41AM -0600, Ian Lepore wrote:
> > > >> On Sat, 2012-04-07 at 15:50 +0300, Konstantin Belousov wrote:
> > > >>> Hello,
> > > >>> there seems to be a problem with device attach sequence offered b=
y newbus.
> > > >>> Basically, when device attach method is executing, device is not =
fully
> > > >>> initialized yet. Also the device state in the newbus part of the =
world
> > > >>> is DS_ALIVE. There is definitely no shattering news in the statem=
ents,
> > > >>> but drivers that e.g. create devfs node to communicate with consu=
mers
> > > >>> are prone to a race.
> > > >>>=20
> > > >>> If /dev node is created inside device attach method, then usermode
> > > >>> can start calling cdevsw methods before device fully initialized =
itself.
> > > >>> Even more, if device tries to use newbus helpers in cdevsw method=
s,
> > > >>> like device_busy(9), then panic occurs "called for unatteched dev=
ice".
> > > >>> I get reports from users about this issues, to it is not something
> > > >>> that only could happen.
> > > >>>=20
> > > >>> I propose to add DEVICE_AFTER_ATTACH() driver method, to be called
> > > >>> from newbus right after device attach finished and newbus conside=
rs
> > > >>> the device fully initialized. Driver then could create devfs node
> > > >>> in the after_attach method instead of attach. Please see the patc=
h below.
> > > >>>=20
> > > >>> diff --git a/sys/kern/device_if.m b/sys/kern/device_if.m
> > > >>> index eb720eb..9db74e2 100644
> > > >>> --- a/sys/kern/device_if.m
> > > >>> +++ b/sys/kern/device_if.m
> > > >>> @@ -43,6 +43,10 @@ INTERFACE device;
> > > >>> # Default implementations of some methods.
> > > >>> #
> > > >>> CODE {
> > > >>> +	static void null_after_attach(device_t dev)
> > > >>> +	{
> > > >>> +	}
> > > >>> +
> > > >>> 	static int null_shutdown(device_t dev)
> > > >>> 	{
> > > >>> 	    return 0;
> > > >>> @@ -199,6 +203,21 @@ METHOD int attach {
> > > >>> };
> > > >>>=20
> > > >>> /**
> > > >>> + * @brief Notify the driver that device is in attached state
> > > >>> + *
> > > >>> + * Called after driver is successfully attached to the device and
> > > >>> + * corresponding device_t is fully operational. Driver now may e=
xpose
> > > >>> + * the device to the consumers, e.g. create devfs nodes.
> > > >>> + *
> > > >>> + * @param dev		the device to probe
> > > >>> + *
> > > >>> + * @see DEVICE_ATTACH()
> > > >>> + */
> > > >>> +METHOD void after_attach {
> > > >>> +	device_t dev;
> > > >>> +} DEFAULT null_after_attach;
> > > >>> +
> > > >>> +/**
> > > >>>  * @brief Detach a driver from a device.
> > > >>>  *
> > > >>>  * This can be called if the user is replacing the
> > > >>> diff --git a/sys/kern/subr_bus.c b/sys/kern/subr_bus.c
> > > >>> index d485b9f..6d849cb 100644
> > > >>> --- a/sys/kern/subr_bus.c
> > > >>> +++ b/sys/kern/subr_bus.c
> > > >>> @@ -2743,6 +2743,7 @@ device_attach(device_t dev)
> > > >>> 	dev->state =3D DS_ATTACHED;
> > > >>> 	dev->flags &=3D ~DF_DONENOMATCH;
> > > >>> 	devadded(dev);
> > > >>> +	DEVICE_AFTER_ATTACH(dev);
> > > >>> 	return (0);
> > > >>> }
> > > >>>=20
> > > >>=20
> > > >> Does device_get_softc() work before attach is completed?  (I don't=
 have
> > > >> time to go look in the code right now).  If so, then a mutex initi=
alized
> > > >> and acquired early in the driver's attach routine, and also acquir=
ed in
> > > >> the driver's cdev implementation routines before using any newbus
> > > >> functions other than device_get_softc(), would solve the problem w=
ithout
> > > >> a driver api change that would make it harder to backport/MFC driv=
er
> > > >> changes.
> > > > No, 'a mutex' does not solve anything. It only adds enourmous burden
> > > > on the driver developers, because you cannot sleep under mutex. Cha=
nging
> > > > the mutex to the sleepable lock also does not byy you much, since
> > > > you need to somehow solve the issues with some cdevsw call waking up
> > > > thread sleeping into another cdevsw call, just for example.
> > > >=20
> > > > Singlethreading a driver due to this race is just silly.
> > > >=20
> > > > And, what do you mean by 'making it harder to MFC' ? How ?
> > >=20
> > > driver_attach()
> > > {
> > > 	...
> > > 	softc->flags =3D 0; // redundant, since softc is initialized to 0.
> > > 	softc->cdev =3D device_create...();
> > > 	...
> > > 	softc->flags |=3D READY;
> > > }
> > >=20
> > > driver_open(...)
> > > {
> > > 	if (!(softc->flags & READY))
> > > 		return ENXIO;
> > > 	...
> > > }
> > >=20
> > > What's the big burden here?
> > The burden is that your proposal does not work. As I described above,
> > device_busy() calls from cdevsw method panic if open() is called before
> > DS_ATTACHED is set by newbus. And, DS_ATTACHED is only set after device
> > attach() method returned, so no workarounds from attach() could solve
> > this.
>=20
> One thing that keeps floating to the front of my brain is that all the
> proposals so far (including my not-well-thought-out mutex suggestion)
> requires changing every existing driver to get the new safe behavior.
>=20
> Hmmm.  Looking at the code, not very many drivers call device_busy().
> Why is that? =20
>=20
> I agree that calling device_create() should be deferred until the driver
You mean make_dev(9), I assume.

> is ready to handle requests.  That's only part of the fix if the newbus
> support routines are still going to have a window where they can panic
> because the internal state variables haven't yet transitioned to the
> correct state. =20
>=20
> Also, the implementation of device_busy() looks to be unsafe unless it's
> being implicitly protected by some locking in a call chain that isn't
> jumping out at me with simple grepping of the code.  For example,
> concurrent callers in a device's open() and close() methods for a driver
> that calls busy/unbusy from cdev open/close could leave the parent
> device's busy count in an indeterminate state.  Could fixing this by
> enforcing single threading through busy/unbusy provide an opportunity to
> fix the original problem by having the attach code acquire the same
> lock, so that an early call to a cdev method that invokes device_busy()
> ends up sleeping until the attach routine returns?  That way you don't
> single-thread the whole cdev open/close handling, just the part that's
> currently causing a problem.
I think device_busy() calls require Giant, the same as the whole newbus.
The absence of GIANT_REQUIRED assertions in device_busy() and device_unbusy=
()
looks like overlook.

--aLNv5lbzIJDYDUXB
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (FreeBSD)

iEYEARECAAYFAk+C+UEACgkQC3+MBN1Mb4jyyQCfVT7lHgGeiLMsjUFMCQIgzM9e
6lMAniYIyja/0aayMj78aYDJgAfpU6WC
=HEjq
-----END PGP SIGNATURE-----

--aLNv5lbzIJDYDUXB--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120409145913.GR2358>