Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 11 Apr 2012 09:31:37 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        John Baldwin <jhb@freebsd.org>
Cc:        Ian Lepore <freebsd@damnhippie.dyndns.org>, freebsd-drivers@freebsd.org, current@freebsd.org
Subject:   Re: device_attach(9) and driver initialization
Message-ID:  <20120411063136.GI2358@deviant.kiev.zoral.com.ua>
In-Reply-To: <201204101447.37988.jhb@freebsd.org>
References:  <20120407125056.GS2358@deviant.kiev.zoral.com.ua> <201204100956.06750.jhb@freebsd.org> <20120410143835.GZ2358@deviant.kiev.zoral.com.ua> <201204101447.37988.jhb@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help

--4NXZt1nqaY8isg8F
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Apr 10, 2012 at 02:47:37PM -0400, John Baldwin wrote:
> On Tuesday, April 10, 2012 10:38:35 am Konstantin Belousov wrote:
> > On Tue, Apr 10, 2012 at 09:56:06AM -0400, John Baldwin wrote:
> > > On Monday, April 09, 2012 4:05:29 pm Konstantin Belousov wrote:
> > > > On Mon, Apr 09, 2012 at 03:36:08PM -0400, John Baldwin wrote:
> > > > > On Monday, April 09, 2012 3:10:00 pm Konstantin Belousov wrote:
> > > > > > On Mon, Apr 09, 2012 at 11:01:03AM -0400, John Baldwin wrote:
> > > > > > > On Saturday, April 07, 2012 11:08:41 pm Warner Losh wrote:
> > > > > > > >=20
> > > > > > > > On Apr 7, 2012, at 8:46 AM, Ian Lepore wrote:
> > > > > > > >=20
> > > > > > > > > On Sat, 2012-04-07 at 15:50 +0300, Konstantin Belousov wr=
ote:
> > > > > > > > >> Hello,
> > > > > > > > >> there seems to be a problem with device attach sequence =
offered by=20
> > > > > > > newbus.
> > > > > > > > >> Basically, when device attach method is executing, devic=
e is not fully
> > > > > > > > >> initialized yet. Also the device state in the newbus par=
t of the world
> > > > > > > > >> is DS_ALIVE. There is definitely no shattering news in t=
he statements,
> > > > > > > > >> but drivers that e.g. create devfs node to communicate w=
ith consumers
> > > > > > > > >> are prone to a race.
> > > > > > > > >>=20
> > > > > > > > >> If /dev node is created inside device attach method, the=
n usermode
> > > > > > > > >> can start calling cdevsw methods before device fully ini=
tialized itself.
> > > > > > > > >> Even more, if device tries to use newbus helpers in cdev=
sw methods,
> > > > > > > > >> like device_busy(9), then panic occurs "called for unatt=
eched device".
> > > > > > > > >> I get reports from users about this issues, to it is not=
 something
> > > > > > > > >> that only could happen.
> > > > > > > > >>=20
> > > > > > > > >> I propose to add DEVICE_AFTER_ATTACH() driver method, to=
 be called
> > > > > > > > >> from newbus right after device attach finished and newbu=
s considers
> > > > > > > > >> the device fully initialized. Driver then could create d=
evfs node
> > > > > > > > >> in the after_attach method instead of attach. Please see=
 the patch below.
> > > > > > > > >>=20
> > > > > > > > >> diff --git a/sys/kern/device_if.m b/sys/kern/device_if.m
> > > > > > > > >> index eb720eb..9db74e2 100644
> > > > > > > > >> --- a/sys/kern/device_if.m
> > > > > > > > >> +++ b/sys/kern/device_if.m
> > > > > > > > >> @@ -43,6 +43,10 @@ INTERFACE device;
> > > > > > > > >> # Default implementations of some methods.
> > > > > > > > >> #
> > > > > > > > >> CODE {
> > > > > > > > >> +	static void null_after_attach(device_t dev)
> > > > > > > > >> +	{
> > > > > > > > >> +	}
> > > > > > > > >> +
> > > > > > > > >> 	static int null_shutdown(device_t dev)
> > > > > > > > >> 	{
> > > > > > > > >> 	    return 0;
> > > > > > > > >> @@ -199,6 +203,21 @@ METHOD int attach {
> > > > > > > > >> };
> > > > > > > > >>=20
> > > > > > > > >> /**
> > > > > > > > >> + * @brief Notify the driver that device is in attached =
state
> > > > > > > > >> + *
> > > > > > > > >> + * Called after driver is successfully attached to the =
device and
> > > > > > > > >> + * corresponding device_t is fully operational. Driver =
now may expose
> > > > > > > > >> + * the device to the consumers, e.g. create devfs nodes.
> > > > > > > > >> + *
> > > > > > > > >> + * @param dev		the device to probe
> > > > > > > > >> + *
> > > > > > > > >> + * @see DEVICE_ATTACH()
> > > > > > > > >> + */
> > > > > > > > >> +METHOD void after_attach {
> > > > > > > > >> +	device_t dev;
> > > > > > > > >> +} DEFAULT null_after_attach;
> > > > > > > > >> +
> > > > > > > > >> +/**
> > > > > > > > >>  * @brief Detach a driver from a device.
> > > > > > > > >>  *
> > > > > > > > >>  * This can be called if the user is replacing the
> > > > > > > > >> diff --git a/sys/kern/subr_bus.c b/sys/kern/subr_bus.c
> > > > > > > > >> index d485b9f..6d849cb 100644
> > > > > > > > >> --- a/sys/kern/subr_bus.c
> > > > > > > > >> +++ b/sys/kern/subr_bus.c
> > > > > > > > >> @@ -2743,6 +2743,7 @@ device_attach(device_t dev)
> > > > > > > > >> 	dev->state =3D DS_ATTACHED;
> > > > > > > > >> 	dev->flags &=3D ~DF_DONENOMATCH;
> > > > > > > > >> 	devadded(dev);
> > > > > > > > >> +	DEVICE_AFTER_ATTACH(dev);
> > > > > > > > >> 	return (0);
> > > > > > > > >> }
> > > > > > > > >>=20
> > > > > > > > >=20
> > > > > > > > > Does device_get_softc() work before attach is completed? =
 (I don't have
> > > > > > > > > time to go look in the code right now).  If so, then a mu=
tex initialized
> > > > > > > > > and acquired early in the driver's attach routine, and al=
so acquired in
> > > > > > > > > the driver's cdev implementation routines before using an=
y newbus
> > > > > > > > > functions other than device_get_softc(), would solve the =
problem without
> > > > > > > > > a driver api change that would make it harder to backport=
/MFC driver
> > > > > > > > > changes.
> > > > > > > >=20
> > > > > > > > Also, more generally, don't create the dev nodes before you=
 are ready to=20
> > > > > > > deal with requests.  Why do we need to uglify everything here=
?  If you can't=20
> > > > > > > do that, you can check a bit in the softc and return EBUSY or=
 ENXIO on open if=20
> > > > > > > that bit says that your driver isn't ready to accept requests.
> > > > > > >=20
> > > > > > > I agree, this dosen't actually fix anything as the decision f=
or what to put
> > > > > > > in your foo_attach() method rather than foo_after_attach() is=
 non-obvious and=20
> > > > > > > very arbitrary.
> > > > > > >=20
> > > > > > > The actual bug appears to only be with using 'device_busy()'.=
 I think
> > > > > > > this should be fixed by making device_busy() better, not by a=
dding
> > > > > > > this type of obfuscation to attach. It should be trivial to m=
ake
> > > > > > > device_busy() safe to use on a device that is currently being=
 attached
> > > > > > > which will not require any changes to drivers.
> > > > > >=20
> > > > > > Could you, please, elaborate your proposal ? How do you think d=
evice_busy()
> > > > > > can be enchanced ?
> > > > > >=20
> > > > > > Obvious idea to sleep inside device_busy() until dev->state bec=
omes !=3D
> > > > > > DS_ATTACHED is no go, IMO. The issue is that this causes immedi=
ate deadlocks
> > > > > > if device_attach() method needs to call destroy_dev() to rollba=
ck.
> > > > >=20
> > > > > I think you could have a DS_ATTACHING state and allow device_busy=
() to work
> > > > > for DS_ATTACHING.  The idea being that it is a bug for a driver t=
o invoke
> > > > > device_busy() if it is going to fail attach.  You may then need t=
o do a fixup
> > > > > in device_attach() to promote the state from DS_ATTACHED to DS_BU=
SY when it
> > > > > returns if there is a non-zero busy count.
> > > > This is quite good idea, but it still adds burden to device author,
> > > > although I agree that this is manageable. A scenario I have in mind=
 now
> > > > is the following:
> > > > assume that driver needs to create two devfs nodes, lets name them
> > > > dri/card0 and dri/forcewake0. Driver would perform two make_dev_p(9)
> > > > calls, and while creation of dri/card0 succeed, consequent creation
> > > > of dri/forcewake0 could fail for numerous reasons.
> > > >=20
> > > > Now, the driver needs to ensure that cdesvw->d_open() on dri/card0
> > > > would return ENXIO until dri/forcewake0 is created. This can be imp=
lemented
> > > > with flag, indeed. But still somewhat muddy, and probably leads to
> > > > user-visible errors (I mostly worry about graphical login managers).
> > >=20
> > > You could also sleep on the flag in d_open() (you can imagine a two-s=
tep
> > > process where you set a "adding cdev's flag", then all d_open() calls=
 block
> > > on it).  Then when finished adding cdev's, you set a flag if an error
> > > occurred and wake up all the waiters.  If no error occured the waiter=
s can
> > > have the first open() work fine.  But even with other proposals you s=
till
> > > have to deal with this problem if you want to fail out entirely if you
> > > have problems creating cdevs.
> > >=20
> > > > But for single-node drivers it is indeed a nice solution.
> > >=20
> > > I think we are somewhat stuck with this for other reasons as well.  N=
ote
> > > that device_busy() propagates up the tree to parent devices, so imagi=
ne
> > > kldloading a driver that creates a tree (e.g. a bus with a few consum=
ers)
> > > where the leaf devices are attached by a call to bus_generic_attach()=
 from
> > > the device_attach() method for the parent device.  Even if you add a
> > > DEVICE_AFTER_ATTACH() hook, while it may allow device_busy() to be in=
voked
> > > on the leaf device, when it tries to propagate device_busy() up to the
> > > parent device it would still be in the middle of attach and blow up a=
nyway.
> >=20
> > This looks like a bug in my implementation of after_attach, which
> > apparently should be called for new tree after the top level attach
> > finished.
>=20
> Hmm, I think this is a bit less trivial as it involves an entire pass over
> the tree (and you'd probably not want to invoke it more than once on a
> device, so you'd have to track that, etc.). =20
Yes, I understand that the fix for the problem with after_attach() somewhat
involved. I either could mark each device in the tree as being notified,
and traverse the tree after top-level attach, or keep a deferred list of
devices to notify. I also would need to distinguish top-level device_attach=
()
against nested attaches.

>=20
> > Anyway, would you commit your change ? I definitely can work out the
> > driver change after. But this seems to be a large amount of work for
> > driver authors.
>=20
> Oh, sure.  Have you tried it out?  I have not tested it yet, so I will ne=
ed to
> do that first unless you can do so easily.
Yes, I did in a sense that drm uses device_busy(). I pushed 14.3 Intel patch
to web, and specifically mailed a reporter about the possible fix for his
problem. Lets see how it goes.

I am still worry about ENXIO for login managers, but this would be next
item.

--4NXZt1nqaY8isg8F
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (FreeBSD)

iEYEARECAAYFAk+FJUgACgkQC3+MBN1Mb4g0kACfQA4/E0TlZ1ltssJIAG4J0r2K
1pkAmgMxMyPNGrUhZFCY+LXGJ17ZMRa3
=81ac
-----END PGP SIGNATURE-----

--4NXZt1nqaY8isg8F--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120411063136.GI2358>