Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 8 Sep 2004 20:36:42 +0300
From:      Peter Pentchev <roam@ringlet.net>
To:        "M. Warner Losh" <imp@bsdimp.com>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: [PATCH] Fix USB panics
Message-ID:  <20040908173642.GB1924@straylight.m.ringlet.net>
In-Reply-To: <20040908.111706.115908959.imp@bsdimp.com>
References:  <20040908150943.GA1924@straylight.m.ringlet.net> <20040908.111706.115908959.imp@bsdimp.com>

next in thread | previous in thread | raw e-mail | index | archive | help

--NklN7DEeGtkPCoo3
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Sep 08, 2004 at 11:17:06AM -0600, M. Warner Losh wrote:
> [[ redirected to hackers only ]]

Thanks for looking over this so quickly!

> In message: <20040908150943.GA1924@straylight.m.ringlet.net>
>             Peter Pentchev <roam@ringlet.net> writes:
[snip]
> : So here's a How to Panic a RELENG_5 Kernel In Five Easy Steps :)
> :=20
> : 1.  Make sure usbd is not running so it cannot load drivers on demand.
> :=20
> : 2.  kldload usb, and do not load *any* USB device drivers, not even uge=
n.
> :=20
> : 3.  Plug a device, any device.
> : 3a. usbd_probe_and_attach() looks for a specific driver, and fails.
> : 3b. usbd_probe_and_attach() looks for a generic driver, and fails.
> : 3c. usbd_probe_and_attach() leaves the newly-allocated device_t
> :     structure in place (and added to the parent bus), but removes any
> :     subdevs traces in the usbd_device_handle.
>=20
> Leaving it in place is the right thing to do.  It looks like the
> second half of this isn't.

Ahh... hold on to this thought, now.

> : So.. what should be done about it?  Basically, the problem is that
> : usbd_probe_and_attach() leaves an orphaned device_t pointing to the
> : usbd_device_handle that will be freed when the hardware device is
> : physically unplugged.
>=20
> I'd use different terms to describe it, but what you are saying is
> essentially correct.=20

Yeah, well, this is the first time I dip my fingers into both the USB
code and the busdma framework at all, so I'm probably saying a lot of
things the wrong way :)

> : The attached patch seems to solve the problem - and a couple of related
> : others - at least for me.  The #ifdef's are effectively #if 0's, I've
> : just left them in to keep a perspective on the original code.  The idea
> : is to keep the device_t pointer in the usbd_device_handle, but take care
> : to check if the device is actually attached before dereferencing it.
>=20
> Hmmmm

That's the third chunk of the patch, where I added the
device_is_attached() check to usbd_fill_deviceinfo().  Until now, the
device could never possibly have been NOT attached, since
usbd_probe_and_attach() would never have left a non-NULL subdev in
place.  How's that for a double.. no, quadruple negative in the last
sentence, now?

[snip]
> : Is this even the right direction?  IMHO, this situation ought to be
> : resolved one way or another before 5.3 hits the shelves, even if the
> : solution has nothing to do with my proposed patch :)
>=20
> Yes.  I don't like parts of it, but they are the parts that you
> yourself don't like either :-(.
>=20
> : Index: dev/usb/uhub.c
> : =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> : RCS file: /home/ncvs/src/sys/dev/usb/uhub.c,v
> : retrieving revision 1.62
> : diff -u -r1.62 uhub.c
> : --- dev/usb/uhub.c	15 Aug 2004 23:39:18 -0000	1.62
> : +++ dev/usb/uhub.c	8 Sep 2004 14:06:45 -0000
> : @@ -707,7 +707,9 @@
> :  			continue;
> :  		for (i =3D 0; dev->subdevs[i]; i++) {
> :  			if (dev->subdevs[i] =3D=3D child) {
> : +#ifdef ROAM_SKIP_USB_DEVICE_LEAK
> :  				dev->subdevs[i] =3D NULL;
> : +#endif
> :  				return;
> :  			}
> :  		}
>=20
> I agree with your assessment that the above really makes the routine:
>=20
> Static void
> uhub_child_detached(device_t self, device_t child)
> {
> 	struct uhub_softc *sc =3D device_get_softc(self);
> 	usbd_device_handle devhub =3D sc->sc_hub;
>=20
> 	if (!devhub->hub)
> 		/* should never happen; children are only created after init */
> 		panic("hub not fully initialised, but child deleted?");
> }
>=20
> which looks right to me (although having the panic there seems overly
> defensive).

Yep, that's why I wondered if the whole function should not be removed.
Still, it might be a good thing to leave it in, just in case - after
all, this is not an overly-often executed path.

> : Index: dev/usb/usb_subr.c
> : =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> : RCS file: /home/ncvs/src/sys/dev/usb/usb_subr.c,v
> : retrieving revision 1.69
> : diff -u -r1.69 usb_subr.c
> : --- dev/usb/usb_subr.c	15 Aug 2004 23:39:18 -0000	1.69
> : +++ dev/usb/usb_subr.c	8 Sep 2004 14:05:51 -0000
> : @@ -1016,9 +1016,11 @@
> :  	if (dv !=3D NULL) {
> :  		return (USBD_NORMAL_COMPLETION);
> :  	}
> : +#ifdef ROAM_SKIP_USB_DEVICE_LEAK
> :  	tmpdv =3D dev->subdevs;
> :  	dev->subdevs =3D 0;
> :  	free(tmpdv, M_USB);
> : +#endif
> : =20
> :  	/*
> :  	 * The generic attach failed, but leave the device as it is.
>=20
> I'm not sure I understand why the above fixes anything.  Is it because
> it makes the proper references go away?  If so, it can likely be
> deleted.  I'll need to thread through the code to see if any of the
> other instances of those 3 lines of code can go.

Remember the thought I told you to hold on to? :)  This is the part
where usbd_probe_and_attach() removes the usbd_device_handle's reference
to the device_t - and now usb_disconnect_port() thinks it's safe to
remove the usbd_device_handle.  However, the device_t itself still holds
a reference to it in the ivars.  Removing these particular three lines
of code right here makes usbd_probe_and_attach() keep the subdevs ref,
as it very well should.

It's true that these same lines are used in two other places, but it's
okay - it's in usbd_probe_and_attach() too, it's just garbage collection
after the previous unsuccessful attempts to attach a driver.  At least I
think it's okay, since there might be a fun failure case if a driver
tried to attach and created another device_t, then failed to attach and
for some reason did not remove the device_t.  This sounds suspiciously
like the situation we're up against right here... it makes me wonder
now, although until now I was convinced it was okay to free the subdevs
in the previous failed attempts.

> : @@ -1346,7 +1348,7 @@
> :  	di->udi_speed =3D dev->speed;
> : =20
> :  	if (dev->subdevs !=3D NULL) {
> : -		for (i =3D 0; dev->subdevs[i] &&
> : +		for (i =3D 0; dev->subdevs[i] && device_is_attached(dev->subdevs[i])=
 &&
> :  			     i < USB_MAX_DEVNAMES; i++) {
> :  			strncpy(di->udi_devnames[i], USBDEVPTRNAME(dev->subdevs[i]),
> :  				USB_MAX_DEVNAMELEN);
>=20
> OK.  I understand this as well.  It likely is a good change.

Well, it is a no-op unless the three lines above are removed, since if
usbd_probe_and_attach() fails, there won't be any non-null subdevs.

G'luck,
Peter

--=20
Peter Pentchev	roam@ringlet.net    roam@cnsys.bg    roam@FreeBSD.org
PGP key:	http://people.FreeBSD.org/~roam/roam.key.asc
Key fingerprint	FDBA FD79 C26F 3C51 C95E  DF9E ED18 B68D 1619 4553
Hey, out there - is it *you* reading me, or is it someone else?

--NklN7DEeGtkPCoo3
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (FreeBSD)

iD8DBQFBP0Mq7Ri2jRYZRVMRApySAJ948KCQqXEpHpH8sGpnGR8gm/aSAgCgksyS
FjkwSL+stWwvrCnCLv95F0Q=
=/Hph
-----END PGP SIGNATURE-----

--NklN7DEeGtkPCoo3--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20040908173642.GB1924>