Date:      Fri, 5 Feb 2010 16:00:38 +0200
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Andrew Gallatin <gallatin@cs.duke.edu>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: devfs panic w/INVARIANTS
Message-ID:  <20100205140038.GR15587@deviant.kiev.zoral.com.ua>
In-Reply-To: <4B6C225D.3020306@cs.duke.edu>
References:  <4B6B30BC.7030107@cs.duke.edu> <20100205100643.GQ15587@deviant.kiev.zoral.com.ua> <4B6C225D.3020306@cs.duke.edu>

On Fri, Feb 05, 2010 at 08:51:25AM -0500, Andrew Gallatin wrote:
> Kostik Belousov wrote:
> >On Thu, Feb 04, 2010 at 03:40:28PM -0500, Andrew Gallatin wrote:
> >>I've got a commercial driver that uses device cloning.
> >>At unload time, the driver calls clone_cleanup(). When I unload
> >>the driver when the kernel is built with INVARIANTS, I'll see a
> >>panic in devfs_populate_loop().  This happens in 6-stable,
> >>as well as 8-stable.
> >>
> >>From what I can see the clone has been freed, but it
> >>remains on the devfs cdevp_list.   Then the next time
> >>devfs_populate_loop() is called, it trips over the bad
> >>entry (cdp->cdp_dirents points to 0xdeadc0dedeadc0de)
> >>See appended kgdb session.
> >>
> >>If I trace the code path, it looks like clone_cleanup()
> >>calls destroy_devl().  And destroy_devl() will eventually
> >>call devfs_free() if the si_refcnt is zero.  But I don't
> >>see anything which will get the cdev removed from
> >>the cdevp_list prior to it being freed.
> >>
> >>The only code I see which will get the cdev removed from
> >>the cdevp_list() seems to be the "GC any lingering devices"
> >>block in devfs_populate_loop
> >>
> >>What am I missing?
> >
> >You did not mention it, but my guess is that you create clones from
> >the dev_clone event handler. Please note that devfs_lookup() that fires
>
> Yes, I do.
>
> >dev_clone event, consumes a device reference. Thus clone handlers shall
> >do dev_ref().
> >
> >Due to races with cleanup, you should use the MAKEDEV_REF flag for the
> >make_dev_credf(9) KPI instead of doing a make_dev()/dev_ref() pair.
>
> I need to support FreeBSD going all the way back to 6, so that's not an
> option in some versions.
>
> But, I'm talking about device removal time.  If I call clone_cleanup()
> where the clones have dev->si_refcount==1, then I get the use-after-free
> panic.  If I hack things to elevate the reference count (such that
> dev->si_refcount==2 when clone_cleanup() is called), then I don't
> get the panic.
>
> Are you saying I should have been taking the extra reference
> via my dev_clone eventhandler?   Won't having the extra reference
> lead to a memory leak?   Or am I just mis-reading the code, and
> this will lead to things being freed normally?
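Yes, the clone handler must take a reference with dev_ref(), either through
a race-free make_dev_credf(MAKEDEV_REF) call or by calling dev_ref() after
make_dev(). That reference is the one devfs_lookup() consumes after the
dev_clone event, so it does not leak.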
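
To make the reference accounting concrete, here is a small userland model.
It is not kernel code: the toy_* names are made up for illustration, and
toy_lookup() only stands in for the devfs_lookup() path, which drops one
reference on whatever the clone handler returned. Treat it as a sketch of
the bookkeeping, not of the real devfs implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct cdev: only the reference count is modelled. */
struct toy_dev {
	int refcount;
};

static void
toy_ref(struct toy_dev *d)
{
	d->refcount++;
}

static void
toy_rel(struct toy_dev *d)
{
	/* Dropping a reference nobody took is the bug being modelled. */
	assert(d->refcount > 0);
	d->refcount--;
}

/* A clone handler hands a device back to the lookup path. */
typedef struct toy_dev *(*toy_clone_fn)(struct toy_dev *);

/* Handler that forgets the extra reference. */
static struct toy_dev *
clone_no_ref(struct toy_dev *d)
{
	return (d);
}

/* Handler that takes the reference the lookup path will consume. */
static struct toy_dev *
clone_with_ref(struct toy_dev *d)
{
	toy_ref(d);
	return (d);
}

/*
 * Models the lookup path: fire the handler, then drop one reference
 * on the device it returned. Returns the net change in the count.
 */
static int
toy_lookup(struct toy_dev *d, toy_clone_fn handler)
{
	int before = d->refcount;
	struct toy_dev *res = handler(d);

	assert(res != NULL);
	toy_rel(res);
	return (d->refcount - before);
}
```

Starting from an arbitrary count of 1, toy_lookup(&dev, clone_no_ref)
returns -1 and leaves the count at 0, while
toy_lookup(&dev, clone_with_ref) returns 0 and leaves the count at 1.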

>
> >That said, do you really need clones at all ?
>
> I need to support FreeBSD back to 6.x, and I need to support the
> linux-like model of opening the "same" /dev/node multiple times
> and getting unique handles.  So I think I need clones.

Wouldn't it be cleaner to use cdevpriv on 7/8/HEAD, where it is present,
and keep special #ifdef-ed code for 6 that could eventually be dropped?
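
As a rough illustration of the per-open model, here is a userland sketch.
It is not kernel code: struct toy_file, struct toy_handle and every toy_*
function are hypothetical stand-ins, and the error values are the model's
own choices. toy_set_priv() and toy_get_priv() play the role that
devfs_set_cdevpriv(9) and devfs_get_cdevpriv(9) play in a driver: each open
of the same node carries its own private state, and a destructor runs when
that open goes away.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for one open file description; not a kernel structure. */
struct toy_file {
	void *priv;			/* per-open driver state */
	void (*dtr)(void *priv);	/* run when this open is closed */
};

/* Hypothetical per-open driver state. */
struct toy_handle {
	int id;
};

static int toy_live_handles;

/* Attach per-open state; refuses to overwrite state already set. */
static int
toy_set_priv(struct toy_file *fp, void *priv, void (*dtr)(void *))
{
	if (fp->priv != NULL)
		return (EBUSY);
	fp->priv = priv;
	fp->dtr = dtr;
	return (0);
}

/* Fetch the state attached to this particular open. */
static int
toy_get_priv(struct toy_file *fp, void **out)
{
	if (fp->priv == NULL)
		return (ENOENT);
	*out = fp->priv;
	return (0);
}

/* Destructor: releases one open's state. */
static void
toy_handle_dtr(void *priv)
{
	assert(toy_live_handles > 0);
	free(priv);
	toy_live_handles--;
}

/* Models an open routine: allocate state and bind it to this open. */
static int
toy_open(struct toy_file *fp, int id)
{
	struct toy_handle *h = calloc(1, sizeof(*h));
	int error;

	if (h == NULL)
		return (ENOMEM);
	h->id = id;
	error = toy_set_priv(fp, h, toy_handle_dtr);
	if (error != 0) {
		free(h);
		return (error);
	}
	toy_live_handles++;
	return (0);
}

/* Models the close of one open: run the destructor, clear the slot. */
static void
toy_close(struct toy_file *fp)
{
	if (fp->priv != NULL && fp->dtr != NULL)
		fp->dtr(fp->priv);
	fp->priv = NULL;
	fp->dtr = NULL;
}
```

Two toy_open() calls on two separate struct toy_file objects yield two
distinct handles, and each toy_close() frees only its own. No cloned
device node is involved in this model.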
