Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 9 Jan 2013 01:45:54 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        "Kenneth D. Merry" <ken@FreeBSD.org>
Cc:        svn-src-head@FreeBSD.org, alc@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org
Subject:   Re: svn commit: r227530 - head/sys/vm
Message-ID:  <20130108234554.GD2561@kib.kiev.ua>
In-Reply-To: <20130108230311.GA77847@nargothrond.kdm.org>
References:  <201111151440.pAFEe0xJ004636@svn.freebsd.org> <20130108221134.GA75088@nargothrond.kdm.org> <20130108224931.GC2561@kib.kiev.ua> <20130108230311.GA77847@nargothrond.kdm.org>

next in thread | previous in thread | raw e-mail | index | archive | help

--bjuZg6miEcdLYP6q
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Jan 08, 2013 at 04:03:11PM -0700, Kenneth D. Merry wrote:
> On Wed, Jan 09, 2013 at 00:49:31 +0200, Konstantin Belousov wrote:
> > On Tue, Jan 08, 2013 at 03:11:34PM -0700, Kenneth D. Merry wrote:
> > > On Tue, Nov 15, 2011 at 14:40:00 +0000, Konstantin Belousov wrote:
> > > > Author: kib
> > > > Date: Tue Nov 15 14:40:00 2011
> > > > New Revision: 227530
> > > > URL: http://svn.freebsd.org/changeset/base/227530
> > > >=20
> > > > Log:
> > > >   Update the device pager interface, while keeping the compatibility
> > > >   layer for old KPI and KBI.  New interface should be used together=
 with
> > > >   d_mmap_single cdevsw method.
> > > >  =20
> > > >   Device pager can be allocated with the cdev_pager_allocate(9)
> > > >   function, which takes struct cdev_pager_ops, containing
> > > >   constructor/destructor and page fault handler methods supplied by
> > > >   driver.
> > > >  =20
> > > >   Constructor and destructor, called at the pager allocation and
> > > >   deallocation time, allow the driver to handle per-object private =
data.
> > > >  =20
> > > >   The pager handler is called to handle page fault on the vm map en=
try
> > > >   backed by the driver pager. Driver shall return either the vm_pag=
e_t
> > > >   which should be mapped, or error code (which does not cause kernel
> > > >   panic anymore). The page handler interface has a placeholder to
> > > >   specify the access mode causing the fault, but currently PROT_REA=
D is
> > > >   always passed there.
> > > >  =20
> > > >   Sponsored by:	The FreeBSD Foundation
> > > >   Reviewed by:	alc
> > > >   MFC after:	1 month
> > >=20
> > > This is an old commit, but I believe there is a bug introduced by this
> > > commit that triggered the following panic:
> > >=20
> > > panic: dev_rel((null)) gave negative count
> > > cpuid =3D 5
> > > KDB: stack backtrace:
> > > db_trace_self_wrapper() at 0xffffffff803639da =3D db_trace_self_wrapp=
er+0x2a
> > > kdb_backtrace() at 0xffffffff8090e677 =3D kdb_backtrace+0x37
> > > panic() at 0xffffffff808d6bd8 =3D panic+0x1d8
> > > dev_rel() at 0xffffffff8088f929 =3D dev_rel+0xc9
> > > dev_pager_dealloc() at 0xffffffff80b2be30 =3D dev_pager_dealloc+0x30
> > > vm_object_terminate() at 0xffffffff80b44d7d =3D vm_object_terminate+0=
x13d
> > > vm_object_deallocate() at 0xffffffff80b465ca =3D vm_object_deallocate=
+0x17a
> > > cdev_pager_allocate() at 0xffffffff80b2c5e6 =3D cdev_pager_allocate+0=
x236
> > > dev_pager_alloc() at 0xffffffff80b2c697 =3D dev_pager_alloc+0x27
> > > vm_mmap_cdev() at 0xffffffff80b41b86 =3D vm_mmap_cdev+0x116
> > > vm_mmap() at 0xffffffff80b42221 =3D vm_mmap+0x671
> > > sys_mmap() at 0xffffffff80b428c6 =3D sys_mmap+0x1d6
> > > amd64_syscall() at 0xffffffff80bd1ed8 =3D amd64_syscall+0x318
> > > Xfast_syscall() at 0xffffffff80bbd387 =3D Xfast_syscall+0xf7
> > > --- syscall (477, FreeBSD ELF64, sys_mmap), rip =3D 0x8008ae25c, rsp =
=3D
> > > 0x7fffffffd968, rbp =3D 0 ---
> > > KDB: enter: panic
> > > [ thread pid 4109 tid 102348 ]
> > > Stopped at      0xffffffff8090e33b =3D kdb_enter+0x3b:    movq $0,0xb=
433e2(%rip)
> > >=20
> > > This is a stable/9 system, but the code is the same in head.
> > >=20
> > > I triggered it by running dmidecode (perhaps a few times) on a machin=
e with
> > > INVARIANTS enabled.  You can probably trigger it before too long with
> > > something like this:
> > >=20
> > > ((i=3D0)); while [ $i -lt 50 ]; do dmidecode > /dev/null & ((i++)); d=
one
> > >=20
> > > Run that loop until the system panics.
> > >=20
> > > The problem is that dev_pager_dealloc() passes in the VM object handl=
e to
> > > the cdev_pg_dtor() method (in this case old_dev_pager_dtor()), which
> > > assumes that it is a struct cdev.
> > >=20
> > > But there is a case in cdev_pager_allocate() where if a race conditio=
n is
> > > hit, the object handle is set to the object itself.  In that case, we=
 wind
> > > up triggering an assertion in dev_rel(), because it is not operating =
on a
> > > struct cdev, but rather a VM object.
> > >=20
> > > I've attached my local commit and diffs.  This does work, but you may
> > > have a better solution.
> > >=20
> > > Thanks,
> > >=20
> > > Ken
> > > --=20
> > > Kenneth Merry
> > > ken@FreeBSD.ORG
> >=20
> > > Change 649772 by kenm@ken.spectrabsd9 on 2013/01/08 14:54:45
> > >=20
> > > 	Fix a bug in the device pager code that can trigger an assertion
> > > 	in devfs if a particular race condition is hit in the device pager
> > > 	code.
> > > =09
> > > 	This fixes BUG25532.
> > > =09
> > > 	This was a side effect of FreeBSD SVN change 227530 (our Perforce
> > > 	change 513754), which changed the device pager interface to call
> > > 	a new destructor routine for the cdev.  That destructor routine,
> > > 	old_dev_pager_dtor(), takes a VM object handle.
> > > =09
> > > 	The object handle is cast to a struct cdev *, and passed into
> > > 	dev_rel().
> > > =09
> > > 	That works in most cases, except the case in cdev_pager_allocate()
> > > 	where there is a race condition between two threads allocating an
> > > 	object backed by the same device.  The loser of the race
> > > 	deallocates its object at the end of the function.
> > > =09
> > > 	The problem is that before inserting the object into the
> > > 	dev_pager_object_list, the object's handle is changed from the
> > > 	struct cdev pointer to the object's own address.  This is to avoid
> > > 	conflicts with the winner of the race, which already inserted an
> > > 	object in the list with a handle that is a pointer to the same cdev
> > > 	structure.
> > > =09
> > > 	The object is then passed to vm_object_deallocate(), and eventually
> > > 	makes its way down to old_dev_pager_dtor().  That function passes
> > > 	the handle pointer (which is actually a VM object, not a struct
> > > 	cdev as usual) into dev_rel().  dev_rel() decrements the reference
> > > 	count in the assumed struct cdev (which happens to be 0), and
> > > 	that triggers the assertion in dev_rel() that the reference count
> > > 	is greater than or equal to 0.
> > > =09
> > > 	The fix is to add a cdev pointer to the VM object, and use that
> > > 	pointer when calling the cdev_pg_dtor() routine.  There may be
> > > 	other, better, ways to fix this, so I'll ask for a review by the
> > > 	appropriate FreeBSD VM maintainers.
> > > =09
> > > 	vm_object.h:	Add a struct cdev pointer to the VM object
> > > 			structure.
> > > =09
> > > 	device_pager.c:	In cdev_pager_allocate(), populate the new cdev
> > > 			pointer.
> > > =09
> > > 			In dev_pager_dealloc(), use the new cdev pointer
> > > 			when calling the object's cdev_pg_dtor() routine.
> > > =09
> > > 	TeamTrack:	BUG25532
> > >=20
> > > Affected files ...
> > >=20
> > > ... //SpectraBSD/stable/sys/vm/device_pager.c#3 edit
> > > ... //SpectraBSD/stable/sys/vm/vm_object.h#3 edit
> > >=20
> > > Differences ...
> > >=20
> > > =3D=3D=3D=3D //SpectraBSD/stable/sys/vm/device_pager.c#3 (text) =3D=
=3D=3D=3D
> > >=20
> > > @@ -156,6 +156,7 @@
> > >  		object1->pg_color =3D color;
> > >  		object1->handle =3D handle;
> > >  		object1->un_pager.devp.ops =3D ops;
> > > +		object1->un_pager.devp.dev =3D handle;
> > >  		TAILQ_INIT(&object1->un_pager.devp.devp_pglist);
> > >  		mtx_lock(&dev_pager_mtx);
> > >  		object =3D vm_pager_object_lookup(&dev_pager_object_list, handle);
> > > @@ -233,7 +234,7 @@
> > >  	vm_page_t m;
> > > =20
> > >  	VM_OBJECT_UNLOCK(object);
> > > -	object->un_pager.devp.ops->cdev_pg_dtor(object->handle);
> > > +	object->un_pager.devp.ops->cdev_pg_dtor(object->un_pager.devp.dev);
> > > =20
> > >  	mtx_lock(&dev_pager_mtx);
> > >  	TAILQ_REMOVE(&dev_pager_object_list, object, pager_object_list);
> > >=20
> > > =3D=3D=3D=3D //SpectraBSD/stable/sys/vm/vm_object.h#3 (text) =3D=3D=
=3D=3D
> > >=20
> > > @@ -123,6 +123,7 @@
> > >  		struct {
> > >  			TAILQ_HEAD(, vm_page) devp_pglist;
> > >  			struct cdev_pager_ops *ops;
> > > +			struct cdev *dev;
> > >  		} devp;
> > > =20
> > >  		/*
> >=20
> > Hm, yes, I remembered this. I had the following patch sitting in my
> > local repository since the autumn, I think.
> >=20
> > Might be, your solution is more clean, but the main idea is the same,
> > to put the cdev pointer into the additional place.
>=20
> Okay, cool!
>=20
> Well, in that case could you pick the solution you prefer and commit it?

Please commit your change yourself. Note that merge to stable/9
would probably require moving the devp.dev member out of the union,
to the end of the structure, to preserve struct vm_object layout.

--bjuZg6miEcdLYP6q
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (FreeBSD)

iQIcBAEBAgAGBQJQ7K+xAAoJEJDCuSvBvK1BSJoP/jkTe/G6aR7SZsuWCCrVVgbT
bm09+zkaGqFopO2p834eDxGi5RO7Or8rqwBZzt3q8kvhLV0OMkRrzN0CkIptfNsG
kfc4DgOotVyToeJgRyXD7D15OX4J6o2E+iGklFpntioDwfvJfJFYYZGrgBKtMsNY
UzGsRyZamzvRQtLtOXyj/j5B/fHDstI8Os3ktuMx8QihU0lbJy1I71w/0hTasHPR
fz/7OyUZG5O8aUJw/h163dop7nIfDVawJ4XzEGxsoYTKj+Zv9OvBGLTbVtOaJSOW
kAiFO+9bQ4O70S2Zjt3eYZckXbVE/AsKgIwMFQDq6qbFw4iPohGpM2y0Y1aHYGfo
6oz5v4qOiEWY382F2y+kfnGY31qTT9El3KEhmYR1TUHVoWl137q6/wbl2/vTtrf2
PssGqBptBOTNNS4H1gKUIUM0avXk8YR6v9pQRQFsT1kISMT223slse7Jb1AgZPWx
rC5PN1SY8ZLgQTfjXYDC/RJmfg8dLbQFQqkYUV31acWGFwcQKsSJjVDf6WNoYKpH
oOhU9Qgq6lXaxXmi3LeV5AWA1dliT9ATzaoMTRY3S6erE2P43NT4iIH5rbZBWPlA
ZKkuyTd+Abz1CU21Yo8bjMT2WhuIfgBlGGh/c1PDRV05OhCQcf8OIngwKG/+MMzh
MebKALEvzvughhQt6skv
=MFyj
-----END PGP SIGNATURE-----

--bjuZg6miEcdLYP6q--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20130108234554.GD2561>