Date: Wed, 9 Jan 2013 01:45:54 +0200 From: Konstantin Belousov <kostikbel@gmail.com> To: "Kenneth D. Merry" <ken@FreeBSD.org> Cc: svn-src-head@FreeBSD.org, alc@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org Subject: Re: svn commit: r227530 - head/sys/vm Message-ID: <20130108234554.GD2561@kib.kiev.ua> In-Reply-To: <20130108230311.GA77847@nargothrond.kdm.org> References: <201111151440.pAFEe0xJ004636@svn.freebsd.org> <20130108221134.GA75088@nargothrond.kdm.org> <20130108224931.GC2561@kib.kiev.ua> <20130108230311.GA77847@nargothrond.kdm.org>
next in thread | previous in thread | raw e-mail | index | archive | help
--bjuZg6miEcdLYP6q Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Jan 08, 2013 at 04:03:11PM -0700, Kenneth D. Merry wrote: > On Wed, Jan 09, 2013 at 00:49:31 +0200, Konstantin Belousov wrote: > > On Tue, Jan 08, 2013 at 03:11:34PM -0700, Kenneth D. Merry wrote: > > > On Tue, Nov 15, 2011 at 14:40:00 +0000, Konstantin Belousov wrote: > > > > Author: kib > > > > Date: Tue Nov 15 14:40:00 2011 > > > > New Revision: 227530 > > > > URL: http://svn.freebsd.org/changeset/base/227530 > > > >=20 > > > > Log: > > > > Update the device pager interface, while keeping the compatibility > > > > layer for old KPI and KBI. New interface should be used together= with > > > > d_mmap_single cdevsw method. > > > > =20 > > > > Device pager can be allocated with the cdev_pager_allocate(9) > > > > function, which takes struct cdev_pager_ops, containing > > > > constructor/destructor and page fault handler methods supplied by > > > > driver. > > > > =20 > > > > Constructor and destructor, called at the pager allocation and > > > > deallocation time, allow the driver to handle per-object private = data. > > > > =20 > > > > The pager handler is called to handle page fault on the vm map en= try > > > > backed by the driver pager. Driver shall return either the vm_pag= e_t > > > > which should be mapped, or error code (which does not cause kernel > > > > panic anymore). The page handler interface has a placeholder to > > > > specify the access mode causing the fault, but currently PROT_REA= D is > > > > always passed there. > > > > =20 > > > > Sponsored by: The FreeBSD Foundation > > > > Reviewed by: alc > > > > MFC after: 1 month > > >=20 > > > This is an old commit, but I believe there is a bug introduced by this > > > commit that triggered the following panic: > > >=20 > > > panic: dev_rel((null)) gave negative count > > > cpuid =3D 5 > > > KDB: stack backtrace: > > > db_trace_self_wrapper() at 0xffffffff803639da =3D db_trace_self_wrapp= er+0x2a > > > kdb_backtrace() at 0xffffffff8090e677 =3D kdb_backtrace+0x37 > > > panic() at 0xffffffff808d6bd8 =3D panic+0x1d8 > > > dev_rel() at 0xffffffff8088f929 =3D dev_rel+0xc9 > > > dev_pager_dealloc() at 0xffffffff80b2be30 =3D dev_pager_dealloc+0x30 > > > vm_object_terminate() at 0xffffffff80b44d7d =3D vm_object_terminate+0= x13d > > > vm_object_deallocate() at 0xffffffff80b465ca =3D vm_object_deallocate= +0x17a > > > cdev_pager_allocate() at 0xffffffff80b2c5e6 =3D cdev_pager_allocate+0= x236 > > > dev_pager_alloc() at 0xffffffff80b2c697 =3D dev_pager_alloc+0x27 > > > vm_mmap_cdev() at 0xffffffff80b41b86 =3D vm_mmap_cdev+0x116 > > > vm_mmap() at 0xffffffff80b42221 =3D vm_mmap+0x671 > > > sys_mmap() at 0xffffffff80b428c6 =3D sys_mmap+0x1d6 > > > amd64_syscall() at 0xffffffff80bd1ed8 =3D amd64_syscall+0x318 > > > Xfast_syscall() at 0xffffffff80bbd387 =3D Xfast_syscall+0xf7 > > > --- syscall (477, FreeBSD ELF64, sys_mmap), rip =3D 0x8008ae25c, rsp = =3D > > > 0x7fffffffd968, rbp =3D 0 --- > > > KDB: enter: panic > > > [ thread pid 4109 tid 102348 ] > > > Stopped at 0xffffffff8090e33b =3D kdb_enter+0x3b: movq $0,0xb= 433e2(%rip) > > >=20 > > > This is a stable/9 system, but the code is the same in head. > > >=20 > > > I triggered it by running dmidecode (perhaps a few times) on a machin= e with > > > INVARIANTS enabled. You can probably trigger it before too long with > > > something like this: > > >=20 > > > ((i=3D0)); while [ $i -lt 50 ]; do dmidecode > /dev/null & ((i++)); d= one > > >=20 > > > Run that loop until the system panics. > > >=20 > > > The problem is that dev_pager_dealloc() passes in the VM object handl= e to > > > the cdev_pg_dtor() method (in this case old_dev_pager_dtor()), which > > > assumes that it is a struct cdev. > > >=20 > > > But there is a case in cdev_pager_allocate() where if a race conditio= n is > > > hit, the object handle is set to the object itself. In that case, we= wind > > > up triggering an assertion in dev_rel(), because it is not operating = on a > > > struct cdev, but rather a VM object. > > >=20 > > > I've attached my local commit and diffs. This does work, but you may > > > have a better solution. > > >=20 > > > Thanks, > > >=20 > > > Ken > > > --=20 > > > Kenneth Merry > > > ken@FreeBSD.ORG > >=20 > > > Change 649772 by kenm@ken.spectrabsd9 on 2013/01/08 14:54:45 > > >=20 > > > Fix a bug in the device pager code that can trigger an assertion > > > in devfs if a particular race condition is hit in the device pager > > > code. > > > =09 > > > This fixes BUG25532. > > > =09 > > > This was a side effect of FreeBSD SVN change 227530 (our Perforce > > > change 513754), which changed the device pager interface to call > > > a new destructor routine for the cdev. That destructor routine, > > > old_dev_pager_dtor(), takes a VM object handle. > > > =09 > > > The object handle is cast to a struct cdev *, and passed into > > > dev_rel(). > > > =09 > > > That works in most cases, except the case in cdev_pager_allocate() > > > where there is a race condition between two threads allocating an > > > object backed by the same device. The loser of the race > > > deallocates its object at the end of the function. > > > =09 > > > The problem is that before inserting the object into the > > > dev_pager_object_list, the object's handle is changed from the > > > struct cdev pointer to the object's own address. This is to avoid > > > conflicts with the winner of the race, which already inserted an > > > object in the list with a handle that is a pointer to the same cdev > > > structure. > > > =09 > > > The object is then passed to vm_object_deallocate(), and eventually > > > makes its way down to old_dev_pager_dtor(). That function passes > > > the handle pointer (which is actually a VM object, not a struct > > > cdev as usual) into dev_rel(). dev_rel() decrements the reference > > > count in the assumed struct cdev (which happens to be 0), and > > > that triggers the assertion in dev_rel() that the reference count > > > is greater than or equal to 0. > > > =09 > > > The fix is to add a cdev pointer to the VM object, and use that > > > pointer when calling the cdev_pg_dtor() routine. There may be > > > other, better, ways to fix this, so I'll ask for a review by the > > > appropriate FreeBSD VM maintainers. > > > =09 > > > vm_object.h: Add a struct cdev pointer to the VM object > > > structure. > > > =09 > > > device_pager.c: In cdev_pager_allocate(), populate the new cdev > > > pointer. > > > =09 > > > In dev_pager_dealloc(), use the new cdev pointer > > > when calling the object's cdev_pg_dtor() routine. > > > =09 > > > TeamTrack: BUG25532 > > >=20 > > > Affected files ... > > >=20 > > > ... //SpectraBSD/stable/sys/vm/device_pager.c#3 edit > > > ... //SpectraBSD/stable/sys/vm/vm_object.h#3 edit > > >=20 > > > Differences ... > > >=20 > > > =3D=3D=3D=3D //SpectraBSD/stable/sys/vm/device_pager.c#3 (text) =3D= =3D=3D=3D > > >=20 > > > @@ -156,6 +156,7 @@ > > > object1->pg_color =3D color; > > > object1->handle =3D handle; > > > object1->un_pager.devp.ops =3D ops; > > > + object1->un_pager.devp.dev =3D handle; > > > TAILQ_INIT(&object1->un_pager.devp.devp_pglist); > > > mtx_lock(&dev_pager_mtx); > > > object =3D vm_pager_object_lookup(&dev_pager_object_list, handle); > > > @@ -233,7 +234,7 @@ > > > vm_page_t m; > > > =20 > > > VM_OBJECT_UNLOCK(object); > > > - object->un_pager.devp.ops->cdev_pg_dtor(object->handle); > > > + object->un_pager.devp.ops->cdev_pg_dtor(object->un_pager.devp.dev); > > > =20 > > > mtx_lock(&dev_pager_mtx); > > > TAILQ_REMOVE(&dev_pager_object_list, object, pager_object_list); > > >=20 > > > =3D=3D=3D=3D //SpectraBSD/stable/sys/vm/vm_object.h#3 (text) =3D=3D= =3D=3D > > >=20 > > > @@ -123,6 +123,7 @@ > > > struct { > > > TAILQ_HEAD(, vm_page) devp_pglist; > > > struct cdev_pager_ops *ops; > > > + struct cdev *dev; > > > } devp; > > > =20 > > > /* > >=20 > > Hm, yes, I remembered this. I had the following patch sitting in my > > local repository since the autumn, I think. > >=20 > > Might be, your solution is more clean, but the main idea is the same, > > to put the cdev pointer into the additional place. >=20 > Okay, cool! >=20 > Well, in that case could you pick the solution you prefer and commit it? Please commit your change yourself. Note that merge to stable/9 would probably require moving the devp.dev member out of the union, to the end of the structure, to preserve struct vm_object layout. --bjuZg6miEcdLYP6q Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iQIcBAEBAgAGBQJQ7K+xAAoJEJDCuSvBvK1BSJoP/jkTe/G6aR7SZsuWCCrVVgbT bm09+zkaGqFopO2p834eDxGi5RO7Or8rqwBZzt3q8kvhLV0OMkRrzN0CkIptfNsG kfc4DgOotVyToeJgRyXD7D15OX4J6o2E+iGklFpntioDwfvJfJFYYZGrgBKtMsNY UzGsRyZamzvRQtLtOXyj/j5B/fHDstI8Os3ktuMx8QihU0lbJy1I71w/0hTasHPR fz/7OyUZG5O8aUJw/h163dop7nIfDVawJ4XzEGxsoYTKj+Zv9OvBGLTbVtOaJSOW kAiFO+9bQ4O70S2Zjt3eYZckXbVE/AsKgIwMFQDq6qbFw4iPohGpM2y0Y1aHYGfo 6oz5v4qOiEWY382F2y+kfnGY31qTT9El3KEhmYR1TUHVoWl137q6/wbl2/vTtrf2 PssGqBptBOTNNS4H1gKUIUM0avXk8YR6v9pQRQFsT1kISMT223slse7Jb1AgZPWx rC5PN1SY8ZLgQTfjXYDC/RJmfg8dLbQFQqkYUV31acWGFwcQKsSJjVDf6WNoYKpH oOhU9Qgq6lXaxXmi3LeV5AWA1dliT9ATzaoMTRY3S6erE2P43NT4iIH5rbZBWPlA ZKkuyTd+Abz1CU21Yo8bjMT2WhuIfgBlGGh/c1PDRV05OhCQcf8OIngwKG/+MMzh MebKALEvzvughhQt6skv =MFyj -----END PGP SIGNATURE----- --bjuZg6miEcdLYP6q--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20130108234554.GD2561>