Date: Wed, 9 Jan 2013 00:49:31 +0200 From: Konstantin Belousov <kostikbel@gmail.com> To: "Kenneth D. Merry" <ken@FreeBSD.org> Cc: svn-src-head@FreeBSD.org, alc@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org Subject: Re: svn commit: r227530 - head/sys/vm Message-ID: <20130108224931.GC2561@kib.kiev.ua> In-Reply-To: <20130108221134.GA75088@nargothrond.kdm.org> References: <201111151440.pAFEe0xJ004636@svn.freebsd.org> <20130108221134.GA75088@nargothrond.kdm.org>
next in thread | previous in thread | raw e-mail | index | archive | help
--7gGkHNMELEOhSGF6 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Jan 08, 2013 at 03:11:34PM -0700, Kenneth D. Merry wrote: > On Tue, Nov 15, 2011 at 14:40:00 +0000, Konstantin Belousov wrote: > > Author: kib > > Date: Tue Nov 15 14:40:00 2011 > > New Revision: 227530 > > URL: http://svn.freebsd.org/changeset/base/227530 > >=20 > > Log: > > Update the device pager interface, while keeping the compatibility > > layer for old KPI and KBI. New interface should be used together with > > d_mmap_single cdevsw method. > > =20 > > Device pager can be allocated with the cdev_pager_allocate(9) > > function, which takes struct cdev_pager_ops, containing > > constructor/destructor and page fault handler methods supplied by > > driver. > > =20 > > Constructor and destructor, called at the pager allocation and > > deallocation time, allow the driver to handle per-object private data. > > =20 > > The pager handler is called to handle page fault on the vm map entry > > backed by the driver pager. Driver shall return either the vm_page_t > > which should be mapped, or error code (which does not cause kernel > > panic anymore). The page handler interface has a placeholder to > > specify the access mode causing the fault, but currently PROT_READ is > > always passed there. > > =20 > > Sponsored by: The FreeBSD Foundation > > Reviewed by: alc > > MFC after: 1 month >=20 > This is an old commit, but I believe there is a bug introduced by this > commit that triggered the following panic: >=20 > panic: dev_rel((null)) gave negative count > cpuid =3D 5 > KDB: stack backtrace: > db_trace_self_wrapper() at 0xffffffff803639da =3D db_trace_self_wrapper+0= x2a > kdb_backtrace() at 0xffffffff8090e677 =3D kdb_backtrace+0x37 > panic() at 0xffffffff808d6bd8 =3D panic+0x1d8 > dev_rel() at 0xffffffff8088f929 =3D dev_rel+0xc9 > dev_pager_dealloc() at 0xffffffff80b2be30 =3D dev_pager_dealloc+0x30 > vm_object_terminate() at 0xffffffff80b44d7d =3D vm_object_terminate+0x13d > vm_object_deallocate() at 0xffffffff80b465ca =3D vm_object_deallocate+0x1= 7a > cdev_pager_allocate() at 0xffffffff80b2c5e6 =3D cdev_pager_allocate+0x236 > dev_pager_alloc() at 0xffffffff80b2c697 =3D dev_pager_alloc+0x27 > vm_mmap_cdev() at 0xffffffff80b41b86 =3D vm_mmap_cdev+0x116 > vm_mmap() at 0xffffffff80b42221 =3D vm_mmap+0x671 > sys_mmap() at 0xffffffff80b428c6 =3D sys_mmap+0x1d6 > amd64_syscall() at 0xffffffff80bd1ed8 =3D amd64_syscall+0x318 > Xfast_syscall() at 0xffffffff80bbd387 =3D Xfast_syscall+0xf7 > --- syscall (477, FreeBSD ELF64, sys_mmap), rip =3D 0x8008ae25c, rsp =3D > 0x7fffffffd968, rbp =3D 0 --- > KDB: enter: panic > [ thread pid 4109 tid 102348 ] > Stopped at 0xffffffff8090e33b =3D kdb_enter+0x3b: movq $0,0xb433e= 2(%rip) >=20 > This is a stable/9 system, but the code is the same in head. >=20 > I triggered it by running dmidecode (perhaps a few times) on a machine wi= th > INVARIANTS enabled. You can probably trigger it before too long with > something like this: >=20 > ((i=3D0)); while [ $i -lt 50 ]; do dmidecode > /dev/null & ((i++)); done >=20 > Run that loop until the system panics. >=20 > The problem is that dev_pager_dealloc() passes in the VM object handle to > the cdev_pg_dtor() method (in this case old_dev_pager_dtor()), which > assumes that it is a struct cdev. >=20 > But there is a case in cdev_pager_allocate() where if a race condition is > hit, the object handle is set to the object itself. In that case, we wind > up triggering an assertion in dev_rel(), because it is not operating on a > struct cdev, but rather a VM object. >=20 > I've attached my local commit and diffs. This does work, but you may > have a better solution. >=20 > Thanks, >=20 > Ken > --=20 > Kenneth Merry > ken@FreeBSD.ORG > Change 649772 by kenm@ken.spectrabsd9 on 2013/01/08 14:54:45 >=20 > Fix a bug in the device pager code that can trigger an assertion > in devfs if a particular race condition is hit in the device pager > code. > =09 > This fixes BUG25532. > =09 > This was a side effect of FreeBSD SVN change 227530 (our Perforce > change 513754), which changed the device pager interface to call > a new destructor routine for the cdev. That destructor routine, > old_dev_pager_dtor(), takes a VM object handle. > =09 > The object handle is cast to a struct cdev *, and passed into > dev_rel(). > =09 > That works in most cases, except the case in cdev_pager_allocate() > where there is a race condition between two threads allocating an > object backed by the same device. The loser of the race > deallocates its object at the end of the function. > =09 > The problem is that before inserting the object into the > dev_pager_object_list, the object's handle is changed from the > struct cdev pointer to the object's own address. This is to avoid > conflicts with the winner of the race, which already inserted an > object in the list with a handle that is a pointer to the same cdev > structure. > =09 > The object is then passed to vm_object_deallocate(), and eventually > makes its way down to old_dev_pager_dtor(). That function passes > the handle pointer (which is actually a VM object, not a struct > cdev as usual) into dev_rel(). dev_rel() decrements the reference > count in the assumed struct cdev (which happens to be 0), and > that triggers the assertion in dev_rel() that the reference count > is greater than or equal to 0. > =09 > The fix is to add a cdev pointer to the VM object, and use that > pointer when calling the cdev_pg_dtor() routine. There may be > other, better, ways to fix this, so I'll ask for a review by the > appropriate FreeBSD VM maintainers. > =09 > vm_object.h: Add a struct cdev pointer to the VM object > structure. > =09 > device_pager.c: In cdev_pager_allocate(), populate the new cdev > pointer. > =09 > In dev_pager_dealloc(), use the new cdev pointer > when calling the object's cdev_pg_dtor() routine. > =09 > TeamTrack: BUG25532 >=20 > Affected files ... >=20 > ... //SpectraBSD/stable/sys/vm/device_pager.c#3 edit > ... //SpectraBSD/stable/sys/vm/vm_object.h#3 edit >=20 > Differences ... >=20 > =3D=3D=3D=3D //SpectraBSD/stable/sys/vm/device_pager.c#3 (text) =3D=3D=3D= =3D >=20 > @@ -156,6 +156,7 @@ > object1->pg_color =3D color; > object1->handle =3D handle; > object1->un_pager.devp.ops =3D ops; > + object1->un_pager.devp.dev =3D handle; > TAILQ_INIT(&object1->un_pager.devp.devp_pglist); > mtx_lock(&dev_pager_mtx); > object =3D vm_pager_object_lookup(&dev_pager_object_list, handle); > @@ -233,7 +234,7 @@ > vm_page_t m; > =20 > VM_OBJECT_UNLOCK(object); > - object->un_pager.devp.ops->cdev_pg_dtor(object->handle); > + object->un_pager.devp.ops->cdev_pg_dtor(object->un_pager.devp.dev); > =20 > mtx_lock(&dev_pager_mtx); > TAILQ_REMOVE(&dev_pager_object_list, object, pager_object_list); >=20 > =3D=3D=3D=3D //SpectraBSD/stable/sys/vm/vm_object.h#3 (text) =3D=3D=3D=3D >=20 > @@ -123,6 +123,7 @@ > struct { > TAILQ_HEAD(, vm_page) devp_pglist; > struct cdev_pager_ops *ops; > + struct cdev *dev; > } devp; > =20 > /* Hm, yes, I remembered this. I had the following patch sitting in my local repository since the autumn, I think. Might be, your solution is more clean, but the main idea is the same, to put the cdev pointer into the additional place. diff --git a/sys/vm/device_pager.c b/sys/vm/device_pager.c index ba70571..538c887 100644 --- a/sys/vm/device_pager.c +++ b/sys/vm/device_pager.c @@ -183,6 +183,7 @@ cdev_pager_allocate(void *handle, enum obj_type tp, str= uct cdev_pager_ops *ops, mtx_unlock(&dev_pager_mtx); if (object1 !=3D NULL) { object1->handle =3D object1; + object1->un_pager.devp.bhandle =3D handle; mtx_lock(&dev_pager_mtx); TAILQ_INSERT_TAIL(&dev_pager_object_list, object1, pager_object_list); @@ -233,13 +234,15 @@ dev_pager_dealloc(object) vm_object_t object; { vm_page_t m; + void *h; =20 VM_OBJECT_UNLOCK(object); - object->un_pager.devp.ops->cdev_pg_dtor(object->handle); - mtx_lock(&dev_pager_mtx); TAILQ_REMOVE(&dev_pager_object_list, object, pager_object_list); mtx_unlock(&dev_pager_mtx); + h =3D object->handle =3D=3D object ? object->un_pager.devp.bhandle : + object->handle; + object->un_pager.devp.ops->cdev_pg_dtor(h); VM_OBJECT_LOCK(object); =20 if (object->type =3D=3D OBJT_DEVICE) { diff --git a/sys/vm/vm_object.h b/sys/vm/vm_object.h index b584239..6a39e18 100644 --- a/sys/vm/vm_object.h +++ b/sys/vm/vm_object.h @@ -136,6 +136,7 @@ struct vm_object { struct { TAILQ_HEAD(, vm_page) devp_pglist; struct cdev_pager_ops *ops; + void *bhandle; } devp; =20 /* --7gGkHNMELEOhSGF6 Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iQIcBAEBAgAGBQJQ7KJ6AAoJEJDCuSvBvK1B6/MP/RCk//QYDfZ/YYmB2auL4xJc Sb2roVmbCoAs4rUnytaogkRZ/txEK/h1dk0E/5H3LG6bgM5xe0kUWNDK+r7VtIl1 0V8felbcOR4iQaw9m78F/iS1TCKX0ycT5YNyhUoeC0jbDzXK/uncbNUUm2BTp85e ugZXEJ7uzXO1CoNvWBg8oELdlDIPOFw22RMeAUxS72xOOdFMi3U6lACXLlC+6z3z 3jBN886VJRFCHiPHMM+kNJAZeFTrm3oqWm+7J/ANmqXzhpzLJ7lU6+lTxH4LhJFa jZEwzLGG0Mwghd6CCYn2OyNEdzw5JBq7AJyLq0OoW7pDZ/Vr69/Rre+bulBXPqSq srhKtlbyrF+TzJdzzrFsqF1QNiyDAuaBlqQ8WFDxmj9TxAenuLEXrwKtNrprYvdh k/BrBeRqaFZD+m9nK9Hlfr9OJDx0KyLxS6YiFYzyueaZ2hTvHkyb2YBESk6nzAB9 7TyrbQ2WgZFFozKS3YAobWXNXo6VnNMZF166A9WGo6g9fzrBMBIjEnD47RD/0k7R EZYq1iAqETePvC5wNcwy1EEGZLj5OLvp3+ZMCCfefiqxXA6gczcmhYQx87S6EtKK b1usG3mgWaFon5fb+378mzRiF2eqZ30knFbpT/qzOquChqN+Rq+cTPp4uilMWxX3 YiD3j26Wm5iLGHeOkZtD =6B2x -----END PGP SIGNATURE----- --7gGkHNMELEOhSGF6--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20130108224931.GC2561>