Date: Tue, 8 Jan 2013 15:11:34 -0700 From: "Kenneth D. Merry" <ken@FreeBSD.org> To: Konstantin Belousov <kib@FreeBSD.org> Cc: svn-src-head@FreeBSD.org, alc@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org Subject: Re: svn commit: r227530 - head/sys/vm Message-ID: <20130108221134.GA75088@nargothrond.kdm.org> In-Reply-To: <201111151440.pAFEe0xJ004636@svn.freebsd.org> References: <201111151440.pAFEe0xJ004636@svn.freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
--5mCyUwZo2JvN/JJP Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Tue, Nov 15, 2011 at 14:40:00 +0000, Konstantin Belousov wrote: > Author: kib > Date: Tue Nov 15 14:40:00 2011 > New Revision: 227530 > URL: http://svn.freebsd.org/changeset/base/227530 > > Log: > Update the device pager interface, while keeping the compatibility > layer for old KPI and KBI. New interface should be used together with > d_mmap_single cdevsw method. > > Device pager can be allocated with the cdev_pager_allocate(9) > function, which takes struct cdev_pager_ops, containing > constructor/destructor and page fault handler methods supplied by > driver. > > Constructor and destructor, called at the pager allocation and > deallocation time, allow the driver to handle per-object private data. > > The pager handler is called to handle page fault on the vm map entry > backed by the driver pager. Driver shall return either the vm_page_t > which should be mapped, or error code (which does not cause kernel > panic anymore). The page handler interface has a placeholder to > specify the access mode causing the fault, but currently PROT_READ is > always passed there. > > Sponsored by: The FreeBSD Foundation > Reviewed by: alc > MFC after: 1 month This is an old commit, but I believe there is a bug introduced by this commit that triggered the following panic: panic: dev_rel((null)) gave negative count cpuid = 5 KDB: stack backtrace: db_trace_self_wrapper() at 0xffffffff803639da = db_trace_self_wrapper+0x2a kdb_backtrace() at 0xffffffff8090e677 = kdb_backtrace+0x37 panic() at 0xffffffff808d6bd8 = panic+0x1d8 dev_rel() at 0xffffffff8088f929 = dev_rel+0xc9 dev_pager_dealloc() at 0xffffffff80b2be30 = dev_pager_dealloc+0x30 vm_object_terminate() at 0xffffffff80b44d7d = vm_object_terminate+0x13d vm_object_deallocate() at 0xffffffff80b465ca = vm_object_deallocate+0x17a cdev_pager_allocate() at 0xffffffff80b2c5e6 = cdev_pager_allocate+0x236 dev_pager_alloc() at 0xffffffff80b2c697 = dev_pager_alloc+0x27 vm_mmap_cdev() at 0xffffffff80b41b86 = vm_mmap_cdev+0x116 vm_mmap() at 0xffffffff80b42221 = vm_mmap+0x671 sys_mmap() at 0xffffffff80b428c6 = sys_mmap+0x1d6 amd64_syscall() at 0xffffffff80bd1ed8 = amd64_syscall+0x318 Xfast_syscall() at 0xffffffff80bbd387 = Xfast_syscall+0xf7 --- syscall (477, FreeBSD ELF64, sys_mmap), rip = 0x8008ae25c, rsp = 0x7fffffffd968, rbp = 0 --- KDB: enter: panic [ thread pid 4109 tid 102348 ] Stopped at 0xffffffff8090e33b = kdb_enter+0x3b: movq $0,0xb433e2(%rip) This is a stable/9 system, but the code is the same in head. I triggered it by running dmidecode (perhaps a few times) on a machine with INVARIANTS enabled. You can probably trigger it before too long with something like this: ((i=0)); while [ $i -lt 50 ]; do dmidecode > /dev/null & ((i++)); done Run that loop until the system panics. The problem is that dev_pager_dealloc() passes in the VM object handle to the cdev_pg_dtor() method (in this case old_dev_pager_dtor()), which assumes that it is a struct cdev. But there is a case in cdev_pager_allocate() where if a race condition is hit, the object handle is set to the object itself. In that case, we wind up triggering an assertion in dev_rel(), because it is not operating on a struct cdev, but rather a VM object. I've attached my local commit and diffs. This does work, but you may have a better solution. Thanks, Ken -- Kenneth Merry ken@FreeBSD.ORG --5mCyUwZo2JvN/JJP Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="device_pager_dtor.20130108.1.txt" Change 649772 by kenm@ken.spectrabsd9 on 2013/01/08 14:54:45 Fix a bug in the device pager code that can trigger an assertion in devfs if a particular race condition is hit in the device pager code. This fixes BUG25532. This was a side effect of FreeBSD SVN change 227530 (our Perforce change 513754), which changed the device pager interface to call a new destructor routine for the cdev. That destructor routine, old_dev_pager_dtor(), takes a VM object handle. The object handle is cast to a struct cdev *, and passed into dev_rel(). That works in most cases, except the case in cdev_pager_allocate() where there is a race condition between two threads allocating an object backed by the same device. The loser of the race deallocates its object at the end of the function. The problem is that before inserting the object into the dev_pager_object_list, the object's handle is changed from the struct cdev pointer to the object's own address. This is to avoid conflicts with the winner of the race, which already inserted an object in the list with a handle that is a pointer to the same cdev structure. The object is then passed to vm_object_deallocate(), and eventually makes its way down to old_dev_pager_dtor(). That function passes the handle pointer (which is actually a VM object, not a struct cdev as usual) into dev_rel(). dev_rel() decrements the reference count in the assumed struct cdev (which happens to be 0), and that triggers the assertion in dev_rel() that the reference count is greater than or equal to 0. The fix is to add a cdev pointer to the VM object, and use that pointer when calling the cdev_pg_dtor() routine. There may be other, better, ways to fix this, so I'll ask for a review by the appropriate FreeBSD VM maintainers. vm_object.h: Add a struct cdev pointer to the VM object structure. device_pager.c: In cdev_pager_allocate(), populate the new cdev pointer. In dev_pager_dealloc(), use the new cdev pointer when calling the object's cdev_pg_dtor() routine. TeamTrack: BUG25532 Affected files ... ... //SpectraBSD/stable/sys/vm/device_pager.c#3 edit ... //SpectraBSD/stable/sys/vm/vm_object.h#3 edit Differences ... ==== //SpectraBSD/stable/sys/vm/device_pager.c#3 (text) ==== @@ -156,6 +156,7 @@ object1->pg_color = color; object1->handle = handle; object1->un_pager.devp.ops = ops; + object1->un_pager.devp.dev = handle; TAILQ_INIT(&object1->un_pager.devp.devp_pglist); mtx_lock(&dev_pager_mtx); object = vm_pager_object_lookup(&dev_pager_object_list, handle); @@ -233,7 +234,7 @@ vm_page_t m; VM_OBJECT_UNLOCK(object); - object->un_pager.devp.ops->cdev_pg_dtor(object->handle); + object->un_pager.devp.ops->cdev_pg_dtor(object->un_pager.devp.dev); mtx_lock(&dev_pager_mtx); TAILQ_REMOVE(&dev_pager_object_list, object, pager_object_list); ==== //SpectraBSD/stable/sys/vm/vm_object.h#3 (text) ==== @@ -123,6 +123,7 @@ struct { TAILQ_HEAD(, vm_page) devp_pglist; struct cdev_pager_ops *ops; + struct cdev *dev; } devp; /* --5mCyUwZo2JvN/JJP--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20130108221134.GA75088>