Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 8 Jan 2013 15:11:34 -0700
From:      "Kenneth D. Merry" <ken@FreeBSD.org>
To:        Konstantin Belousov <kib@FreeBSD.org>
Cc:        svn-src-head@FreeBSD.org, alc@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org
Subject:   Re: svn commit: r227530 - head/sys/vm
Message-ID:  <20130108221134.GA75088@nargothrond.kdm.org>
In-Reply-To: <201111151440.pAFEe0xJ004636@svn.freebsd.org>
References:  <201111151440.pAFEe0xJ004636@svn.freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help

--5mCyUwZo2JvN/JJP
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Tue, Nov 15, 2011 at 14:40:00 +0000, Konstantin Belousov wrote:
> Author: kib
> Date: Tue Nov 15 14:40:00 2011
> New Revision: 227530
> URL: http://svn.freebsd.org/changeset/base/227530
> 
> Log:
>   Update the device pager interface, while keeping the compatibility
>   layer for old KPI and KBI.  New interface should be used together with
>   d_mmap_single cdevsw method.
>   
>   Device pager can be allocated with the cdev_pager_allocate(9)
>   function, which takes struct cdev_pager_ops, containing
>   constructor/destructor and page fault handler methods supplied by
>   driver.
>   
>   Constructor and destructor, called at the pager allocation and
>   deallocation time, allow the driver to handle per-object private data.
>   
>   The pager handler is called to handle page fault on the vm map entry
>   backed by the driver pager. Driver shall return either the vm_page_t
>   which should be mapped, or error code (which does not cause kernel
>   panic anymore). The page handler interface has a placeholder to
>   specify the access mode causing the fault, but currently PROT_READ is
>   always passed there.
>   
>   Sponsored by:	The FreeBSD Foundation
>   Reviewed by:	alc
>   MFC after:	1 month

This is an old commit, but I believe there is a bug introduced by this
commit that triggered the following panic:

panic: dev_rel((null)) gave negative count
cpuid = 5
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff803639da = db_trace_self_wrapper+0x2a
kdb_backtrace() at 0xffffffff8090e677 = kdb_backtrace+0x37
panic() at 0xffffffff808d6bd8 = panic+0x1d8
dev_rel() at 0xffffffff8088f929 = dev_rel+0xc9
dev_pager_dealloc() at 0xffffffff80b2be30 = dev_pager_dealloc+0x30
vm_object_terminate() at 0xffffffff80b44d7d = vm_object_terminate+0x13d
vm_object_deallocate() at 0xffffffff80b465ca = vm_object_deallocate+0x17a
cdev_pager_allocate() at 0xffffffff80b2c5e6 = cdev_pager_allocate+0x236
dev_pager_alloc() at 0xffffffff80b2c697 = dev_pager_alloc+0x27
vm_mmap_cdev() at 0xffffffff80b41b86 = vm_mmap_cdev+0x116
vm_mmap() at 0xffffffff80b42221 = vm_mmap+0x671
sys_mmap() at 0xffffffff80b428c6 = sys_mmap+0x1d6
amd64_syscall() at 0xffffffff80bd1ed8 = amd64_syscall+0x318
Xfast_syscall() at 0xffffffff80bbd387 = Xfast_syscall+0xf7
--- syscall (477, FreeBSD ELF64, sys_mmap), rip = 0x8008ae25c, rsp =
0x7fffffffd968, rbp = 0 ---
KDB: enter: panic
[ thread pid 4109 tid 102348 ]
Stopped at      0xffffffff8090e33b = kdb_enter+0x3b:    movq $0,0xb433e2(%rip)

This is a stable/9 system, but the code is the same in head.

I triggered it by running dmidecode (perhaps a few times) on a machine with
INVARIANTS enabled.  You can probably trigger it before too long with
something like this:

((i=0)); while [ $i -lt 50 ]; do dmidecode > /dev/null & ((i++)); done

Run that loop until the system panics.

The problem is that dev_pager_dealloc() passes in the VM object handle to
the cdev_pg_dtor() method (in this case old_dev_pager_dtor()), which
assumes that it is a struct cdev.

But there is a case in cdev_pager_allocate() where if a race condition is
hit, the object handle is set to the object itself.  In that case, we wind
up triggering an assertion in dev_rel(), because it is not operating on a
struct cdev, but rather a VM object.

I've attached my local commit and diffs.  This does work, but you may
have a better solution.

Thanks,

Ken
-- 
Kenneth Merry
ken@FreeBSD.ORG

--5mCyUwZo2JvN/JJP
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="device_pager_dtor.20130108.1.txt"

Change 649772 by kenm@ken.spectrabsd9 on 2013/01/08 14:54:45

	Fix a bug in the device pager code that can trigger an assertion
	in devfs if a particular race condition is hit in the device pager
	code.
	
	This fixes BUG25532.
	
	This was a side effect of FreeBSD SVN change 227530 (our Perforce
	change 513754), which changed the device pager interface to call
	a new destructor routine for the cdev.  That destructor routine,
	old_dev_pager_dtor(), takes a VM object handle.
	
	The object handle is cast to a struct cdev *, and passed into
	dev_rel().
	
	That works in most cases, except the case in cdev_pager_allocate()
	where there is a race condition between two threads allocating an
	object backed by the same device.  The loser of the race
	deallocates its object at the end of the function.
	
	The problem is that before inserting the object into the
	dev_pager_object_list, the object's handle is changed from the
	struct cdev pointer to the object's own address.  This is to avoid
	conflicts with the winner of the race, which already inserted an
	object in the list with a handle that is a pointer to the same cdev
	structure.
	
	The object is then passed to vm_object_deallocate(), and eventually
	makes its way down to old_dev_pager_dtor().  That function passes
	the handle pointer (which is actually a VM object, not a struct
	cdev as usual) into dev_rel().  dev_rel() decrements the reference
	count in the assumed struct cdev (which happens to be 0), and
	that triggers the assertion in dev_rel() that the reference count
	is greater than or equal to 0.
	
	The fix is to add a cdev pointer to the VM object, and use that
	pointer when calling the cdev_pg_dtor() routine.  There may be
	other, better, ways to fix this, so I'll ask for a review by the
	appropriate FreeBSD VM maintainers.
	
	vm_object.h:	Add a struct cdev pointer to the VM object
			structure.
	
	device_pager.c:	In cdev_pager_allocate(), populate the new cdev
			pointer.
	
			In dev_pager_dealloc(), use the new cdev pointer
			when calling the object's cdev_pg_dtor() routine.
	
	TeamTrack:	BUG25532

Affected files ...

... //SpectraBSD/stable/sys/vm/device_pager.c#3 edit
... //SpectraBSD/stable/sys/vm/vm_object.h#3 edit

Differences ...

==== //SpectraBSD/stable/sys/vm/device_pager.c#3 (text) ====

@@ -156,6 +156,7 @@
 		object1->pg_color = color;
 		object1->handle = handle;
 		object1->un_pager.devp.ops = ops;
+		object1->un_pager.devp.dev = handle;
 		TAILQ_INIT(&object1->un_pager.devp.devp_pglist);
 		mtx_lock(&dev_pager_mtx);
 		object = vm_pager_object_lookup(&dev_pager_object_list, handle);
@@ -233,7 +234,7 @@
 	vm_page_t m;
 
 	VM_OBJECT_UNLOCK(object);
-	object->un_pager.devp.ops->cdev_pg_dtor(object->handle);
+	object->un_pager.devp.ops->cdev_pg_dtor(object->un_pager.devp.dev);
 
 	mtx_lock(&dev_pager_mtx);
 	TAILQ_REMOVE(&dev_pager_object_list, object, pager_object_list);

==== //SpectraBSD/stable/sys/vm/vm_object.h#3 (text) ====

@@ -123,6 +123,7 @@
 		struct {
 			TAILQ_HEAD(, vm_page) devp_pglist;
 			struct cdev_pager_ops *ops;
+			struct cdev *dev;
 		} devp;
 
 		/*

--5mCyUwZo2JvN/JJP--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20130108221134.GA75088>