Date:      Sat, 20 Jun 2015 13:01:17 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Dmitry Sivachenko <trtrmitya@gmail.com>
Cc:        FreeBSD Stable ML <stable@freebsd.org>, alc@freebsd.org
Subject:   Re: panic: wm_page_unwire
Message-ID:  <20150620100116.GU2080@kib.kiev.ua>
In-Reply-To: <60FB4B9C-CC80-4269-8C94-F9DE3D98EE0D@gmail.com>
References:  <8436D969-5AF2-4189-A509-B44669906AEB@gmail.com> <60FB4B9C-CC80-4269-8C94-F9DE3D98EE0D@gmail.com>

On Sat, Jun 20, 2015 at 10:23:39AM +0300, Dmitry Sivachenko wrote:
> 
> > On 19 June 2015, at 22:57, Dmitry Sivachenko <trtrmitya@gmail.com> wrote:
> > 
> > Hello,
> > 
> > got this panic today on my 10.1-STABLE #0 r279956  box:
> > 
> > <Screen Shot 2015-06-19 at 22.52.57.png>
> 
> 
> Well, I tracked this down a bit.  Rather easy way to panic -stable box (mine is r279956), but I can't reliably reproduce this.
> 
> It happens when there is a process running which mmap()+mlock() some file, and while it is running this file is modified on disk
> (not rm+mv, but open the same file, truncate and write some other data into it).
> 
> After process exits, system will panic with high probability.
> 
> So far I got 2 cases:
> 
> 1) run process which mlock()'s a file;  modify that file;  stop process and system panics
> 2) run process which mlock()'s a file;  modify that file;  stop process [no panic so far];  modify that file again and system panics.
> 
> Panic message is the same: panic: vm_page_unwire: page <xxxx>'s wire count is zero

I was able to reproduce something related; this may very well be your
problem.  Take the attached program.  Select a scratch file on a UFS mount
point, say x.  Run the following commands:
mlock_modify x&
dd if=/dev/zero of=x bs=1 count=1
fg
^C <- system might panic at this point, if buffers are in short supply
dd if=/dev/zero of=x bs=1 count=1 <- at this point, the system must panic
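
The attachment is not reproduced in this archived copy.  Going by the
description above (a process that mmap()s and mlock()s a file and then
just sits there), a minimal stand-in might look like the sketch below.
The helper name map_shared() and the STANDALONE build guard are
assumptions made for illustration; this is not the original
mlock_modify source.

```c
/* Hypothetical stand-in for the attached mlock_modify program. */
#include <sys/mman.h>
#include <sys/stat.h>

#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Map the whole file shared and read/write.  Returns MAP_FAILED on
 * error (an empty file is treated as an error) and stores the mapping
 * length in *len on success.
 */
void *
map_shared(const char *path, size_t *len)
{
	struct stat st;
	void *p;
	int fd;

	assert(path != NULL && len != NULL);
	fd = open(path, O_RDWR);
	if (fd == -1)
		return (MAP_FAILED);
	if (fstat(fd, &st) == -1 || st.st_size == 0) {
		close(fd);
		return (MAP_FAILED);
	}
	*len = (size_t)st.st_size;
	p = mmap(NULL, *len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping keeps its own reference to the file */
	return (p);
}

#ifdef STANDALONE
int
main(int argc, char **argv)
{
	size_t len;
	void *p;

	if (argc != 2) {
		fprintf(stderr, "usage: %s file\n", argv[0]);
		return (1);
	}
	p = map_shared(argv[1], &len);
	if (p == MAP_FAILED) {
		fprintf(stderr, "cannot map %s\n", argv[1]);
		return (1);
	}
	if (mlock(p, len) == -1) {	/* wire the mapped pages */
		perror("mlock");
		return (1);
	}
	pause();	/* hold the wiring until a signal, e.g. ^C, arrives */
	return (0);
}
#endif
```

Built with something like "cc -DSTANDALONE -o mlock_modify file.c", it
can be used in the command sequence above.  The scratch file must be
non-empty before the program starts, and mlock() may fail if the memory
locking resource limit is too low.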

The issue apparently is the following:
we have a wired shared mapping backed by a vnode, and the vnode is
truncated, so that the mapped pages are removed from the vnode's object
[*]. But some other pages are then inserted into the object at the same
position to hold the newly written data. Then, when the region is unwired
during unmap, vm_object_unwire() blindly unwires whatever pages
belong to the object at the mapped range, without checking that those
pages are indeed the ones wired and mapped there.  Depending on whether
the buffer for the page still exists when the unwire is done, the panic
occurs at the first point marked above (the ^C) or at the second (the
second dd).

* In fact, the pages are not removed immediately by
vnode_pager_setsize()->vm_object_page_remove(), since the pages are
still wired by the buffer cache. But truncation does brelse(), which
removes the last wire count, and vfs_vmio_release() correctly frees the
pages.

I do not see any solution other than to let vm_object_unwire()
check whether the region still maps the page we found through the object's
backing chain walk.  This makes the vm_object_unwire() name and interface
somewhat strange for vm_object.c.
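
To illustrate the idea outside the kernel, here is a toy userland model
of the two behaviours.  All types and function names in it are made up
for the illustration.  Pointer identity stands in for the physical
address comparison done with pmap_extract() in the patch, and the
backing object chain, locking and the fate of the old page are ignored.

```c
/* Toy model only; none of these types or functions exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	NPAGES	4

struct page {
	int wire_count;
};

/* Object: page index -> page currently resident in the object. */
struct object {
	struct page *pages[NPAGES];
};

/* Mapping: page index -> page the mapping actually translates to. */
struct mapping {
	struct page *mapped[NPAGES];
};

/*
 * Old behaviour: unwire whatever the object holds at each index.
 * Returns false where the kernel would panic on a zero wire count.
 */
bool
unwire_blind(struct object *obj, size_t npages)
{
	bool ok = true;

	assert(npages <= NPAGES);
	for (size_t i = 0; i < npages; i++) {
		struct page *m = obj->pages[i];

		if (m == NULL)
			continue;
		if (m->wire_count == 0) {
			ok = false;	/* "wire count is zero" */
			continue;
		}
		m->wire_count--;
	}
	return (ok);
}

/*
 * Patched behaviour: skip pages that the mapping does not translate to,
 * since this mapping never wired them.
 */
bool
unwire_checked(const struct mapping *map, struct object *obj, size_t npages)
{
	bool ok = true;

	assert(npages <= NPAGES);
	for (size_t i = 0; i < npages; i++) {
		struct page *m = obj->pages[i];

		if (m == NULL || map->mapped[i] != m)
			continue;	/* not the page we wired */
		if (m->wire_count == 0) {
			ok = false;
			continue;
		}
		m->wire_count--;
	}
	return (ok);
}
```

In this model, if a page is wired through the mapping and the object
slot is then replaced by a fresh, unwired page (the truncate-and-rewrite
case), unwire_blind() reports the failure while unwire_checked() leaves
the new page alone.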

diff --git a/sys/vm/vm_map.c b/sys/vm/vm_map.c
index 12ebf5d..61ea796 100644
--- a/sys/vm/vm_map.c
+++ b/sys/vm/vm_map.c
@@ -2459,8 +2459,8 @@ vm_map_wire_entry_failure(vm_map_t map, vm_map_entry_t entry,
 	 */
 	if (failed_addr > entry->start) {
 		pmap_unwire(map->pmap, entry->start, failed_addr);
-		vm_object_unwire(entry->object.vm_object, entry->offset,
-		    failed_addr - entry->start, PQ_ACTIVE);
+		vm_object_unwire(map, entry, failed_addr - entry->start,
+		    PQ_ACTIVE);
 	}
 
 	/*
@@ -2839,8 +2839,7 @@ vm_map_entry_unwire(vm_map_t map, vm_map_entry_t entry)
 	KASSERT(entry->wired_count > 0,
 	    ("vm_map_entry_unwire: entry %p isn't wired", entry));
 	pmap_unwire(map->pmap, entry->start, entry->end);
-	vm_object_unwire(entry->object.vm_object, entry->offset, entry->end -
-	    entry->start, PQ_ACTIVE);
+	vm_object_unwire(map, entry, entry->end - entry->start, PQ_ACTIVE);
 	entry->wired_count = 0;
 }
 
diff --git a/sys/vm/vm_object.c b/sys/vm/vm_object.c
index c7f3153..79b233f 100644
--- a/sys/vm/vm_object.c
+++ b/sys/vm/vm_object.c
@@ -2220,18 +2220,23 @@ vm_object_set_writeable_dirty(vm_object_t object)
  *	wired.
  */
 void
-vm_object_unwire(vm_object_t object, vm_ooffset_t offset, vm_size_t length,
+vm_object_unwire(vm_map_t map, vm_map_entry_t entry, vm_size_t length,
     uint8_t queue)
 {
-	vm_object_t tobject;
+	vm_object_t object, tobject;
 	vm_page_t m, tm;
+	vm_ooffset_t offset;
 	vm_pindex_t end_pindex, pindex, tpindex;
 	int depth, locked_depth;
 
+	object = entry->object.vm_object;
+	offset = entry->offset;
 	KASSERT((offset & PAGE_MASK) == 0,
 	    ("vm_object_unwire: offset is not page aligned"));
 	KASSERT((length & PAGE_MASK) == 0,
 	    ("vm_object_unwire: length is not a multiple of PAGE_SIZE"));
+	KASSERT(length <= entry->end - entry->start,
+	    ("vm_object_unwire: length too large"));
 	/* The wired count of a fictitious page never changes. */
 	if ((object->flags & OBJ_FICTITIOUS) != 0)
 		return;
@@ -2254,9 +2259,8 @@ vm_object_unwire(vm_object_t object, vm_ooffset_t offset, vm_size_t length,
 				tpindex +=
 				    OFF_TO_IDX(tobject->backing_object_offset);
 				tobject = tobject->backing_object;
-				KASSERT(tobject != NULL,
-				    ("vm_object_unwire: missing page"));
-				if ((tobject->flags & OBJ_FICTITIOUS) != 0)
+				if (tobject == NULL ||
+				    (tobject->flags & OBJ_FICTITIOUS) != 0)
 					goto next_page;
 				depth++;
 				if (depth == locked_depth) {
@@ -2269,6 +2273,9 @@ vm_object_unwire(vm_object_t object, vm_ooffset_t offset, vm_size_t length,
 			tm = m;
 			m = TAILQ_NEXT(m, listq);
 		}
+		if (pmap_extract(map->pmap, entry->start +
+		    IDX_TO_OFF(pindex)) != VM_PAGE_TO_PHYS(tm))
+			goto next_page;
 		vm_page_lock(tm);
 		vm_page_unwire(tm, queue);
 		vm_page_unlock(tm);
diff --git a/sys/vm/vm_object.h b/sys/vm/vm_object.h
index 1f59156..9ac661d 100644
--- a/sys/vm/vm_object.h
+++ b/sys/vm/vm_object.h
@@ -320,7 +320,7 @@ void vm_object_shadow (vm_object_t *, vm_ooffset_t *, vm_size_t);
 void vm_object_split(vm_map_entry_t);
 boolean_t vm_object_sync(vm_object_t, vm_ooffset_t, vm_size_t, boolean_t,
     boolean_t);
-void vm_object_unwire(vm_object_t object, vm_ooffset_t offset,
+void vm_object_unwire(vm_map_t map, vm_map_entry_t entry,
     vm_size_t length, uint8_t queue);
 struct vnode *vm_object_vnode(vm_object_t object);
 #endif				/* _KERNEL */


