Date: Sun, 16 Aug 1998 01:13:33 +0000 (GMT)
From: Terry Lambert <tlambert@primenet.com>
To: dg@root.com
Cc: current@FreeBSD.ORG, karl@mcs.net
Subject: Better VM patches (was Tentative fix for VM bug)
Message-ID: <199808160113.SAA17303@usr07.primenet.com>
In-Reply-To: <199808152349.QAA20449@implode.root.com> from "David Greenman" at Aug 15, 98 04:49:26 pm
> >yes, but is it a bug that he's reporting?
>
>    I don't see anything wrong with vm_object_page_remove(). We'd have
> far more serious problems if there was something wrong with it (the system
> wouldn't last for more than a few seconds, if that), and further, the
> problem wouldn't be specific to just NFS. My thinking at the moment is that
> the problem is caused by the lack of NFSnode locking; we depend on VOP_LOCK
> actually doing something in many parts of the code, and with nfs_lock()
> being a noop, it just doesn't happen. There might be a way to fix this
> without messing with nfs_lock(), but the exact failure scenario will need
> to be looked at carefully before this can be determined.

Let me prefix this by saying that your NFS fixes are correct, and that my
"backup patch" apparently only masks the EOF bug that results from bogus
invalidations; there is a better fix.

The problem with vm_object_page_remove() is that even in the clean_only != 0
case, the page is marked invalid, which results in orphaned pages.  This is
not the problem we have been seeing, because those pages would simply be
lost.  It may, however, be the source of the

 * XXX: sometimes we get pages which aren't wired down or on any queue -
 * we need to put them on the inactive queue also, otherwise we lose
 * track of them. Paul Mackerras (paulus@cs.anu.edu.au) 9-Jan-93.

comment in vm_page.c, and it may be the source of the zero page phenomenon.
John provided the fix, but inverted the clearing of valid and the
vm_page_protect() call; I corrected this.

Another problem, in the mmap code, is that when the vnode_pager object is
created, its length is set to the number of pages times the page size; the
setsize on the object in vnode_pager then sets the length to the actual file
length.  This discrepancy opens a window in which some changes may be lost.
I have narrowed this window, but it's still there; short of adding an actual
size parameter to vnode_pager_object_allocate, I couldn't fully close it off.
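To make the boundary condition concrete, here is a user-level illustration
(not part of the patches; the file name and sizes are arbitrary) of the
partially backed last page: reads past EOF in that page come back as zeroes,
and stores there are never written to the file, so everything hinges on the
pager knowing the true EOF rather than the page-rounded length.

/*
 * User-level illustration only, not part of the patches; the file
 * name and sizes are arbitrary.  The last page of the mapping is
 * only partially backed by the file: reads past EOF in that page
 * return zeroes, and stores there never reach the file, which is
 * why the pager needs the true EOF and not the page-rounded length.
 */
#include <sys/types.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main()
{
	int fd;
	char *p;
	long pgsz = getpagesize();
	off_t filesize = pgsz + pgsz / 2;	/* one and a half pages */
	size_t maplen = 2 * pgsz;		/* rounded up to whole pages */
	off_t probe = filesize + 10;		/* past EOF, same page */

	fd = open("/tmp/partial.page", O_RDWR|O_CREAT|O_TRUNC, 0644);
	if (fd < 0 || ftruncate(fd, filesize) < 0)
		return 1;
	p = mmap(NULL, maplen, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		return 1;
	printf("byte at offset %ld (past EOF, same page): %d\n",
	    (long)probe, p[probe]);
	p[probe] = 1;	/* dirties the page, but is never written back */
	munmap(p, maplen);
	close(fd);
	return 0;
}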
I would appreciate it if people with the mmap'ed news server problems and
the zeroed page problems would try these patches.
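For people who want a quick smoke test to go with them, the following
untested sketch (the path is only an example; point it at a file on the NFS
mount that gives you trouble) dirties every byte of a shared mapping with a
non-zero pattern, msync()s it, and then re-reads the file looking for bytes
that did not make it back:

/*
 * Untested sketch of a smoke test; point TESTFILE at a file on an
 * NFS mount.  It dirties every byte of a shared mapping with a
 * non-zero pattern, msync()s, and then re-reads the file looking
 * for bytes that came back as something else (e.g. zero).
 */
#include <sys/types.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define TESTFILE "/nfs/tmp/mmap.test"	/* example path only */

int
main()
{
	int fd, i, len;
	char *p, *buf;

	len = 3 * getpagesize() + 123;	/* deliberately not page aligned */

	if ((fd = open(TESTFILE, O_RDWR|O_CREAT|O_TRUNC, 0644)) < 0) {
		perror("open");
		exit(1);
	}
	if (ftruncate(fd, len) < 0) {
		perror("ftruncate");
		exit(1);
	}
	p = mmap(NULL, len, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		exit(1);
	}
	for (i = 0; i < len; i++)		/* dirty every byte */
		p[i] = 0x5a;
	if (msync(p, len, MS_SYNC) < 0)
		perror("msync");
	munmap(p, len);

	/* read the file back through read(2) and compare */
	if ((buf = malloc(len)) == NULL)
		exit(1);
	lseek(fd, (off_t)0, SEEK_SET);
	if (read(fd, buf, len) != len)
		perror("read");
	for (i = 0; i < len; i++)
		if (buf[i] != 0x5a)
			break;
	if (i < len)
		printf("mismatch (0x%x) at offset %d\n", buf[i] & 0xff, i);
	else
		printf("contents look ok\n");
	close(fd);
	return 0;
}

Run it against the problem NFS mount before and after applying the patches
and see whether it ever reports a mismatch.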
John also suggested another change, for detecting a case where page orphans
can be created.  In vm_page.c, it is possible for an insertion in
vm_page_insert() to overwrite an existing entry; I added a DIAGNOSTIC panic
to catch this when it happens.

Finally, there were three cases in which a lock was being acquired but not
being logged by MAP_LOCK_DIAGNOSTIC.  I've added a vm_map_lock_try() macro
calling a _vm_map_lock_try() inline, similar to the others in vm_map.h, to
track this.  There are now no cases of lockmgr() being called directly while
the lock is released by a macro encapsulation.

Patches (against Whistle's working copy of -current) follow my signature....


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
=============================================================================
*** vm_fault.c	Mon May  4 03:01:43 1998
--- vm_fault.c.new	Sat Aug 15 22:45:59 1998
***************
*** 590,597 ****
  		 */
  		(fs.lookup_still_valid ||
  			(((fs.entry->eflags & MAP_ENTRY_IS_A_MAP) == 0) &&
! 			lockmgr(&fs.map->lock,
! 				LK_EXCLUSIVE|LK_NOWAIT, (void *)0, curproc) == 0))) {
  		fs.lookup_still_valid = 1;
  
  		/*
--- 590,596 ----
  		 */
  		(fs.lookup_still_valid ||
  			(((fs.entry->eflags & MAP_ENTRY_IS_A_MAP) == 0) &&
! 			vm_map_lock_try(fs.map) == 0))) {
  		fs.lookup_still_valid = 1;
  
  		/*
*** vm_glue.c	Wed Mar  4 10:27:00 1998
--- vm_glue.c.new	Sat Aug 15 22:47:13 1998
***************
*** 452,460 ****
  		 * do not swapout a process that is waiting for VM
  		 * data structures there is a possible deadlock.
  		 */
! 		if (lockmgr(&vm->vm_map.lock,
! 				LK_EXCLUSIVE | LK_NOWAIT,
! 				(void *)0, curproc)) {
  			vmspace_free(vm);
  			continue;
  		}
--- 452,459 ----
  		 * do not swapout a process that is waiting for VM
  		 * data structures there is a possible deadlock.
  		 */
! 		if (vm_map_lock_try(&vm->vm_map)) {
! 			/* failed to obtain lock*/
  			vmspace_free(vm);
  			continue;
  		}
*** vm_map.h	Thu Jan 22 17:30:38 1998
--- vm_map.h.new	Sat Aug 15 23:22:14 1998
***************
*** 242,247 ****
--- 242,261 ----
  #endif
  
  static __inline__ int
+ _vm_map_lock_try(vm_map_t map, struct proc *p) {
+ 	int rv;
+ 
+ 	rv = lockmgr(&(map)->lock, LK_EXCLUSIVE | LK_NOWAIT, (void *)0, p);
+ #if defined(MAP_LOCK_DIAGNOSTIC)
+ 	if( !rv)
+ 		printf("locking map LK_EXCLUSIVE: 0x%x\n", map);
+ #endif
+ 	return rv;
+ }
+ 
+ #define vm_map_lock_try(map)	_vm_map_lock_try(map, curproc)
+ 
+ static __inline__ int
  _vm_map_lock_upgrade(vm_map_t map, struct proc *p) {
  #if defined(MAP_LOCK_DIAGNOSTIC)
  	printf("locking map LK_EXCLUPGRADE: 0x%x\n", map);
*** vm_mmap.c	Sun Aug 16 00:09:32 1998
--- vm_mmap.c.new	Sun Aug 16 00:33:06 1998
***************
*** 929,934 ****
--- 929,935 ----
  	vm_ooffset_t objsize;
  	int docow;
  	struct proc *p = curproc;
+ 	struct vattr vat;
  
  	if (size == 0)
  		return (0);
***************
*** 972,978 ****
  		type = OBJT_DEVICE;
  		handle = (void *)vp->v_rdev;
  	} else {
- 		struct vattr vat;
  		int error;
  
  		error = VOP_GETATTR(vp, &vat, p->p_ucred, p);
--- 973,978 ----
***************
*** 990,995 ****
--- 990,1002 ----
  	    handle, OFF_TO_IDX(objsize), prot, foff);
  	if (object == NULL)
  		return (type == OBJT_DEVICE ? EINVAL : ENOMEM);
+ 	/*
+ 	 * XXX we are in a race here.  If the file is extended
+ 	 * between the time we get the object and the time
+ 	 * we set the size, then we lose...
+ 	 */
+ 	if (!(flags & MAP_ANON) && vp->v_type != VCHR)
+ 		object->un_pager.vnp.vnp_size = vat.va_size;
  	}
  
  	/*
*** vm_page.c	Sat Aug 15 20:26:27 1998
--- vm_page.c.new	Sat Aug 15 22:10:23 1998
***************
*** 388,393 ****
--- 388,400 ----
  	register vm_pindex_t pindex;
  {
  	register struct pglist *bucket;
+ #ifdef DIAGNOSTIC
+ 	register vm_page_t dm;
+ 
+ 	dm = vm_page_lookup(object, pindex);
+ 	if (dm)
+ 		panic("vm_page_insert: insertion overwrites existing entry");
+ #endif
  
  #if !defined(MAX_PERF)
  	if (m->flags & PG_TABLED)
***************
*** 1071,1076 ****
--- 1078,1085 ----
  
  /*
   * helper routine for vm_page_free and vm_page_free_zero
+  *
+  * must be at splhigh
   */
  static int
  vm_page_freechk_and_unqueue(m)
*** vm_pageout.c	Mon Mar 30 09:56:58 1998
--- vm_pageout.c.new	Sat Aug 15 22:45:37 1998
***************
*** 530,536 ****
  	vm_map_entry_t tmpe;
  	vm_object_t obj, bigobj;
  
! 	if (lockmgr(&map->lock, LK_EXCLUSIVE | LK_NOWAIT, (void *)0, curproc)) {
  		return;
  	}
  
--- 530,537 ----
  	vm_map_entry_t tmpe;
  	vm_object_t obj, bigobj;
  
! 	if (vm_map_lock_try(map)) {
! 		/* failed to obtain lock */
  		return;
  	}
  
*** vnode_pager.c	Mon Mar 16 01:56:03 1998
--- vnode_pager.c.new	Sun Aug 16 00:05:12 1998
***************
*** 137,142 ****
--- 137,147 ----
  	object = vm_object_allocate(OBJT_VNODE, size);
  	object->flags = 0;
  
+ 	/*
+ 	 * The object is allocated to a page boundary.  This is
+ 	 * incorrect if the page is only partially backed by the vp;
+ 	 * we must fix this up in the caller.
+ 	 */
  	object->un_pager.vnp.vnp_size = (vm_ooffset_t) size * PAGE_SIZE;
  
  	object->handle = handle;
*** vm_object.c	Sun Aug 16 00:55:33 1998
--- vm_object.c.new	Sun Aug 16 00:54:15 1998
***************
*** 1324,1330 ****
  		if (all || ((start <= p->pindex) && (p->pindex < end))) {
  			if (p->wire_count != 0) {
  				vm_page_protect(p, VM_PROT_NONE);
! 				p->valid = 0;
  				continue;
  			}
  
--- 1324,1331 ----
  		if (all || ((start <= p->pindex) && (p->pindex < end))) {
  			if (p->wire_count != 0) {
  				vm_page_protect(p, VM_PROT_NONE);
! 				if (!clean_only)
! 					p->valid = 0;
  				continue;
  			}
  
***************
*** 1352,1359 ****
  
  		if ((p = vm_page_lookup(object, start)) != 0) {
  			if (p->wire_count != 0) {
- 				p->valid = 0;
  				vm_page_protect(p, VM_PROT_NONE);
  				start += 1;
  				size -= 1;
  				continue;
--- 1353,1361 ----
  
  		if ((p = vm_page_lookup(object, start)) != 0) {
  			if (p->wire_count != 0) {
  				vm_page_protect(p, VM_PROT_NONE);
+ 				if (!clean_only)
+ 					p->valid = 0;
  				start += 1;
  				size -= 1;
  				continue;
=============================================================================

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message