Date: Wed, 26 Sep 2001 20:43:33 -0500 From: Douglas Swarin <doug@staff.texas.net> To: hackers@freebsd.org Cc: dillon@earth.backplane.com, tmoestl@gmx.net, bp@freebsd.org Subject: Re: more on Re: Please review: bugfix for vinvalbuf() Message-ID: <20010926204333.A15865@staff.texas.net> In-Reply-To: <200107110643.f6B6hTB24707@earth.backplane.com>; from dillon@earth.backplane.com on Tue, Jul 10, 2001 at 11:43:29PM -0700 References: <20010711003926.B8799@crow.dom2ip.de> <200107110643.f6B6hTB24707@earth.backplane.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, Jul 10, 2001 at 11:43:29PM -0700, Matt Dillon wrote: > > :Hi, > : > :I've just tripped over an obviously long-standing (since about > :Jan. 1998) bug in vinvalbuf while looking into PR kern/26224. The > :problematic code looks like (on -CURRENT): > : > : /* > : * Destroy the copy in the VM cache, too. > : */ > : mtx_lock(&vp->v_interlock); > : if (VOP_GETVOBJECT(vp, &object) == 0) { > : vm_object_page_remove(object, 0, 0, > : (flags & V_SAVE) ? TRUE : FALSE); > : } > : mtx_unlock(&vp->v_interlock); > : > :The locks seem to be needed for file systems that don't perform real > :locking (on -STABLE, they are simplelocks). > :This, however, is incorrect because vm_object_page_remove may sleep. > :I've attached a patch that passes the interlock to > :vm_object_page_remove, which in turn passes it to a modified version > :of vm_page_sleep, which unlocks it around the sleep. > :I think that this is correct, because the object should be in a valid > :state when we sleep (and all checks are reexecuted in that case). > : > :Since I'm not very experienced with vfs and vm stuff, I'd be glad if > :this patch could get some review. In particular, is the lock/unlock > :pair really still needed, and are there notable differeces in -STABLE > :(because the fix would need the be MFC'ed)? > : > :Thanks, > : - thomas > > Ok, I've looked at this more. What is supposed to happen is that > the 'while (vp->v_numoutput) ...' code just above the section you > cited is supposed to prevent the deadlock. The while code looks like > this: > > while (vp->v_numoutput > 0) { > vp->v_flag |= VBWAIT; > tsleep(&vp->v_numoutput, PVM, "vnvlbv", 0); > } > > However, as Ian points out in his followup, it doesn't work: > > :... > :I've seen a related deadlock here in 4.x with NFS; vm_fault locks > :a page and then calls vput which aquires the v_interlock. This code > :in vinvalbuf locks the v_interlock first, and then vm_object_page_remove() > :locks the page. That's a simple lock-order reversal deadlock which I > :guess would still exist even with this patch. > > It doesn't work for the simple reason that vm_fault isn't busying the > page for writing, it's busying it for reading, so v_numoutput will be > 0. > > I think the best solution is to change vinvalbuf's wait loop to > wait on vm_object->paging_in_progress instead of vp->v_numoutput, > or perhaps to wait for both conditions to be satisfied. > > vm_object->paging_in_progress applies to reads and writes, while > vp->v_numoutput only applies to writes. > > I know this isn't the most correct solution... there is still the > lock reversal if vm_object_remove_pages() were ever to sleep. The > idea is that it won't sleep if there is no paging in progress because > nobody will have any of the object's pages locked. I think it is > the best we can do for the moment. There are several places in > vm/vm_object.c where the same assumption is made (testing against > vm_object->paging_in_progress, though, which works, not vp->v_numoutput). > > - > > Thomas, if you are interested this could be a little project for you. > See if you can code-up another while() loop to also wait for > the object's paging_in_progress count to become 0 in the vinvalbuf() > code. Look in vm/vm_object.c for examples of how to construct such > a loop. I will be happy to review and test whatever you come up with > and I'm sure Ian will too! > > -Matt > > I recently mentioned on freebsd-stable in message <20010923124824.A16392@staff.texas.net> a recurring rslock panic which I believe has been caused by the below mentioned problem in vinvalbuf(). I have worked up a patch for -STABLE (which should also apply to -CURRENT if there have not been major changes to this code). The patch is appended to this message for review. Data from the crash dump revealed the following: SMP 2 cpus IdlePTD 3555328 initial pcb at 2cf280 panicstr: rslock: cpu: 0, addr: 0xd7be66ec, lock: 0x00000001 panic messages: --- panic: rslock: cpu: 0, addr: 0xd7be66ec, lock: 0x00000001 mp_lock = 00000001; cpuid = 0; lapic.id = 01000000 boot() called on cpu#0 --- #0 dumpsys () at /usr/src/sys/kern/kern_shutdown.c:473 #1 0xc016cf8f in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:313 #2 0xc016d3a9 in panic (fmt=0xc025bcce "rslock: cpu: %d, addr: 0x%08x, lock: 0x%08x") at /usr/src/sys/kern/kern_shutdown.c:581 #3 0xc025bcce in bsl1 () #4 0xc021eeac in _unlock_things (fs=0xd7a6dec4, dealloc=0) at /usr/src/sys/vm/vm_fault.c:147 #5 0xc021f8c7 in vm_fault (map=0xd7a6bf40, vaddr=673968128, fault_type=1 '\001', fault_flags=0) at /usr/src/sys/vm/vm_fault.c:826 #6 0xc025d016 in trap_pfault (frame=0xd7a6dfa8, usermode=1, eva=673972223) at /usr/src/sys/i386/i386/trap.c:829 #7 0xc025ca8b in trap (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 99049, tf_esi = 0, tf_ebp = -1077937884, tf_isp = -676929580, tf_ebx = 48110729, tf_edx = 0, tf_ecx = 1835007, tf_eax = 672137216, tf_trapno = 12, tf_err = 4, tf_eip = 134517190, tf_cs = 31, tf_eflags = 66070, tf_esp = -1077937940, tf_ss = 47}) at /usr/src/sys/i386/i386/trap.c:359 #8 0x80491c6 in ?? () #9 0x8048d9e in ?? () #10 0x804a078 in ?? () #11 0x8048b45 in ?? () --- A quick review of processes revealed a process stuck in vmopar: (kgdb) ps ... 46479 d7ffc560 d806e000 235886 1 46394 004006 3 tail vmopar c09120c8 ... which was sleeping in vm_object_page_remove() in vinvalbuf(): (kgdb) btp 46479 frame 0 at 0xd806fc8c: ebp d806fcb8, eip 0xc0170051 <tsleep+429>: mov 0x141(%ebx),%al frame 1 at 0xd806fcb8: ebp d806fce0, eip 0xc02262e8 <vm_object_page_remove+268>: add $0x10,%esp frame 2 at 0xd806fce0: ebp d806fd2c, eip 0xc019a667 <vinvalbuf+803>: add $0x10,%esp frame 3 at 0xd806fd2c: ebp d806fd60, eip 0xc01d0aa0 <nfs_vinvalbuf+264>: add $0x18,%esp frame 4 at 0xd806fd60: ebp d806fe2c, eip 0xc01cf5d8 <nfs_bioread+540>: mov %eax,0xffffff74(%ebp) frame 5 at 0xd806fe2c: ebp d806fe44, eip 0xc01f6842 <nfs_read+30>: jmp 0xc01f6849 <nfs_read+37> frame 6 at 0xd806fe44: ebp d806fe78, eip 0xc01a22cc <vn_read+280>: mov %eax,0xffffffe8(%ebp) frame 7 at 0xd806fe78: ebp d806fef4, eip 0xc017b690 <dofileread+176>: mov %eax,%esi frame 8 at 0xd806fef4: ebp d806ff28, eip 0xc017b556 <read+54>: mov %eax,%esi frame 9 at 0xd806ff28: ebp d806ffa0, eip 0xc025d745 <syscall2+537>: mov %eax,0xffffffb8(%ebp) The patch is below, against vfs_subr.c 1.249.2.11 2001/09/11 --- vfs_subr.c Tue Sep 11 04:49:53 2001 +++ vfs_subr.c.new Wed Sep 26 15:23:09 2001 @@ -721,9 +721,9 @@ } } - while (vp->v_numoutput > 0) { - vp->v_flag |= VBWAIT; - tsleep(&vp->v_numoutput, PVM, "vnvlbv", 0); + if (VOP_GETVOBJECT(vp, &object) == 0) { + while (object->paging_in_progress) + vm_object_pip_sleep(object, "vnvlbv"); } splx(s); To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20010926204333.A15865>