Date: Sun, 1 Sep 2002 00:26:14 -0700 (PDT)
From: Don Lewis <dl-freebsd@catspoiler.org>
To: kris@obsecurity.org
Cc: current@FreeBSD.ORG, phk@FreeBSD.ORG
Subject: Re: Page faults from bento cluster (Re: Problems reading vmcores)
Message-ID: <200209010726.g817QEwr067908@gw.catspoiler.org>
In-Reply-To: <20020901035300.GA9547@xor.obsecurity.org>
On 31 Aug, Kris Kennaway wrote:
> Another page fault in umount

I haven't seen any reports of this one before.

> #6  0xc0399a48 in calltrap () at {standard input}:98
> #7  0xc029198d in vflush (mp=0xc5e60000, rootrefs=0, flags=2) at vnode_if.h:309
> #8  0xc0200eaa in devfs_unmount (mp=0xc5e60000, mntflags=524288, td=0xc5855000)
>     at /usr/src/sys/fs/devfs/devfs_vfsops.c:130
> #9  0xc028d9b4 in dounmount (mp=0xc5e60000, flags=-974782464, td=0xc5855000)
>     at /usr/src/sys/kern/vfs_mount.c:1296
> #10 0xc028d79c in unmount (td=0xc5855000, uap=0xda021d10)
>     at /usr/src/sys/kern/vfs_mount.c:1239
> #11 0xc03a8a31 in syscall (frame=
>       {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 134845070,
>        tf_esi = 134950973, tf_ebp = -1077938936, tf_isp = -637395596,
>        tf_ebx = 0, tf_edx = 1, tf_ecx = 3, tf_eax = 22, tf_trapno = 12,
>        tf_err = 2, tf_eip = 134524579, tf_cs = 31, tf_eflags = 514,
>        tf_esp = -1077939060, tf_ss = 47})
>     at /usr/src/sys/i386/i386/trap.c:1050
> #12 0xc0399a9d in Xint0x80_syscall () at {standard input}:140
> ---Can't read userspace from dump, or kernel process---

This code in vflush() bothers me:

	mtx_lock(&mntvnode_mtx);
loop:
	for (vp = TAILQ_FIRST(&mp->mnt_nvnodelist); vp; vp = nvp) {
		/*
		 * Make sure this vnode wasn't reclaimed in getnewvnode().
		 * Start over if it has (it won't be on the list anymore).
		 */
		if (vp->v_mount != mp)
			goto loop;
		nvp = TAILQ_NEXT(vp, v_nmntvnodes);

		mtx_unlock(&mntvnode_mtx);
		vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, td);
		/*
		 * Skip over a vnodes marked VV_SYSTEM.
		 */
		if ((flags & SKIPSYSTEM) && (vp->v_vflag & VV_SYSTEM)) {
			VOP_UNLOCK(vp, 0, td);
			mtx_lock(&mntvnode_mtx);
			continue;
		}
		/*
		 * If WRITECLOSE is set, flush out unlinked but still open
		 * files (even if open only for reading) and regular file
		 * vnodes open for writing.
		 */
		error = VOP_GETATTR(vp, &vattr, td->td_ucred, td);
		VI_LOCK(vp);

As near as I can tell the panic is happening in VOP_GETATTR().
It looks to me like it would be possible for the vnode to be recycled between the time when it passes the vp->v_mount test at the top of the loop and the time when vn_lock() succeeds. Shouldn't we bump the vnode reference count by calling vref() at the top of the loop and add the appropriate calls to vrele()?
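To make the suspected window concrete, here is a minimal single-threaded userspace sketch, not kernel code. Everything prefixed with model_ is a hypothetical stand-in, and run_model() is a hypothetical driver; the sketch also assumes that the recycler leaves alone any vnode with a nonzero use count. It only illustrates why taking a reference before dropping the list mutex could close the gap between the v_mount test and the point where vn_lock() succeeds.

```c
#include <assert.h>
#include <stddef.h>

struct mount;

/* Toy vnode: only the fields this model needs. */
struct vnode {
	struct mount *v_mount;	/* owning mount, NULL once recycled */
	int v_usecount;		/* reference count */
};

struct mount {
	struct vnode *mnt_vnode;	/* one-entry stand-in for mnt_nvnodelist */
};

/* Hypothetical stand-ins for vref() and vrele(). */
static void
model_vref(struct vnode *vp)
{
	vp->v_usecount++;
}

static void
model_vrele(struct vnode *vp)
{
	assert(vp->v_usecount > 0);
	vp->v_usecount--;
}

/* Hypothetical recycler; ASSUMED to reuse only unreferenced vnodes. */
static int
model_recycle(struct vnode *vp)
{
	if (vp->v_usecount > 0)
		return (0);
	vp->v_mount = NULL;
	return (1);
}

/*
 * One iteration of the loop.  Returns 1 if the vnode still belongs to
 * mp at the point where VOP_GETATTR() would run, 0 if it was recycled
 * in the window, and -1 if the initial v_mount test failed.
 */
static int
model_flush_one(struct mount *mp, int take_ref)
{
	struct vnode *vp = mp->mnt_vnode;
	int ok;

	if (vp->v_mount != mp)
		return (-1);
	if (take_ref)
		model_vref(vp);
	/* The list mutex would be dropped here; simulate another thread. */
	(void)model_recycle(vp);
	/* vn_lock() would succeed here. */
	ok = (vp->v_mount == mp);
	if (take_ref)
		model_vrele(vp);
	return (ok);
}

/* Hypothetical driver: build one mount with one vnode and run the loop. */
static int
run_model(int take_ref)
{
	struct vnode vn = { NULL, 0 };
	struct mount mnt = { &vn };

	vn.v_mount = &mnt;
	return (model_flush_one(&mnt, take_ref));
}
```

In this model, run_model(0) finds the vnode detached from the mount by the time it would be used, while run_model(1) finds it intact. Whether the real vflush() can take the reference at that point without running into lock-order problems is a separate question that the model does not address.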