Date: Sun, 1 Sep 2002 00:26:14 -0700 (PDT)
From: Don Lewis
Subject: Re: Page faults from bento cluster (Re: Problems reading vmcores)
To: kris@obsecurity.org
Cc: current@FreeBSD.ORG, phk@FreeBSD.ORG
Message-Id: <200209010726.g817QEwr067908@gw.catspoiler.org>
In-Reply-To: <20020901035300.GA9547@xor.obsecurity.org>

On 31 Aug, Kris Kennaway wrote:
> Another page fault in umount

I haven't seen any reports of this one before.

> #6  0xc0399a48 in calltrap () at {standard input}:98
> #7  0xc029198d in vflush (mp=0xc5e60000, rootrefs=0, flags=2) at vnode_if.h:309
> #8  0xc0200eaa in devfs_unmount (mp=0xc5e60000, mntflags=524288, td=0xc5855000)
>     at /usr/src/sys/fs/devfs/devfs_vfsops.c:130
> #9  0xc028d9b4 in dounmount (mp=0xc5e60000, flags=-974782464, td=0xc5855000)
>     at /usr/src/sys/kern/vfs_mount.c:1296
> #10 0xc028d79c in unmount (td=0xc5855000, uap=0xda021d10)
>     at /usr/src/sys/kern/vfs_mount.c:1239
> #11 0xc03a8a31 in syscall (frame=
>       {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 134845070, tf_esi = 134950973, tf_ebp = -1077938936, tf_isp = -637395596, tf_ebx = 0, tf_edx = 1, tf_ecx = 3, tf_eax = 22, tf_trapno = 12, tf_err = 2, tf_eip = 134524579, tf_cs = 31, tf_eflags = 514, tf_esp = -1077939060, tf_ss = 47}) at /usr/src/sys/i386/i386/trap.c:1050
> #12 0xc0399a9d in Xint0x80_syscall () at {standard input}:140
> ---Can't read userspace from dump, or kernel process---

This code in vflush() bothers me:

        mtx_lock(&mntvnode_mtx);
loop:
        for (vp = TAILQ_FIRST(&mp->mnt_nvnodelist); vp; vp = nvp) {
                /*
                 * Make sure this vnode wasn't reclaimed in getnewvnode().
                 * Start over if it has (it won't be on the list anymore).
                 */
                if (vp->v_mount != mp)
                        goto loop;
                nvp = TAILQ_NEXT(vp, v_nmntvnodes);

                mtx_unlock(&mntvnode_mtx);
                vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, td);
                /*
                 * Skip over a vnodes marked VV_SYSTEM.
                 */
                if ((flags & SKIPSYSTEM) && (vp->v_vflag & VV_SYSTEM)) {
                        VOP_UNLOCK(vp, 0, td);
                        mtx_lock(&mntvnode_mtx);
                        continue;
                }
                /*
                 * If WRITECLOSE is set, flush out unlinked but still open
                 * files (even if open only for reading) and regular file
                 * vnodes open for writing.
                 */
                error = VOP_GETATTR(vp, &vattr, td->td_ucred, td);
                VI_LOCK(vp);

As near as I can tell the panic is happening in VOP_GETATTR().  It looks
to me like it would be possible for the vnode to be recycled between the
time when it passes the vp->v_mount test at the top of the loop and the
time when vn_lock() succeeds.

Shouldn't we bump the vnode reference count by calling vref() at the top
of the loop and add the appropriate calls to vrele()?
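
To make the idea concrete, here is a rough userspace model of the window
described above. This is not kernel code. Every name in it (vnode_model,
mount_model, model_vref, model_vrele, model_try_recycle, model_flush,
model_init) is made up for the sketch, and it assumes that a vnode with a
nonzero use count is not eligible for recycling. The recycle attempt is
called inline at the point where the real loop has dropped the list mutex
and may sleep in vn_lock().

```c
#include <assert.h>
#include <stddef.h>

struct mount_model;

/* Hypothetical stand-in for a vnode; not the kernel's struct vnode. */
struct vnode_model {
	struct mount_model *v_mount;	/* owning mount, NULL once recycled */
	int v_usecount;			/* references currently held */
	struct vnode_model *v_next;	/* next vnode on the mount's list */
};

/* Hypothetical stand-in for a mount point and its vnode list. */
struct mount_model {
	struct vnode_model *mnt_first;
};

/* Put n vnodes on the mount's list with no references held. */
static void
model_init(struct mount_model *mp, struct vnode_model *v, size_t n)
{
	size_t i;

	mp->mnt_first = (n > 0) ? &v[0] : NULL;
	for (i = 0; i < n; i++) {
		v[i].v_mount = mp;
		v[i].v_usecount = 0;
		v[i].v_next = (i + 1 < n) ? &v[i + 1] : NULL;
	}
}

/* Plays the role of vref(): take a reference. */
static void
model_vref(struct vnode_model *vp)
{
	vp->v_usecount++;
}

/* Plays the role of vrele(): drop a reference. */
static void
model_vrele(struct vnode_model *vp)
{
	assert(vp->v_usecount > 0);
	vp->v_usecount--;
}

/*
 * Plays the role of another thread recycling the vnode.
 * Assumption: only unreferenced vnodes may be taken.
 * Returns 1 if the vnode was recycled, 0 if it was left alone.
 */
static int
model_try_recycle(struct vnode_model *vp)
{
	if (vp->v_usecount > 0)
		return (0);
	vp->v_mount = NULL;
	return (1);
}

/*
 * Walk the mount's vnode list.  If hold_ref is nonzero, take a reference
 * before the window in which the list lock would be dropped.  Returns the
 * number of vnodes that no longer belonged to mp after the window.
 */
static int
model_flush(struct mount_model *mp, int hold_ref)
{
	struct vnode_model *vp, *nvp;
	int stale = 0;

	for (vp = mp->mnt_first; vp != NULL; vp = nvp) {
		nvp = vp->v_next;
		if (hold_ref)
			model_vref(vp);

		/* Window: list mutex dropped, locking the vnode may sleep. */
		(void)model_try_recycle(vp);

		/* The real loop would go on to call VOP_GETATTR() here. */
		if (vp->v_mount != mp)
			stale++;

		if (hold_ref)
			model_vrele(vp);
	}
	return (stale);
}
```

With three vnodes on the list, model_flush(&m, 0) finds all three recycled
inside the window, while model_flush(&m, 1) finds none and leaves every
v_usecount back at zero. Whether the real vflush() can take the reference
safely at that point, and where the matching vrele() calls belong on the
continue and error paths, is exactly the part that needs review.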