Date: Mon, 10 Jun 1996 18:09:24 -0700
From: Matt Day <mday@sting.artisoft.com>
To: freebsd-current@FreeBSD.ORG, taob@io.org
Subject: Re: Kernel panic in fsync, 2.2-960501-SNAP
Message-ID: <199606110109.SAA22935@sting.artisoft.com>
Brian Tao <taob@io.org> wrote:
> Got this today (nm output follows):
>
> Fatal trap 12: page fault while in kernel mode
> fault virtual address = 0x18
> fault code = supervisor read, page not present
> instruction pointer = 0x8:0xf012afc3
> stack pointer = 0x10:0xefbfff2c
> frame pointer = 0x10:0xefbfff58
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, def32 1, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 6732 (emacs)
> interrupt mask =
> panic: page fault
>
> [..]
>
> # nm -a /kernel | sort | fgrep -C f012af
> f012adc0 T _ftruncate
> f012aef0 T _otruncate
> f012af20 T _oftruncate
> f012af50 T _fsync
> f012b020 T _rename
> f012b31c T _mkdir
It looks like your panic could very well have been caused by a bug
I reported several months ago. It has not been fixed yet in either
tree. Here is my original bug report:
> From mday Mon Feb 5 03:01:27 1996
> To: freebsd-bugs@freebsd.org, freebsd-hackers@freebsd.org
> Subject: Bad bug in ffs_sync() & friends
>
> Hi,
>
> I think there is a very rare, yet fatal, bug in ffs_sync() in the
> -CURRENT code (and the -STABLE code, and NetBSD 1.1, etc...).
> This bug has occurred twice on my system in the past 6 months.
>
> Consider this scenario:
> ffs_vget() calls getnewvnode(), and then calls MALLOC() to allocate
> memory for the incore inode. That MALLOC() blocks.
> While that MALLOC() is blocked, ffs_sync() gets called. ffs_sync()
> finds the vnode just set up by that getnewvnode() on the mnt_vnodelist
> (because getnewvnode() put it there) and proceeds to dereference
> vp->v_data by calling VOP_ISLOCKED(), but v_data is still zero because
> that MALLOC() blocked.
>
> It looks like this bug is lurking in many other routines as well --
> pretty much any routine that runs down the mnt_vnodelist.
>
> What do you think? Please e-mail me directly, as I do not subscribe to
> these mailing lists.
>
> Thanks,
>
> Matt Day <mday@artisoft.com>
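The ordering problem in the quoted report can be modeled in user space. What follows is only a rough sketch under stated assumptions: every toy_* name is hypothetical and none of this is kernel code. The blocking MALLOC() is imitated by running a "sync" pass from inside the allocator, and the walker counts half-built vnodes where the real ffs_sync() would fault.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins; these are NOT the kernel's struct vnode / struct inode. */
struct toy_inode { int locked; };
struct toy_vnode {
    struct toy_inode *v_data;   /* NULL until the inode is attached */
    struct toy_vnode *next;     /* link on the toy mount list */
};

static struct toy_vnode *toy_mountlist;   /* plays the role of mnt_vnodelist */

/* Like getnewvnode(): the vnode is on the list as soon as it exists. */
static struct toy_vnode *toy_getnewvnode(void)
{
    struct toy_vnode *vp = calloc(1, sizeof(*vp));
    if (vp == NULL)
        return NULL;
    vp->next = toy_mountlist;
    toy_mountlist = vp;
    return vp;
}

/*
 * Like the list walk in ffs_sync(), except that it counts the vnodes
 * whose v_data is still NULL instead of dereferencing them.
 */
static int toy_sync_count_unready(void)
{
    int unready = 0;
    for (struct toy_vnode *vp = toy_mountlist; vp != NULL; vp = vp->next)
        if (vp->v_data == NULL)
            unready++;
    return unready;
}

/* An allocation that "blocks": a sync pass runs while it sleeps. */
static struct toy_inode *toy_malloc_blocking(int *unready_seen)
{
    *unready_seen = toy_sync_count_unready();
    return calloc(1, sizeof(struct toy_inode));
}

/*
 * Original order: vnode first, then inode.
 * Returns how many half-built vnodes the sync pass saw, or -1 on failure.
 */
static int toy_vget_vnode_first(void)
{
    int seen = 0;
    struct toy_vnode *vp = toy_getnewvnode();
    if (vp == NULL)
        return -1;
    vp->v_data = toy_malloc_blocking(&seen);
    return seen;
}
```

Starting from an empty list, toy_vget_vnode_first() should report one half-built vnode seen by the sync pass, which is the window described in the report.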
Here is one possible bug fix to the -CURRENT FFS code (the same
bug exists in some of the other file systems as well):
*** sys/ufs/ffs/ffs_vfsops.c- Sat Mar 2 20:43:40 1996
--- sys/ufs/ffs/ffs_vfsops.c Mon Jun 10 17:49:30 1996
***************
*** 866,871 ****
--- 866,881 ----
}
ffs_inode_hash_lock = 1;
+ /*
+ * N.B.: If this MALLOC() is performed after the getnewvnode()
+ * it might block, leaving a vnode with a NULL v_data to be
+ * found by ffs_sync() if a sync happens to fire right then,
+ * which will cause a panic because ffs_sync() blindly
+ * dereferences vp->v_data (as well it should).
+ */
+ type = ump->um_devvp->v_tag == VT_MFS ? M_MFSNODE : M_FFSNODE; /* XXX */
+ MALLOC(ip, struct inode *, sizeof(struct inode), type, M_WAITOK);
+
/* Allocate a new vnode/inode. */
error = getnewvnode(VT_UFS, mp, ffs_vnodeop_p, &vp);
if (error) {
***************
*** 873,882 ****
wakeup(&ffs_inode_hash_lock);
ffs_inode_hash_lock = 0;
*vpp = NULL;
return (error);
}
- type = ump->um_devvp->v_tag == VT_MFS ? M_MFSNODE : M_FFSNODE; /* XXX */
- MALLOC(ip, struct inode *, sizeof(struct inode), type, M_WAITOK);
bzero((caddr_t)ip, sizeof(struct inode));
vp->v_data = ip;
ip->i_vnode = vp;
--- 883,891 ----
wakeup(&ffs_inode_hash_lock);
ffs_inode_hash_lock = 0;
*vpp = NULL;
+ FREE(ip, type);
return (error);
}
bzero((caddr_t)ip, sizeof(struct inode));
vp->v_data = ip;
ip->i_vnode = vp;
Another way to fix the bug would be to check for vp->v_data == NULL
in ffs_sync(). But that would not be very elegant, in my opinion,
since every routine that runs down the mnt_vnodelist would need the
same check. I think a good, safe policy would be "if a vnode can be
found on the mnt_vnodelist by a process, the process can assume
that the vnode is fully initialized".
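To make the two options concrete, here is a small user-space sketch of both. It is an illustration under assumptions, not the kernel code: all toy_* names are hypothetical, calloc() stands in for MALLOC(), and the model assumes (as the patch does) that a sync pass can only run while the allocation sleeps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins; these are NOT the kernel's structures. */
struct toy_inode { int dirty; };
struct toy_vnode {
    struct toy_inode *v_data;
    struct toy_vnode *next;
};

static struct toy_vnode *toy_mountlist;   /* plays the role of mnt_vnodelist */
static int toy_fail_getnewvnode;          /* nonzero: simulate getnewvnode() failing */
static int toy_live_inodes;               /* allocated minus freed, to spot leaks */

/* Alternative fix: the walker skips vnodes that have no inode yet. */
static int toy_sync_defensive(void)
{
    int visited = 0;
    for (struct toy_vnode *vp = toy_mountlist; vp != NULL; vp = vp->next) {
        if (vp->v_data == NULL)
            continue;           /* the extra check every walker would need */
        visited++;
    }
    return visited;
}

/* Patched order: allocate the inode first, then create the vnode. */
static int toy_vget_inode_first(struct toy_vnode **vpp)
{
    /* The allocation may sleep here, but no vnode is on the list yet. */
    struct toy_inode *ip = calloc(1, sizeof(*ip));
    if (ip == NULL) {
        *vpp = NULL;
        return -1;
    }
    toy_live_inodes++;

    struct toy_vnode *vp = toy_fail_getnewvnode ? NULL : calloc(1, sizeof(*vp));
    if (vp == NULL) {
        free(ip);               /* mirrors the FREE(ip, type) on the error path */
        toy_live_inodes--;
        *vpp = NULL;
        return -1;
    }

    /* Publish, then attach at once: no blocking call sits in between. */
    vp->next = toy_mountlist;
    toy_mountlist = vp;
    vp->v_data = ip;
    *vpp = vp;
    return 0;
}
```

With the patched order the walker needs no special case, and the simulated getnewvnode() failure leaves toy_live_inodes unchanged. The defensive walker also works, but only for the routines that remember to include the check.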
Hope that helps,
Matt Day <mday@artisoft.com>
