Date: Mon, 10 Jun 1996 18:09:24 -0700
From: Matt Day <mday@sting.artisoft.com>
To: freebsd-current@FreeBSD.ORG, taob@io.org
Subject: Re: Kernel panic in fsync, 2.2-960501-SNAP
Message-ID: <199606110109.SAA22935@sting.artisoft.com>

Brian Tao <taob@io.org> wrote:
> Got this today (nm output follows):
>
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0x18
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xf012afc3
> stack pointer           = 0x10:0xefbfff2c
> frame pointer           = 0x10:0xefbfff58
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 6732 (emacs)
> interrupt mask          =
> panic: page fault
>
> [..]
>
> # nm -a /kernel | sort | fgrep -C f012af
> f012adc0 T _ftruncate
> f012aef0 T _otruncate
> f012af20 T _oftruncate
> f012af50 T _fsync
> f012b020 T _rename
> f012b31c T _mkdir

It looks like your panic could very well have been caused by a bug I
reported several months ago.  It has not been fixed yet in either tree.
Here is my original bug report:

> From mday Mon Feb 5 03:01:27 1996
> To: freebsd-bugs@freebsd.org, freebsd-hackers@freebsd.org
> Subject: Bad bug in ffs_sync() & friends
>
> Hi,
>
> I think there is a very rare, yet fatal, bug in ffs_sync() in the
> -CURRENT code (and the -STABLE code, and NetBSD 1.1, etc...).
> This bug has occurred twice on my system in the past 6 months.
>
> Consider this scenario:
> ffs_vget() calls getnewvnode(), and then calls MALLOC() to allocate
> memory for the incore inode.  That MALLOC() blocks.
> While that MALLOC() is blocked, ffs_sync() gets called.  ffs_sync()
> finds the vnode just set up by that getnewvnode() on the mnt_vnodelist
> (because getnewvnode() put it there) and proceeds to dereference
> vp->v_data by calling VOP_ISLOCKED(), but v_data is still zero because
> that MALLOC() blocked.
>
> It looks like this bug is lurking in many other routines as well --
> pretty much any routine that runs down the mnt_vnodelist.
>
> What do you think?  Please e-mail me directly, as I do not subscribe to
> these mailing lists.
>
> Thanks,
>
> Matt Day <mday@artisoft.com>

Here is one possible bug fix to the -CURRENT FFS code (the same bug
exists in some of the other file systems as well):

*** sys/ufs/ffs/ffs_vfsops.c-	Sat Mar  2 20:43:40 1996
--- sys/ufs/ffs/ffs_vfsops.c	Mon Jun 10 17:49:30 1996
***************
*** 866,871 ****
--- 866,881 ----
      }
      ffs_inode_hash_lock = 1;
  
+     /*
+      * N.B.:  If this MALLOC() is performed after the getnewvnode()
+      * it might block, leaving a vnode with a NULL v_data to be
+      * found by ffs_sync() if a sync happens to fire right then,
+      * which will cause a panic because ffs_sync() blindly
+      * dereferences vp->v_data (as well it should).
+      */
+     type = ump->um_devvp->v_tag == VT_MFS ? M_MFSNODE : M_FFSNODE; /* XXX */
+     MALLOC(ip, struct inode *, sizeof(struct inode), type, M_WAITOK);
+ 
      /* Allocate a new vnode/inode. */
      error = getnewvnode(VT_UFS, mp, ffs_vnodeop_p, &vp);
      if (error) {
***************
*** 873,882 ****
          wakeup(&ffs_inode_hash_lock);
          ffs_inode_hash_lock = 0;
          *vpp = NULL;
          return (error);
      }
-     type = ump->um_devvp->v_tag == VT_MFS ? M_MFSNODE : M_FFSNODE; /* XXX */
-     MALLOC(ip, struct inode *, sizeof(struct inode), type, M_WAITOK);
      bzero((caddr_t)ip, sizeof(struct inode));
      vp->v_data = ip;
      ip->i_vnode = vp;
--- 883,891 ----
          wakeup(&ffs_inode_hash_lock);
          ffs_inode_hash_lock = 0;
          *vpp = NULL;
+         FREE(ip, type);
          return (error);
      }
      bzero((caddr_t)ip, sizeof(struct inode));
      vp->v_data = ip;
      ip->i_vnode = vp;

Another way to fix the bug would be to check for vp->v_data == NULL in
ffs_sync().  But that way would not be very elegant, in my opinion.  I
think a good, safe policy would be "if a vnode can be found on the
mnt_vnodelist list by a process, the process can assume that the vnode
is fully initialized".

Hope that helps,

Matt Day <mday@artisoft.com>
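
The ordering issue behind the patch can be illustrated outside the
kernel.  What follows is only a rough user-space sketch, not kernel
code: toy_vnode, toy_inode, mount_list, blocking_alloc() and the two
vget_*() functions are hypothetical stand-ins invented for this
illustration.  A "blocking" allocation is simulated by running a sync
pass at the moment the allocator would have slept, so the effect of
the two orderings can be compared without threads.

```c
#include <assert.h>   /* used by the checks described in the usage note */
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct inode / struct vnode; not kernel types. */
struct toy_inode { int i_number; };
struct toy_vnode {
    struct toy_inode *v_data;   /* NULL until the inode is attached */
    struct toy_vnode *v_next;
};

static struct toy_vnode *mount_list;   /* plays the role of mnt_vnodelist */
static int seen_uninitialized;         /* NULL v_data entries seen by "sync" */

/* Plays the role of ffs_sync(): walks the list, counts NULL v_data entries. */
static int count_uninitialized(void)
{
    int n = 0;
    for (struct toy_vnode *vp = mount_list; vp != NULL; vp = vp->v_next)
        if (vp->v_data == NULL)
            n++;
    return n;
}

/* An allocation that "blocks": a sync pass runs while it would be asleep. */
static struct toy_inode *blocking_alloc(void)
{
    seen_uninitialized += count_uninitialized();
    return calloc(1, sizeof(struct toy_inode));
}

/* Plays the role of getnewvnode(): the vnode becomes visible on the list. */
static struct toy_vnode *publish_vnode(void)
{
    struct toy_vnode *vp = calloc(1, sizeof(*vp));
    if (vp == NULL)
        return NULL;
    vp->v_next = mount_list;
    mount_list = vp;
    return vp;
}

/* Original order: publish first, then allocate (the window is open). */
static struct toy_vnode *vget_publish_first(void)
{
    struct toy_vnode *vp = publish_vnode();
    if (vp == NULL)
        return NULL;
    vp->v_data = blocking_alloc();
    if (vp->v_data == NULL) {       /* toy-only cleanup: vp is the list head */
        mount_list = vp->v_next;
        free(vp);
        return NULL;
    }
    return vp;
}

/* Patched order: allocate first, publish only once the inode exists. */
static struct toy_vnode *vget_alloc_first(void)
{
    struct toy_inode *ip = blocking_alloc();
    if (ip == NULL)
        return NULL;
    struct toy_vnode *vp = publish_vnode();
    if (vp == NULL) {
        free(ip);                   /* mirrors the FREE(ip, type) in the patch */
        return NULL;
    }
    vp->v_data = ip;
    return vp;
}
```

Calling vget_alloc_first() any number of times leaves
seen_uninitialized at 0, while each successful call to
vget_publish_first() increments it by one, because the simulated sync
pass finds the freshly published entry before its v_data is set.  The
sketch never frees list entries, which is acceptable for a
demonstration but not for real code.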