Date: Tue, 4 Mar 1997 16:02:00 +1030 (CST)
From: Michael Smith <msmith@atrad.adelaide.edu.au>
To: jlemon@americantv.com (Jonathan Lemon)
Cc: msmith@atrad.adelaide.edu.au, proff@iq.org, hackers@FreeBSD.ORG
Subject: Re: xemacs crashes kernel
Message-ID: <199703040532.QAA10831@genesis.atrad.adelaide.edu.au>
In-Reply-To: <19970303230157.25741@right.PCS> from Jonathan Lemon at "Mar 3, 97 11:01:57 pm"
Jonathan Lemon stands accused of saying:
> On Mar 03, 1997 at 03:11:23PM +1030, Michael Smith wrote:
> > Jonathan Lemon stands accused of saying:
> > > On Mar 03, 1997 at 01:03:08PM +1100, Julian Assange wrote:
> > > >
> > > > (1) telnet into machine
> > > > (2) start up xemacs in text mode
> > > > (3) suspend xemacs
> > > > (4) remote-disconnect telnet
> > >
> > > Bleah. Confirmed here, on a 2.2-GAMMA machine. Doing this causes
> > > a "Trap 12, code 0 - page fault in kernel mode".
> >
> > Can you give us the trap message and do the nm /kernel | less thing?
>
> Panic dump (typed by hand):
>
> Fatal trap 12: page fault while in kernel mode
> fault virtual address = 0x18
> fault code = supervisor read, page not present
Looks like a read dereference of a null structure pointer.
> instruction pointer = 0x8:0xf013753b
> stack pointer = 0x10:0x3fbfff18
> frame pointer = 0x10:0x3fbfff44
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, def32 1, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> interrupt mask =
> kernel: type 12 trap, code 0
>
> stopped at _fsync+0x73, testb $0x40, 0x18(%eax)
>
> nm /kernel | grep f0137 | sort
>
> f01374c8 T _fsync
OK, here it is:
int
fsync(p, uap, retval)
	struct proc *p;
	struct fsync_args *uap;
	int *retval;
{
	register struct vnode *vp;
	struct file *fp;
	int error;

	error = getvnode(p->p_fd, uap->fd, &fp);
	if (error)
		return (error);
	vp = (struct vnode *)fp->f_data;
	VOP_LOCK(vp);
	if (vp->v_object) {
		vm_object_page_clean(vp->v_object, 0, 0, 0, FALSE);
	}
	error = VOP_FSYNC(vp, fp->f_cred,
	    (vp->v_mount->mnt_flag & MNT_ASYNC) ? MNT_NOWAIT : MNT_WAIT, p);
MNT_ASYNC is 0x40, and mnt_flag looks to be at about offset 0x18 in the
mount structure, so the faulting testb is the mnt_flag test being done
through a null vp->v_mount. Looks like maybe someone is trying to fsync
something that's not a file, although a quick test here doesn't indicate
that.
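To make the arithmetic concrete, here is a minimal userland sketch of why a null mount pointer faults at exactly 0x18. The struct fake_mount layout, its padding, and the helper name are invented for illustration; the real struct mount is not reproduced here, only the offset inferred from the trap.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct mount.  The padding is an assumption,
 * chosen only so that mnt_flag lands at offset 0x18 as the trap suggests.
 */
struct fake_mount {
	unsigned char pad[0x18];
	int mnt_flag;
};

/*
 * Address that a load of mp->mnt_flag would touch for a given value of
 * the base pointer mp.  With mp == 0 this is just the member offset.
 */
static uintptr_t
flag_access_address(uintptr_t base)
{
	return (base + offsetof(struct fake_mount, mnt_flag));
}
```

With a base of 0 the helper returns 0x18, which matches both the displacement in "testb $0x40, 0x18(%eax)" and the reported fault virtual address.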
Are non-file items supposed to have valid v_mount pointers? Other places
in the kernel that look at vp->v_mount often check it against zero first;
should that be done here as well, e.g.

	(vp->v_mount && (vp->v_mount->mnt_flag & MNT_ASYNC)) ? MNT_NOWAIT...

This looks like it might have been overlooked when the async
filesystem stuff came in, as old versions of this code read:

	error = VOP_FSYNC(vp, fp->f_cred, MNT_WAIT, p);
Suggestions? Jonathan, can you try the above and see if it cures your
problem?
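
For anyone who wants to poke at the logic outside the kernel, the guarded test can be sketched standalone roughly as follows. The sk_ types, the fsync_waitfor() helper, and the SK_MNT_WAIT/SK_MNT_NOWAIT values are placeholders of mine, not the kernel's definitions; only the 0x40 for the async flag comes from the trap above.

```c
#include <stddef.h>

/* Placeholder constants; only the 0x40 is taken from the panic output. */
#define SK_MNT_ASYNC	0x40
#define SK_MNT_WAIT	1
#define SK_MNT_NOWAIT	2

/* Hypothetical cut-down versions of the mount and vnode structures. */
struct sk_mount {
	int mnt_flag;
};

struct sk_vnode {
	struct sk_mount *v_mount;	/* may be NULL for non-file vnodes */
};

/*
 * Pick the wait mode to hand to the fsync operation.  A vnode with no
 * mount falls back to the old behaviour (always wait) instead of
 * reading mnt_flag through a null pointer.
 */
static int
fsync_waitfor(const struct sk_vnode *vp)
{
	return ((vp->v_mount != NULL &&
	    (vp->v_mount->mnt_flag & SK_MNT_ASYNC)) ?
	    SK_MNT_NOWAIT : SK_MNT_WAIT);
}
```

Falling back to the wait mode when there is no mount mirrors what the old code did unconditionally, so it seems the conservative choice, but whether a null v_mount is legal here at all is still the open question.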
--
]] Mike Smith, Software Engineer msmith@gsoft.com.au [[
]] Genesis Software genesis@gsoft.com.au [[
]] High-speed data acquisition and (GSM mobile) 0411-222-496 [[
]] realtime instrument control. (ph) +61-8-8267-3493 [[
]] Unix hardware collector. "Where are your PEZ?" The Tick [[
