Date:      Thu, 11 Jul 2002 21:15:05 -0400
From:      Michael Adler <lists-ma@tapil.com>
To:        hackers@FreeBSD.ORG
Subject:   Bad vnode causing crash in 4.x
Message-ID:  <5.1.1.6.0.20020711210231.00ace8c0@shlc0003.shr.intel.com>

I've been suffering infrequent system crashes when running ange-ftp under 
emacs for some time and finally have a crash dump from a kernel with 
symbols.  This crash dump was on 4.6-stable, though I've seen the bug off 
and on for at least a year.

All the crashes have the following characteristic:

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x10
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc01ae331
stack pointer           = 0x10:0xd7c9eed8
frame pointer           = 0x10:0xd7c9eedc
code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 36349 (ftp)
interrupt mask          = none
trap number             = 12
panic: page fault

(kgdb) where
#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:487
#1  0xc017fd2f in boot (howto=256)
     at /usr/src/sys/kern/kern_shutdown.c:316
#2  0xc0180154 in poweroff_wait (junk=0xc02f668c, howto=-1070636625)
     at /usr/src/sys/kern/kern_shutdown.c:595
#3  0xc02a3cce in trap_fatal (frame=0xd7c9ee98, eva=16)
     at /usr/src/sys/i386/i386/trap.c:966
#4  0xc02a39a1 in trap_pfault (frame=0xd7c9ee98, usermode=0, eva=16)
     at /usr/src/sys/i386/i386/trap.c:859
#5  0xc02a358b in trap (frame={tf_fs = -676790256, tf_es = -676790256,
       tf_ds = -674693104, tf_edi = -676731680, tf_esi = 1,
       tf_ebp = -674631972, tf_isp = -674631996, tf_ebx = 0,
       tf_edx = -674631932, tf_ecx = 47, tf_eax = -674575552,
       tf_trapno = 12, tf_err = 0, tf_eip = -1071979727, tf_cs = 8,
       tf_eflags = 66118, tf_esp = -1038362816, tf_ss = -674631960})
     at /usr/src/sys/i386/i386/trap.c:458
#6  0xc01ae331 in vop_revoke (ap=0xd7c9ef04)
     at /usr/src/sys/kern/vfs_subr.c:1965
#7  0xc01aace9 in vop_defaultop (ap=0xd7c9ef04)
     at /usr/src/sys/kern/vfs_default.c:150
#8  0xc0178381 in exit1 (p=0xd7a9e4e0, rv=0) at vnode_if.h:500
#9  0xc01780e4 in exit1 (p=0xd7a9e4e0, rv=0)
     at /usr/src/sys/kern/kern_exit.c:103
#10 0xc02a3f7d in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
       tf_edi = 0, tf_esi = -1, tf_ebp = -1077939936,
       tf_isp = -674631724, tf_ebx = 672471396, tf_edx = 672470976,
       tf_ecx = 1, tf_eax = 1, tf_trapno = 7, tf_err = 2,
       tf_eip = 672154536, tf_cs = 31, tf_eflags = 647,
       tf_esp = -1077939980, tf_ss = 47})
     at /usr/src/sys/i386/i386/trap.c:1167
#11 0xc0297f05 in Xint0x80_syscall ()
Cannot access memory at address 0xbfbff120.


The immediate cause of the crash is a read from page 0 in vop_revoke 
(fault address 0x10) because dev is NULL.  The vop_revoke_args struct (ap) 
appears to be filled in, but a_vp->v_type is VBAD and 
a_vp->v_un.vu_spec.vu_specinfo (which is assigned to dev in vop_revoke) 
is 0.
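
To illustrate why a NULL dev would fault at 0x10 rather than at 0: reading a 
member through a NULL pointer touches address 0 plus that member's offset. 
The sketch below uses a hypothetical 32-bit layout (struct mock_specinfo and 
its field names are assumptions made for illustration, not the real kernel 
structure) in which the member being read sits 0x10 bytes into the structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical i386-style layout, for illustration only.
 * 32-bit pointers are modelled as uint32_t so the offsets are
 * the same on any host.
 */
struct mock_specinfo {
	uint32_t si_flags;       /* offset 0x00 */
	uint32_t si_udev;        /* offset 0x04 */
	uint32_t si_hash_next;   /* offset 0x08 */
	uint32_t si_hash_prev;   /* offset 0x0c */
	uint32_t si_hlist_first; /* offset 0x10 */
};

/* Address touched when reading a member at member_offset from base. */
static uint32_t
fault_address(uint32_t base, size_t member_offset)
{
	/* Guard against wrap-around in the 32-bit address space. */
	assert(member_offset <= (size_t)(UINT32_MAX - base));
	return base + (uint32_t)member_offset;
}
```

With base 0 (a NULL dev) and offsetof(struct mock_specinfo, si_hlist_first), 
fault_address() yields 0x10 under this assumed layout, which would match the 
reported fault virtual address.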

Here is the whole data structure:

(kgdb) p *((struct vop_revoke_args *) 0xd7c9ef04)->a_desc
$1 = {vdesc_offset = 47, vdesc_name = 0xc02bff86 "vop_revoke",
   vdesc_flags = 0, vdesc_vp_offsets = 0xc0300664,
   vdesc_vpp_offset = -1, vdesc_cred_offset = -1,
   vdesc_proc_offset = -1, vdesc_componentname_offset = -1,
   vdesc_transports = 0x0}
(kgdb) p *((struct vop_revoke_args *) 0xd7c9ef04)->a_vp
$2 = {v_flag = 8, v_usecount = 1, v_writecount = 0, v_holdcnt = 0,
   v_id = 18538, v_mount = 0x0, v_op = 0xc1de6500, v_freelist = {
     tqe_next = 0x0, tqe_prev = 0xd660a29c}, v_nmntvnodes = {
     tqe_next = 0x0, tqe_prev = 0xd7c59824}, v_cleanblkhd = {
     tqh_first = 0x0, tqh_last = 0xd7cacb6c}, v_dirtyblkhd = {
     tqh_first = 0x0, tqh_last = 0xd7cacb74}, v_synclist = {
     le_next = 0x0, le_prev = 0x0}, v_numoutput = 0, v_type = VBAD,
   v_un = {vu_mountedhere = 0x0, vu_socket = 0x0, vu_spec = {
       vu_specinfo = 0x0, vu_specnext = {sle_next = 0x0}},
     vu_fifoinfo = 0x0}, v_lease = 0x0, v_lastw = 0, v_cstart = 0,
   v_lasta = 0, v_clen = 0, v_object = 0x0, v_interlock = {
     lock_data = 0}, v_vnlock = 0x0, v_tag = VT_NON, v_data = 0x0,
   v_cache_src = {lh_first = 0x0}, v_cache_dst = {tqh_first = 0x0,
     tqh_last = 0xd7cacbc0}, v_dd = 0xd7cacb40, v_ddid = 0,
   v_pollinfo = {vpi_lock = {lock_data = 0}, vpi_selinfo = {si_pid = 0,
       si_note = {slh_first = 0x0}, si_flags = 0}, vpi_events = 0,
     vpi_revents = 0}, v_vxproc = 0x0}
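
The dump above shows a vnode that has already been cleaned (v_type = VBAD, 
vu_specinfo = 0x0) being handed to the revoke path.  One possible defensive 
check can be sketched in userland as follows.  This is a minimal sketch, not 
a kernel patch: the mock_ types and mock_revoke() are hypothetical stand-ins, 
and it only shows the idea of refusing to follow the device pointer when the 
vnode is VBAD or the pointer is NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; illustration only. */
enum mock_vtype { MOCK_VCHR, MOCK_VBAD };

struct mock_specinfo {
	int si_alias_count;	/* stands in for the per-device alias list */
};

struct mock_vnode {
	enum mock_vtype v_type;
	struct mock_specinfo *v_specinfo;
};

/*
 * Returns the number of aliases that would be revoked, or -1 if the
 * vnode is already dead and there is nothing to walk.
 */
static int
mock_revoke(struct mock_vnode *vp)
{
	struct mock_specinfo *dev;

	assert(vp != NULL);
	dev = vp->v_specinfo;
	/* A VBAD vnode or a NULL device pointer must not be dereferenced. */
	if (vp->v_type == MOCK_VBAD || dev == NULL)
		return -1;
	return dev->si_alias_count;
}
```

Whether such a check belongs in vop_revoke itself or in the caller (exit1 in 
the backtrace) is a separate question; the sketch only shows the condition 
under which the dereference is unsafe.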


Any suggestions?  The crash seems to be triggered when ange-ftp mode in 
emacs is left sitting for hours, with neither it nor emacs doing anything.  
I assume exit() is called for the ftp process because the remote side hung 
up.  The remote side hanging up isn't enough by itself to trigger it, 
though: it often hangs up after a few minutes and ange-ftp simply 
reconnects.  The crash seems to happen only after the process has sat idle 
for a long time.  The machine is relatively idle, too, so it is quite 
unlikely that any swapping is involved.

-Michael





