Date: Thu, 11 Jul 2002 21:15:05 -0400
From: Michael Adler <lists-ma@tapil.com>
To: hackers@FreeBSD.ORG
Subject: Bad vnode causing crash in 4.x
Message-ID: <5.1.1.6.0.20020711210231.00ace8c0@shlc0003.shr.intel.com>
I've been suffering infrequent system crashes when running ange-ftp under emacs for some time and finally have a crash dump from a kernel with symbols. This crash dump was on 4.6-stable, though I've seen the bug off and on for at least a year. All the crashes have the following characteristic:

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x10
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc01ae331
stack pointer           = 0x10:0xd7c9eed8
frame pointer           = 0x10:0xd7c9eedc
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 36349 (ftp)
interrupt mask          = none
trap number             = 12
panic: page fault

(kgdb) where
#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:487
#1  0xc017fd2f in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:316
#2  0xc0180154 in poweroff_wait (junk=0xc02f668c, howto=-1070636625)
    at /usr/src/sys/kern/kern_shutdown.c:595
#3  0xc02a3cce in trap_fatal (frame=0xd7c9ee98, eva=16)
    at /usr/src/sys/i386/i386/trap.c:966
#4  0xc02a39a1 in trap_pfault (frame=0xd7c9ee98, usermode=0, eva=16)
    at /usr/src/sys/i386/i386/trap.c:859
#5  0xc02a358b in trap (frame={tf_fs = -676790256, tf_es = -676790256,
      tf_ds = -674693104, tf_edi = -676731680, tf_esi = 1,
      tf_ebp = -674631972, tf_isp = -674631996, tf_ebx = 0,
      tf_edx = -674631932, tf_ecx = 47, tf_eax = -674575552,
      tf_trapno = 12, tf_err = 0, tf_eip = -1071979727, tf_cs = 8,
      tf_eflags = 66118, tf_esp = -1038362816, tf_ss = -674631960})
    at /usr/src/sys/i386/i386/trap.c:458
#6  0xc01ae331 in vop_revoke (ap=0xd7c9ef04)
    at /usr/src/sys/kern/vfs_subr.c:1965
#7  0xc01aace9 in vop_defaultop (ap=0xd7c9ef04)
    at /usr/src/sys/kern/vfs_default.c:150
#8  0xc0178381 in exit1 (p=0xd7a9e4e0, rv=0) at vnode_if.h:500
#9  0xc01780e4 in exit1 (p=0xd7a9e4e0, rv=0)
    at /usr/src/sys/kern/kern_exit.c:103
#10 0xc02a3f7d in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
      tf_edi = 0, tf_esi = -1, tf_ebp = -1077939936, tf_isp = -674631724,
      tf_ebx = 672471396, tf_edx = 672470976, tf_ecx = 1, tf_eax = 1,
      tf_trapno = 7, tf_err = 2, tf_eip = 672154536, tf_cs = 31,
      tf_eflags = 647, tf_esp = -1077939980, tf_ss = 47})
    at /usr/src/sys/i386/i386/trap.c:1167
#11 0xc0297f05 in Xint0x80_syscall ()
Cannot access memory at address 0xbfbff120.

The final problem before the crash is a reference to page 0 in vop_revoke because dev is 0. The vop_revoke_args struct (ap) appears to be filled in, but v_type is VBAD and a_vp->v_un.vu_spec.vu_specinfo (which is assigned to dev in vop_revoke) is 0. Here is the whole data structure:

(kgdb) p *((struct vop_revoke_args *) 0xd7c9ef04)->a_desc
$1 = {vdesc_offset = 47, vdesc_name = 0xc02bff86 "vop_revoke",
  vdesc_flags = 0, vdesc_vp_offsets = 0xc0300664,
  vdesc_vpp_offset = -1, vdesc_cred_offset = -1,
  vdesc_proc_offset = -1, vdesc_componentname_offset = -1,
  vdesc_transports = 0x0}

(kgdb) p *((struct vop_revoke_args *) 0xd7c9ef04)->a_vp
$2 = {v_flag = 8, v_usecount = 1, v_writecount = 0, v_holdcnt = 0,
  v_id = 18538, v_mount = 0x0, v_op = 0xc1de6500,
  v_freelist = {tqe_next = 0x0, tqe_prev = 0xd660a29c},
  v_nmntvnodes = {tqe_next = 0x0, tqe_prev = 0xd7c59824},
  v_cleanblkhd = {tqh_first = 0x0, tqh_last = 0xd7cacb6c},
  v_dirtyblkhd = {tqh_first = 0x0, tqh_last = 0xd7cacb74},
  v_synclist = {le_next = 0x0, le_prev = 0x0},
  v_numoutput = 0, v_type = VBAD,
  v_un = {vu_mountedhere = 0x0, vu_socket = 0x0,
    vu_spec = {vu_specinfo = 0x0, vu_specnext = {sle_next = 0x0}},
    vu_fifoinfo = 0x0},
  v_lease = 0x0, v_lastw = 0, v_cstart = 0, v_lasta = 0, v_clen = 0,
  v_object = 0x0, v_interlock = {lock_data = 0}, v_vnlock = 0x0,
  v_tag = VT_NON, v_data = 0x0, v_cache_src = {lh_first = 0x0},
  v_cache_dst = {tqh_first = 0x0, tqh_last = 0xd7cacbc0},
  v_dd = 0xd7cacb40, v_ddid = 0,
  v_pollinfo = {vpi_lock = {lock_data = 0},
    vpi_selinfo = {si_pid = 0, si_note = {slh_first = 0x0}, si_flags = 0},
    vpi_events = 0, vpi_revents = 0},
  v_vxproc = 0x0}

Any suggestions?
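To make the link between dev being 0 and the fault address 0x10 concrete, here is a minimal user-space sketch in C. It is not the kernel code: every toy_* name is hypothetical, the struct layouts are invented for illustration, and which real member of the device structure sits at offset 0x10 is not established here. It only shows that reading a member through a null pointer faults at that member's offset, and what a defensive check on v_type and the device pointer might look like.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real 4.x kernel definitions. */
enum toy_vtype { TOY_VCHR, TOY_VBLK, TOY_VBAD };

struct toy_dev {
    char  pad[0x10];   /* filler so the next member lands at offset 0x10 */
    void *toy_field;   /* hypothetical member read by the revoke path */
};

struct toy_vnode {
    enum toy_vtype  v_type;
    struct toy_dev *v_rdev;   /* 0 in the crash dump above */
};

/*
 * Address that would be touched when toy_field is read through a null
 * toy_dev pointer: the null base (0) plus the member's offset.
 */
size_t toy_fault_address(void)
{
    return offsetof(struct toy_dev, toy_field);
}

/*
 * Guarded revoke: returns -1 instead of dereferencing when the vnode is
 * marked bad or carries no device pointer, 0 otherwise.
 */
int toy_revoke(const struct toy_vnode *vp)
{
    assert(vp != NULL);
    if (vp->v_type == TOY_VBAD || vp->v_rdev == NULL)
        return -1;

    /* Safe here: the device pointer has been checked. */
    const void *first = vp->v_rdev->toy_field;
    (void)first;
    return 0;
}
```

A check like this would only hide the symptom. It does not explain how a VBAD vnode with a usecount of 1 reached the revoke call during exit in the first place.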
This seems to be triggered when ange-ftp mode in emacs is left sitting idle for hours, with neither it nor emacs doing anything. I assume exit() is called for the ftp process because the remote side hung up. Just having the remote side hang up isn't enough to trigger it, though: the remote side often hangs up after a few minutes and ange-ftp simply reconnects. The crash seems to happen only after the process has sat around for a long time.

The machine is relatively idle, too, so the probability that any swap is involved is quite low.

-Michael