Date: Sat, 31 Oct 2009 14:21:07 -0700
From: Pyun YongHyeon <pyunyh@gmail.com>
To: Norbert Papke <npapke@acm.org>
Cc: freebsd-stable@freebsd.org
Subject: Re: 7.2 Stable Crash - possibly related to if_re
Message-ID: <20091031212107.GC17243@michelle.cdnetworks.com>
In-Reply-To: <200910301823.51274.npapke@acm.org>
References: <200910292156.19845.npapke@acm.org> <20091030165451.GA17243@michelle.cdnetworks.com> <200910301823.51274.npapke@acm.org>

--2oS5YaxWCcQjTEyO
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Fri, Oct 30, 2009 at 06:23:51PM -0700, Norbert Papke wrote:
> On October 30, 2009, Pyun YongHyeon wrote:
> > On Thu, Oct 29, 2009 at 09:56:19PM -0700, Norbert Papke wrote:
> > > This occurred shortly after "scp"ing from a VirtualBox VM to the host.
> > > The file transfer got stuck.  The "re" interface stopped working.
> > > Shortly afterwards, the host crashed.  The "re" interface was used by the
> > > host, the guest was using a different NIC in bridged mode.
> > >
> > >
> > > FreeBSD proven.lan 7.2-STABLE FreeBSD 7.2-STABLE #5 r198666: Thu Oct 29
> > > 18:36:57 PDT 2009
> > >
> > > Fatal trap 12: page fault while in kernel mode
> > > cpuid = 0; apic id = 00
> > > fault virtual address = 0x18
> >
> > It looks like a NULL pointer dereference, possibly mbuf related
> > one.
> >
> > > fault code = supervisor write data, page not present
> > > instruction pointer = 0x8:0xffffffff80d476ee
> > > stack pointer = 0x10:0xffffff8000078ae0
> > > frame pointer = 0x10:0xffffff8000078b40
> > > code segment = base 0x0, limit 0xfffff, type 0x1b
> > >              = DPL 0, pres 1, long 1, def32 0, gran 1
> > > processor eflags = interrupt enabled, resume, IOPL = 0
> > > current process = 18 (swi5: +)
> > > Physical memory: 8177 MB
> > >
> > >
> > > #9  0xffffffff807e710e in calltrap ()
> > >     at /usr/public/freebsd/sources/stable/sys/amd64/amd64/exception.S:218
> > > #10 0xffffffff80d476ee in re_rxeof () from /boot/kernel/if_re.ko
> >
> > Hmm, I think there is a missing information here. Not sure where it
> > dereferenced a NULL pointer in re_rxeof().
>
> >> #11 0xffffffff80d4a481 in re_int_task (arg=Variable "arg" is not available.
> >> )
> >> >     at /usr/public/freebsd/sources/stable/sys/modules/re/../../dev/re/if_re.c:2191
>
> I am not sure how much I trust frame 10.  The instruction
> at "0xffffffff80d476ee" is the one after the "retq" from re_rxeof().  Frame
> 11 seems OK to me.  The "struct rl_softc*", in particular, looks plausible
> but I don't know enough to say for sure.
>
> > Because that this is the
> > first report for NULL pointer dereference in Rx handler I need more
> > information how to reproduce it with minimal configuration. Can you
> > also reproduce the issues without virtual box?
>
> I am trying but no luck so far.
>
> > By chance, did you stop the re0 interface with ifconfig when you
> > noticed the file transfer got stuck?
>
> It is possible.  I had it happen twice.  The first time I definitely tried
> to "down" re.  I cannot recall what I did the second time.  The crash dump is
> from the second time.
>

Ok, then would you try attached patch?

--2oS5YaxWCcQjTEyO
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="re.rxeof.patch"

Index: sys/dev/re/if_re.c
===================================================================
--- sys/dev/re/if_re.c	(revision 198686)
+++ sys/dev/re/if_re.c	(working copy)
@@ -1817,6 +1817,8 @@
 
 	for (i = sc->rl_ldata.rl_rx_prodidx; maxpkt > 0;
 	    i = RL_RX_DESC_NXT(sc, i)) {
+		if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0)
+			break;
 		cur_rx = &sc->rl_ldata.rl_rx_list[i];
 		rxstat = le32toh(cur_rx->rl_cmdstat);
 		if ((rxstat & RL_RDESC_STAT_OWN) != 0)

--2oS5YaxWCcQjTEyO--
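
For readers following the thread, here is a rough user-space sketch of the
idea behind the patch. It is not driver code. It only shows a receive loop
that re-checks a "running" flag at the top of every iteration, so that it
stops touching ring state once the interface has been brought down partway
through. Everything in it (fake_softc, fake_rxeof, DRV_RUNNING, DESC_OWN,
RING_SIZE, stop_after) is a hypothetical stand-in rather than a name from
if_re.c, and the assumption that the interface can be stopped while the
loop is still running is my reading of the patch, not something the thread
confirms.

```c
#include <assert.h>

#define RING_SIZE	8		/* hypothetical ring size */
#define DRV_RUNNING	0x1u		/* stand-in for IFF_DRV_RUNNING */
#define DESC_OWN	0x80000000u	/* stand-in for RL_RDESC_STAT_OWN */

/* Hypothetical, heavily simplified stand-in for the driver softc. */
struct fake_softc {
	unsigned int	drv_flags;
	unsigned int	rx_stat[RING_SIZE];
	int		rx_prodidx;
};

/*
 * Walk the ring and "receive" descriptors the host owns.  stop_after
 * simulates the interface being brought down once that many packets
 * have been handled.  Returns the number of descriptors handled.
 */
static int
fake_rxeof(struct fake_softc *sc, int stop_after)
{
	int i, maxpkt = RING_SIZE, handled = 0;

	assert(sc->rx_prodidx >= 0 && sc->rx_prodidx < RING_SIZE);

	for (i = sc->rx_prodidx; maxpkt > 0; i = (i + 1) % RING_SIZE) {
		/* The guard the patch adds: bail out if no longer running. */
		if ((sc->drv_flags & DRV_RUNNING) == 0)
			break;
		/* Descriptor still owned by the (simulated) hardware. */
		if ((sc->rx_stat[i] & DESC_OWN) != 0)
			break;
		/* "Process" the packet and hand the descriptor back. */
		sc->rx_stat[i] |= DESC_OWN;
		handled++;
		maxpkt--;
		/* Simulate the interface being stopped during processing. */
		if (handled == stop_after)
			sc->drv_flags &= ~DRV_RUNNING;
	}
	sc->rx_prodidx = i;
	return (handled);
}
```

Calling fake_rxeof(&sc, 3) on a freshly initialised ring with DRV_RUNNING
set handles three descriptors and then leaves the loop through the new
check. Without that check it would carry on through the rest of the ring
after the simulated stop.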
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20091031212107.GC17243>