Date:      Tue, 9 Jan 2007 14:09:29 -0500
From:      John Baldwin <jhb@freebsd.org>
To:        Sven Willenberger <sven@dmv.com>
Cc:        stable@freebsd.org, freebsd-amd64@freebsd.org
Subject:   Re: Panic in 6.2-PRERELEASE with bge on amd64
Message-ID:  <200701091409.29828.jhb@freebsd.org>
In-Reply-To: <1168365209.29047.23.camel@lanshark.dmv.com>
References:  <1168211205.22629.6.camel@lanshark.dmv.com> <200701091150.15274.jhb@freebsd.org> <1168365209.29047.23.camel@lanshark.dmv.com>

On Tuesday 09 January 2007 12:53, Sven Willenberger wrote:
> On Tue, 2007-01-09 at 11:50 -0500, John Baldwin wrote:
> > On Tuesday 09 January 2007 09:37, Sven Willenberger wrote:
> > > On Tue, 2007-01-09 at 12:50 +1100, Bruce Evans wrote:
> > > > On Mon, 8 Jan 2007, Sven Willenberger wrote:
> > > > 
> > > > > On Mon, 2007-01-08 at 16:06 +1100, Bruce Evans wrote:
> > > > >> On Sun, 7 Jan 2007, Sven Willenberger wrote:
> > > > 
> > > > >>> The short and dirty of the dump:
> > > > >>> ...
> > > > >>> --- trap 0xc, rip = 0xffffffff801d5f17, rsp = 0xffffffffb371ab50, rbp = 0xffffffffb371aba0 ---
> > > > >>> bge_rxeof() at bge_rxeof+0x3b7
> > > > >>
> > > > >> What is the instruction here?
> > > > >
> > > > > I will do my best to ferret out the information you need. For the
> > > > > bge_rxeof() at bge_rxeof+0x3b7 line, the instruction is:
> > > > >
> > > > > 0xffffffff801d5f17 <bge_rxeof+951>:     mov    %r15,0x28(%r14)
> > > > > ...
> > > > >> Looks like a null pointer panic anyway.  I guess the instruction is
> > > > >> movl to/from 0x28(%reg) where %reg is a null pointer.
> > > > >>
> > > > >
> > > > > from the above lines, apparently %r14 is null then.
> > > > 
> > > > Yes.  It's a bit surprising that the access is a write.
> > > > 
> > > > >>> ...
> > > > >>> #8  0xffffffff801db818 in bge_intr (xsc=0x0) at /usr/src/sys/dev/bge/if_bge.c:2707
> > > > >>
> > > > >> What is the statement here?  It presumably follows a null pointer and only
> > > > >> the expression for the pointer is interesting.  xsc is already null but
> > > > >> that is probably a bug in gdb, or the result of excessive optimization.
> > > > >> Compiling kernels with -O2 has little effect except to break debugging.
> > > > >
> > > > > the block of code from if_bge.c:
> > > > >
> > > > >   2705         if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
> > > > >   2706                 /* Check RX return ring producer/consumer. */
> > > > >   2707                 bge_rxeof(sc);
> > > > >   2708
> > > > >   2709                 /* Check TX ring producer/consumer. */
> > > > >   2710                 bge_txeof(sc);
> > > > >   2711         }
> > > > 
> > > > Oops.  I should have asked for the statement in bge_rxeof().
> > > 
> > > #7  0xffffffff801d5f17 in bge_rxeof (sc=0xffffffff8836b000) at /usr/src/sys/dev/bge/if_bge.c:2528
> > > 2528                    m->m_pkthdr.len = m->m_len = cur_rx->bge_len - ETHER_CRC_LEN;
> > > 
> > > (where m is defined as:
> > > 2449                 struct mbuf             *m = NULL;
> > > )
> > 
> > It's assigned earlier in between those two places.  Can you 'p rxidx' as well
> > as 'p sc->bge_cdata.bge_rx_std_chain[rxidx]' and 'p
> > sc->bge_cdata.bge_rx_jumbo_chain[rxidx]'?  Also, are you using jumbo frames
> > at all?
> > 
> 
> (kgdb) p rxidx
> $1 = 499
> (kgdb) p sc->bge_cdata.bge_rx_std_chain[rxidx]
> $2 = (struct mbuf *) 0xffffff0097a27900
> (kgdb) p sc->bge_cdata.bge_rx_jumbo_chain[rxidx]
> $3 = (struct mbuf *) 0x0
> 
> And no, I am not using jumbo frames:
> bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>         options=1b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING>

Did you do a 'p m' to verify that m is NULL?  If you can reproduce this, I'd
add some KASSERTs where it fetches the mbuf out of the descriptor data to
see if m is NULL.
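
A minimal userland sketch of that kind of check, not the actual if_bge.c
code.  Everything in it is a hypothetical stand-in: struct pkt plays the role
of struct mbuf, struct rx_ring the role of the per-ring chain array, and
RX_RING_CNT is an assumed ring size.  In the kernel, the empty-slot branch is
roughly where a KASSERT(m != NULL, ("...")) would go, so an INVARIANTS kernel
would panic with a message at the fetch instead of faulting later on the
first dereference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define RX_RING_CNT 512            /* hypothetical ring size */

struct pkt {                       /* stand-in for struct mbuf */
	int len;
};

struct rx_ring {                   /* stand-in for the rx chain array */
	struct pkt *chain[RX_RING_CNT];
	unsigned    empty_hits;    /* times an empty slot was fetched */
};

/*
 * Take the buffer out of slot idx and clear the slot.
 * Returns NULL, without dereferencing anything, if idx is out of
 * range or the slot is already empty; empty slots are counted and
 * reported so the bad index is visible at the point of the fetch.
 */
static struct pkt *
rx_ring_take(struct rx_ring *r, size_t idx)
{
	struct pkt *m;

	assert(r != NULL);
	if (idx >= RX_RING_CNT)
		return (NULL);

	m = r->chain[idx];
	if (m == NULL) {
		/* Kernel analogue: KASSERT(m != NULL, ("...")) here. */
		r->empty_hits++;
		fprintf(stderr, "rx slot %zu is empty\n", idx);
		return (NULL);
	}

	r->chain[idx] = NULL;
	return (m);
}
```

A caller would test the return value before touching the length fields.
Fetching the same slot twice makes the second call return NULL and bump
empty_hits, which is the situation the check is meant to catch.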

-- 
John Baldwin


