From: Sven Willenberger
To: John Baldwin
Cc: stable@freebsd.org, Bruce Evans, freebsd-amd64@freebsd.org
Subject: Re: Panic in 6.2-PRERELEASE with bge on amd64
Date: Tue, 09 Jan 2007 16:18:38 -0500
Message-Id: <1168377518.29047.27.camel@lanshark.dmv.com>
In-Reply-To: <200701091409.29828.jhb@freebsd.org>
References: <1168211205.22629.6.camel@lanshark.dmv.com> <200701091150.15274.jhb@freebsd.org> <1168365209.29047.23.camel@lanshark.dmv.com> <200701091409.29828.jhb@freebsd.org>
List-Id: Production branch of FreeBSD source code

On Tue, 2007-01-09 at 14:09 -0500, John Baldwin wrote:
> On Tuesday 09 January 2007 12:53, Sven Willenberger wrote:
> > On Tue, 2007-01-09 at 11:50 -0500, John Baldwin wrote:
> > > On Tuesday 09 January 2007 09:37, Sven Willenberger wrote:
> > > > On Tue, 2007-01-09 at 12:50 +1100, Bruce Evans wrote:
> > > > > On Mon, 8 Jan 2007, Sven Willenberger wrote:
> > > > >
> > > > > > On Mon, 2007-01-08 at 16:06 +1100, Bruce Evans wrote:
> > > > > >> On Sun, 7 Jan 2007, Sven Willenberger wrote:
> > > > > >>
> > > > > >>> The short and dirty of the dump:
> > > > > >>> ...
> > > > > >>> --- trap 0xc, rip = 0xffffffff801d5f17, rsp = 0xffffffffb371ab50, rbp = 0xffffffffb371aba0 ---
> > > > > >>> bge_rxeof() at bge_rxeof+0x3b7
> > > > > >>
> > > > > >> What is the instruction here?
> > > > > >
> > > > > > I will do my best to ferret out the information you need. For the
> > > > > > bge_rxeof() at bge_rxeof+0x3b7 line, the instruction is:
> > > > > >
> > > > > > 0xffffffff801d5f17 : mov %r15,0x28(%r14)
> > > > > > ...
> > > > > >> Looks like a null pointer panic anyway. I guess the instruction is
> > > > > >> movl to/from 0x28(%reg) where %reg is a null pointer.
> > > > > >>
> > > > > >
> > > > > > from the above lines, apparently %r14 is null then.
> > > > >
> > > > > Yes. It's a bit surprising that the access is a write.
> > > > >
> > > > > >>> ...
> > > > > >>> #8 0xffffffff801db818 in bge_intr (xsc=0x0) at /usr/src/sys/dev/bge/if_bge.c:2707
> > > > > >>
> > > > > >> What is the statement here? It presumably follows a null pointer and only
> > > > > >> the expression for the pointer is interesting. xsc is already null but
> > > > > >> that is probably a bug in gdb, or the result of excessive optimization.
> > > > > >> Compiling kernels with -O2 has little effect except to break debugging.
> > > > > >
> > > > > > the block of code from if_bge.c:
> > > > > >
> > > > > > 2705 if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
> > > > > > 2706         /* Check RX return ring producer/consumer. */
> > > > > > 2707         bge_rxeof(sc);
> > > > > > 2708
> > > > > > 2709         /* Check TX ring producer/consumer. */
> > > > > > 2710         bge_txeof(sc);
> > > > > > 2711 }
> > > > >
> > > > > Oops. I should have asked for the statement in bge_rxeof().
> > > >
> > > > #7 0xffffffff801d5f17 in bge_rxeof (sc=0xffffffff8836b000) at /usr/src/sys/dev/bge/if_bge.c:2528
> > > > 2528 m->m_pkthdr.len = m->m_len = cur_rx->bge_len - ETHER_CRC_LEN;
> > > >
> > > > (where m is defined as:
> > > > 2449 struct mbuf *m = NULL;
> > > > )
> > >
> > > It's assigned earlier in between those two places. Can you 'p rxidx' as well
> > > as 'p sc->bge_cdata.bge_rx_std_chain[rxidx]' and 'p
> > > sc->bge_cdata.bge_rx_jumbo_chain[rxidx]'? Also, are you using jumbo frames
> > > at all?
> > >
> >
> > (kgdb) p rxidx
> > $1 = 499
> > (kgdb) p sc->bge_cdata.bge_rx_std_chain[rxidx]
> > $2 = (struct mbuf *) 0xffffff0097a27900
> > (kgdb) p sc->bge_cdata.bge_rx_jumbo_chain[rxidx]
> > $3 = (struct mbuf *) 0x0
> >
> > And no, I am not using jumbo frames:
> > bge0: flags=8843 mtu 1500
> > options=1b
>
> Did you do a 'p m' to verify that m is NULL? If you can reproduce this, I'd
> add some KASSERT's where it fetches the mbuf out of the descriptor data to
> see if m is NULL.
>

At this spot, m is null:

(kgdb) p m
$3 = (struct mbuf *) 0x0

As far as adding some KASSERT's ... that goes beyond my rudimentary
knowledge of how to apply them here.
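
For anyone following the thread, the suggestion above can be sketched roughly
as follows. In the kernel, KASSERT(exp, (fmt, args...)) from sys/systm.h
panics with the given message when exp is false, and only in kernels built
with "options INVARIANTS". The code below is a userland stand-in, not the
actual if_bge.c code: struct mbuf_sketch, rx_fetch_sketch(), KASSERT_SKETCH
and RX_RING_CNT are hypothetical names made up for illustration, and the
stand-in macro logs and counts instead of panicking.

```c
#include <stddef.h>
#include <stdio.h>

#define RX_RING_CNT   512  /* hypothetical ring size, for this sketch only */
#define ETHER_CRC_LEN 4    /* length of the Ethernet CRC trailer */

/* Simplified stand-in for struct mbuf; only the two lengths used here. */
struct mbuf_sketch {
	int m_len;       /* models m->m_len */
	int pkthdr_len;  /* models m->m_pkthdr.len */
};

/* Number of failed checks seen so far (the real KASSERT would panic). */
static int kassert_failures;

/*
 * Userland stand-in for KASSERT(exp, (fmt, args...)).  The message is
 * passed in its own parentheses, as with the kernel macro.
 */
#define KASSERT_SKETCH(exp, msg) do {		\
	if (!(exp)) {				\
		kassert_failures++;		\
		printf msg;			\
		printf("\n");			\
	}					\
} while (0)

/*
 * Fetch the mbuf for ring slot rxidx and set its lengths.  The check sits
 * directly after the fetch, so a NULL slot is reported where it is read
 * rather than at the later store through the pointer.
 * Returns the payload length, or -1 if the slot was empty.
 */
static int
rx_fetch_sketch(struct mbuf_sketch **chain, int rxidx, int frame_len)
{
	struct mbuf_sketch *m = chain[rxidx];

	KASSERT_SKETCH(m != NULL,
	    ("%s: no mbuf in rx chain slot %d", __func__, rxidx));
	if (m == NULL)
		return (-1);

	chain[rxidx] = NULL;	/* the slot no longer owns the mbuf */
	m->pkthdr_len = m->m_len = frame_len - ETHER_CRC_LEN;
	return (m->m_len);
}
```

In the driver itself the equivalent would presumably be a single line of the
form KASSERT(m != NULL, ("...")) placed right after m is assigned from the
std or jumbo chain, so that an INVARIANTS kernel stops at the fetch instead
of at the store on line 2528.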