From owner-freebsd-current@FreeBSD.ORG Thu Dec 29 01:18:37 2005
Message-ID: <43B339B7.6070809@errno.com>
Date: Wed, 28 Dec 2005 17:19:51 -0800
From: Sam Leffler
To: Andrea Campi
Cc: current@freebsd.org
Subject: Re: Panic and LOR on -CURRENT with ath

Andrea Campi wrote:
> On Wed, Dec 21, 2005 at 10:13:41PM -0800, Sam Leffler wrote:
>
>>The info you want is gone by the time the crash happens. Last time I
>>chased a similar problem I did some private hacks to write-protect mbufs
>>to catch unexpected modification.
>>You might try removing ipfw or using
>>an alternate packet filter if that's feasible. I wouldn't be surprised
>>if this is related to ipfw and/or divert sockets.
>
>
> OK, I'm running with pf right now, and that particular panic went away.
>
> However, I had a few others... the first one is admittedly old, and it
> might have disappeared with the last cvsup (dec 27):
>
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0xdeadc0de
> fault code              = supervisor read, page not present
> instruction pointer     = 0x20:0xc05288e6
> stack pointer           = 0x28:0xc5b70910
> frame pointer           = 0x28:0xc5b7091c
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 270 (natd)
> [thread pid 270 tid 100043 ]
> Stopped at ieee80211_find_txnode+0x36: testb $0x1,0(%eax)
> db> bt
> Tracing pid 270 tid 100043 td 0xc0d5e300
> ieee80211_find_txnode(c0d681ac,deadc0de,c0d690f8,c0d681ac,c0d71ef4) at ieee80211_find_txnode+0x36
> ath_start(c0d65000,25000,be,0,c1040842) at ath_start+0xe52
> ether_output_frame(c0d65000,c1040800,c1040848,0,0) at ether_output_frame+0x226
> ether_output(c0d65000,c1040800,c5b70a80,c0fdda50) at ether_output+0x2d3
> ip_output(c1040800,0,c5b70a7c,1,0,0) at ip_output+0xa7e
> ip_forward(c06376e0,0,c05ef2e7,6d9,c06376e0) at ip_forward+0x120
> ip_input(c1040800) at ip_input+0x8d5
> div_send(c0fda000,0,c1040800,c0d91850,0) at div_send+0x18b
> sosend(c0fda000,c0d91850,c5b70c40,c1040800,0,0,c0d5e300) at sosend+0x5c5
> kern_sendit(c0d5e300,3,c5b70cbc,0,0) at kern_sendit+0xbe
> sendit(c5b70cbc,0,bfbdeca0,0,c0d91850) at sendit+0x41
> sendto(c0d5e300,c5b70d04,6,f65,296) at sendto+0x47
> syscall(3b,3b,bfbf003b,1,b0) at syscall+0x110
> Xint0x80_syscall() at Xint0x80_syscall+0x1f

Another mbuf free'd out from under you (the 0xdeadc0de param to
find_txnode is the mac address in the 802.11 header, taken from the mbuf
contents).
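
To illustrate why a fault address of 0xdeadc0de points at a freed buffer:
debugging kernels commonly overwrite freed memory with a recognizable fill
pattern, so anything later read out of that memory carries the pattern.
The sketch below is a user-space toy in C, not kernel code. The names
struct toy_buf, toy_init, toy_free and looks_freed are made up for the
example, and it assumes the fill pattern is 0xdeadc0de, as seen in the
trap above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill pattern assumed for freed memory (matches the fault address above). */
#define TOY_POISON 0xdeadc0deU
#define TOY_WORDS  8

/* Hypothetical stand-in for a packet buffer; NOT the kernel's struct mbuf. */
struct toy_buf {
	uint32_t words[TOY_WORDS];
};

/* Fill a live buffer with caller-supplied data. */
void
toy_init(struct toy_buf *b, uint32_t value)
{
	assert(b != NULL);
	for (size_t i = 0; i < TOY_WORDS; i++)
		b->words[i] = value;
}

/* "Free" the buffer by overwriting every word with the poison pattern. */
void
toy_free(struct toy_buf *b)
{
	assert(b != NULL);
	for (size_t i = 0; i < TOY_WORDS; i++)
		b->words[i] = TOY_POISON;
}

/* Return 1 if the first word carries the poison pattern, else 0. */
int
looks_freed(const struct toy_buf *b)
{
	assert(b != NULL);
	return (b->words[0] == TOY_POISON);
}
```

A caller that still holds a pointer to the buffer after toy_free() reads
0xdeadc0de from it. If that value is then used as an address, the result
is a page fault at 0xdeadc0de like the one in the trace.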
>
>
> This one happened as I restarted dhclient:
>
>
> panic: bus_dmamap_load_mbuf_sg: no mbuf packet header!
> KDB: stack backtrace:
> panic(c06050ba,c060205c,c0554ad9,c1225846,c5708bfa) at panic+0xef
> bus_dmamap_load_mbuf_sg(c0d21d80,0,c1225800,c0d733f0,c0d733d0,1) at bus_dmamap_load_mbuf_sg+0x4ec
> ath_start(c0d65000,6c,c0ce771c,0,c0614d47) at ath_start+0x26f
> taskqueue_run(c0ce7700,0,c0d0e624,0,c04b84c0) at taskqueue_run+0x81
> ithread_loop(c0ce7680,c5708d38,c0ce7680,c04b84c0,0) at ithread_loop+0x175
> fork_exit(c04b84c0,c0ce7680,c5708d38) at fork_exit+0x83
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xc5708d6c, ebp = 0 ---
> KDB: enter: panic
> [thread pid 31 tid 100017 ]

Probably the same thing; every mbuf chain has a packet header because the
802.11 encapsulation guarantees it.

>
>
> This is 100% reproducible: switching from 11b to 11g (with a few associated
> nodes, most of which 11b only) results in this panic:
>
>
> gw0# ifconfig ath0 mode 11g
> panic: bogus long slot station count 0

This is a bug; I'll merge the correct code from my p4 tree. Until then,
mark the ap down before changing the mode, then bring it back up.

> KDB: stack backtrace:
> panic(c061e2a3,0,c0d699a8,c0f1d000,b59) at panic+0xef
> ieee80211_node_join(c0d691ac,c0f1d000,c0d699ac,0,c061dd08) at ieee80211_node_join
> ieee80211_iterate_nodes(c0d699a8,c0557ce0,c0d691ac) at ieee80211_iterate_nodes+0xbc
> ieee80211_newstate(c0d691ac,0,ffffffff) at ieee80211_newstate+0x4e6
> ath_newstate(c0d691ac,0,ffffffff,c0d6b000,c0d69000) at ath_newstate+0x2e4
> ath_stop_locked(c0d69d3c,8,c06089ec,356,c0d69d3c) at ath_stop_locked+0xab
> ath_init(c0d69000,20280,c5e20a60,c0539d66,c0d65000) at ath_init+0x4e
> ath_media_change(c0d65000,c066a358,c5e20a64,c0cefd80,c0d65000) at ath_media_change+0x3e
> ifmedia_ioctl(c0d65000,c0d92ca0,c0d69aac,c0206937,0) at ifmedia_ioctl+0x1f6
> ieee80211_ioctl(c0d691ac,c0206937,c0d92ca0) at ieee80211_ioctl+0x16a
> ath_ioctl(c0d65000,c0206937,c0d92ca0) at ath_ioctl+0xb2
> ifhwioctl(c0d92ca0,c1110a80,c0d65000,c0fdf6f4,c0206937) at ifhwioctl+0x113
> ifioctl(c0fdf6f4,c0206937,c0d92ca0,c1110a80) at ifioctl+0x5c
> soo_ioctl(c0deca68,c0206937,c0d92ca0,c0ded780,c1110a80) at soo_ioctl+0x29d
> ioctl(c1110a80,c5e20d04,3,c,246) at ioctl+0xfa
> syscall(3b,3b,3b,8057d40,0) at syscall+0x110
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (54, FreeBSD ELF32, ioctl), eip = 0x2813f71f, esp = 0xbfbfe4cc, ebp = 0xbfbfe4e8 ---
> KDB: enter: panic
> [thread pid 1615 tid 100068 ]
> Stopped at kdb_enter+0x2c: leal 0(%esi),%esi
>
>
> I'll keep testing but things look nicer despite these panics (as they only
> happen during non routine things).

The same underlying problem appears to still be around.

	Sam