From owner-freebsd-current@FreeBSD.ORG  Thu Dec 29 01:18:37 2005
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
X-Original-To: current@freebsd.org
Delivered-To: freebsd-current@FreeBSD.ORG
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 16A6616A41F
	for <current@freebsd.org>; Thu, 29 Dec 2005 01:18:37 +0000 (GMT)
	(envelope-from sam@errno.com)
Received: from ebb.errno.com (ebb.errno.com [69.12.149.25])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 2372643D4C
	for <current@freebsd.org>; Thu, 29 Dec 2005 01:18:36 +0000 (GMT)
	(envelope-from sam@errno.com)
Received: from [10.0.0.248] (trouble.errno.com [10.0.0.248])
	(authenticated bits=0)
	by ebb.errno.com (8.12.9/8.12.6) with ESMTP id jBT1IXiE067224
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Wed, 28 Dec 2005 17:18:35 -0800 (PST) (envelope-from sam@errno.com)
Message-ID: <43B339B7.6070809@errno.com>
Date: Wed, 28 Dec 2005 17:19:51 -0800
From: Sam Leffler <sam@errno.com>
User-Agent: Mozilla Thunderbird 1.0.7 (X11/20051207)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Andrea Campi <andrea+freebsd_current@webcom.it>
References: <20051221191919.GB17950@webcom.it> <43A9B5EA.8030905@errno.com>
	<20051221215457.GE17950@webcom.it> <43AA4415.8070300@errno.com>
	<20051228102512.GV1779@webcom.it>
In-Reply-To: <20051228102512.GV1779@webcom.it>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: current@freebsd.org
Subject: Re: Panic and LOR on -CURRENT with ath
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 29 Dec 2005 01:18:37 -0000

Andrea Campi wrote:
> On Wed, Dec 21, 2005 at 10:13:41PM -0800, Sam Leffler wrote:
> 
>>The info you want is gone by the time the crash happens.  Last time I 
>>chased a similar problem I did some private hacks to write-protect mbufs 
>>to catch unexpected modification.  You might try removing ipfw or using 
>>an alternate packet filter if that's feasible.  I wouldn't be surprised 
>>if this is related to ipfw and/or divert sockets.
> 
> 
> OK, I'm running with pf right now, and that particular panic went away.
> 
> However, I had a few others... the first one is admittedly old, and it
> might have disappeared with the last cvsup (dec 27):
> 
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0xdeadc0de
> fault code              = supervisor read, page not present
> instruction pointer     = 0x20:0xc05288e6
> stack pointer           = 0x28:0xc5b70910
> frame pointer           = 0x28:0xc5b7091c
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 270 (natd)
> [thread pid 270 tid 100043 ]
> Stopped at      ieee80211_find_txnode+0x36:     testb   $0x1,0(%eax)
> db> bt
> Tracing pid 270 tid 100043 td 0xc0d5e300
> ieee80211_find_txnode(c0d681ac,deadc0de,c0d690f8,c0d681ac,c0d71ef4) at ieee80211_find_txnode+0x36
> ath_start(c0d65000,25000,be,0,c1040842) at ath_start+0xe52
> ether_output_frame(c0d65000,c1040800,c1040848,0,0) at ether_output_frame+0x226
> ether_output(c0d65000,c1040800,c5b70a80,c0fdda50) at ether_output+0x2d3
> ip_output(c1040800,0,c5b70a7c,1,0,0) at ip_output+0xa7e
> ip_forward(c06376e0,0,c05ef2e7,6d9,c06376e0) at ip_forward+0x120
> ip_input(c1040800) at ip_input+0x8d5
> div_send(c0fda000,0,c1040800,c0d91850,0) at div_send+0x18b
> sosend(c0fda000,c0d91850,c5b70c40,c1040800,0,0,c0d5e300) at sosend+0x5c5
> kern_sendit(c0d5e300,3,c5b70cbc,0,0) at kern_sendit+0xbe
> sendit(c5b70cbc,0,bfbdeca0,0,c0d91850) at sendit+0x41
> sendto(c0d5e300,c5b70d04,6,f65,296) at sendto+0x47
> syscall(3b,3b,bfbf003b,1,b0) at syscall+0x110
> Xint0x80_syscall() at Xint0x80_syscall+0x1f

Another mbuf free'd out from under you (the 0xdeadc0de param to find_txnode 
is the mac address in the 802.11 header, taken from the mbuf contents).
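
For anyone wondering why 0xdeadc0de points at a free'd buffer: here is a 
rough userland sketch, assuming a debugging allocator that fills memory 
with that pattern when it is released.  The names (toy_buf, toy_poison, 
toy_read_word, toy_looks_freed) are made up for illustration and this is 
not the kernel's mbuf code; it only shows that any field read from such a 
buffer after the free comes back as the pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed fill pattern; it matches the fault address in the trace above. */
#define TOY_POISON 0xdeadc0deU
#define TOY_WORDS  8

/* Hypothetical stand-in for a kernel buffer; not the real struct mbuf. */
struct toy_buf {
	uint32_t words[TOY_WORDS];
};

/* Overwrite every word with the pattern, as a debugging allocator might on free. */
static void
toy_poison(struct toy_buf *b)
{
	for (size_t i = 0; i < TOY_WORDS; i++)
		b->words[i] = TOY_POISON;
}

/* Read the 32-bit field at a byte offset, the way stale code would. */
static uint32_t
toy_read_word(const struct toy_buf *b, size_t offset)
{
	uint32_t w;

	assert(offset + sizeof(w) <= sizeof(b->words));
	memcpy(&w, (const unsigned char *)b->words + offset, sizeof(w));
	return (w);
}

/* Return 1 if a value read from a buffer matches the poison pattern. */
static int
toy_looks_freed(uint32_t w)
{
	return (w == TOY_POISON);
}
```

Only word-aligned reads return the pattern exactly; a read at any other 
offset returns a byte-rotated mix of it, so a check like toy_looks_freed 
is a hint rather than proof.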

> 
> 
> This one happened as I restarted dhclient:
> 
> 
> panic: bus_dmamap_load_mbuf_sg: no mbuf packet header!
> KDB: stack backtrace:
> panic(c06050ba,c060205c,c0554ad9,c1225846,c5708bfa) at panic+0xef
> bus_dmamap_load_mbuf_sg(c0d21d80,0,c1225800,c0d733f0,c0d733d0,1) at bus_dmamap_load_mbuf_sg+0x4ec
> ath_start(c0d65000,6c,c0ce771c,0,c0614d47) at ath_start+0x26f
> taskqueue_run(c0ce7700,0,c0d0e624,0,c04b84c0) at taskqueue_run+0x81
> ithread_loop(c0ce7680,c5708d38,c0ce7680,c04b84c0,0) at ithread_loop+0x175
> fork_exit(c04b84c0,c0ce7680,c5708d38) at fork_exit+0x83
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xc5708d6c, ebp = 0 ---
> KDB: enter: panic
> [thread pid 31 tid 100017 ]

Probably the same thing; every mbuf chain has a packet header because 
the 802.11 encapsulation guarantees it.

> 
> 
> This is 100% reproducible: switching from 11b to 11g (with a few associated
> nodes, most of which 11b only) results in this panic:
> 
> 
> gw0# ifconfig ath0 mode 11g
> panic: bogus long slot station count 0

This is a bug; I'll merge the correct code from my p4 tree.  Until then, 
mark the AP down before changing the mode, then bring it back up.

> KDB: stack backtrace:
> panic(c061e2a3,0,c0d699a8,c0f1d000,b59) at panic+0xef
> ieee80211_node_join(c0d691ac,c0f1d000,c0d699ac,0,c061dd08) at ieee80211_node_join
> ieee80211_iterate_nodes(c0d699a8,c0557ce0,c0d691ac) at ieee80211_iterate_nodes+0xbc
> ieee80211_newstate(c0d691ac,0,ffffffff) at ieee80211_newstate+0x4e6
> ath_newstate(c0d691ac,0,ffffffff,c0d6b000,c0d69000) at ath_newstate+0x2e4
> ath_stop_locked(c0d69d3c,8,c06089ec,356,c0d69d3c) at ath_stop_locked+0xab
> ath_init(c0d69000,20280,c5e20a60,c0539d66,c0d65000) at ath_init+0x4e
> ath_media_change(c0d65000,c066a358,c5e20a64,c0cefd80,c0d65000) at ath_media_change+0x3e
> ifmedia_ioctl(c0d65000,c0d92ca0,c0d69aac,c0206937,0) at ifmedia_ioctl+0x1f6
> ieee80211_ioctl(c0d691ac,c0206937,c0d92ca0) at ieee80211_ioctl+0x16a
> ath_ioctl(c0d65000,c0206937,c0d92ca0) at ath_ioctl+0xb2
> ifhwioctl(c0d92ca0,c1110a80,c0d65000,c0fdf6f4,c0206937) at ifhwioctl+0x113
> ifioctl(c0fdf6f4,c0206937,c0d92ca0,c1110a80) at ifioctl+0x5c
> soo_ioctl(c0deca68,c0206937,c0d92ca0,c0ded780,c1110a80) at soo_ioctl+0x29d
> ioctl(c1110a80,c5e20d04,3,c,246) at ioctl+0xfa
> syscall(3b,3b,3b,8057d40,0) at syscall+0x110
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (54, FreeBSD ELF32, ioctl), eip = 0x2813f71f, esp = 0xbfbfe4cc, ebp = 0xbfbfe4e8 ---
> KDB: enter: panic
> [thread pid 1615 tid 100068 ]
> Stopped at      kdb_enter+0x2c: leal    0(%esi),%esi
> 
> 
> I'll keep testing but things look nicer despite these panics (as they only
> happen during non routine things).

The same mbuf problem appears to still be around.

	Sam