From owner-freebsd-stable@FreeBSD.ORG Wed Jun 14 16:32:00 2006
Message-ID: <449039F6.1000703@errno.com>
Date: Wed, 14 Jun 2006 09:31:50 -0700
From: Sam Leffler
To: Reid Linnemann
Cc: freebsd-stable@freebsd.org
References: <448F36CB.6000604@cs.okstate.edu> <448F597C.4010108@errno.com> <44901249.30208@cs.okstate.edu>
In-Reply-To: <44901249.30208@cs.okstate.edu>
Subject: Re: atheros 'device timeout'

Reid Linnemann wrote:
> As was spoken by Sam Leffler on 06/13/06 19:34~
>> Reid Linnemann wrote:
>>> I have a DWL-G520 with an atheros chipset on a 6-stable machine and
>>> I've had a persistent problem for quite a long time now from
>>> 6-CURRENT in 2005 to recent 6-stable that I need to address.
>>>
>>> First, I'll spit out pertinent machine information -
>>> uname -a:
>>> FreeBSD hautlos 6.0-STABLE FreeBSD 6.0-STABLE #3: Sat Mar 11 16:26:58
>>> CST 2006 root@hautlos:/usr/obj/usr/src/sys/HAUTLOS i386
>>>
>>> pciconf for the card:
>>> ath0@pci0:12:0: class=0x020000 card=0x3a131186 chip=0x0013168c
>>> rev=0x01 hdr=0x00
>>> vendor = 'Atheros Communications Inc.'
>>> device = 'AR5212, AR5213 802.11a/b/g Wireless Adapter'
>>> class = network
>>>
>>> ifconfig ath0:
>>> ath0:
>>> flags=28943
>>> mtu 1500
>>> inet6 fe80::211:95ff:fe8d:1379%ath0 prefixlen 64 scopeid 0x3
>>> ether 00:11:95:8d:13:79
>>> media: IEEE 802.11 Wireless Ethernet autoselect mode 11g
>>>
>>> status: associated
>>> ssid deutschland channel 6 bssid 00:11:95:8d:13:79
>>> authmode OPEN privacy OFF txpowmax 36 protmode CTS ssid HIDE
>>> dtimperiod 1 bintval 100
>>>
>>> The ath0 is bridged to a dc0 interface through netgraph.
>>>
>>>
>>> This is the behavior I've noticed; at seemingly random intervals I
>>> will get a device timeout thrown out to dmesg from the ath driver.
>>> Reading the ath manpage, I see that "This should not happen." Most
>>> occurrences do not yield any perceivable change in the connection.
>>> However, sometimes the windows wireless clients I have associated
>>> with this machine will not be able to receive anything over the
>>> wireless link unless they trigger it by sending something over the
>>> radio. Sometimes the windows clients will straight up lose track of
>>> the machine completely and scan for other networks. I've not been
>>> able to replicate the problem from a FreeBSD client.
>>>
>>> I desperately need to troubleshoot the problem, as my wife is getting
>>> frustrated with losing connectivity to my "stupid computer" (which is
>>> the gateway for our network) and I can't cook very well. ;) The
>>> difficulty I'm having is that I don't know where to start to solve
>>> this particular issue, and I've seen no other users with the same
>>> problem.
>>> I'd appreciate any bumps in the correct direction.
>>
>> There is a known issue w/ the buffering of multicast frames for
>> associated stations operating in power save mode. If this is the
>> cause and you can disable power save operation in the clients you can
>> work around the problem.
>>
>> Sam
>
> Thanks Sam, I've disabled power save mode in both wireless windows
> clients and I'll see if the problem lightens up. Also, do you know why
> device timeouts would be spat out by the driver when no stations are
> associated with the AP? The device timeouts persisted after my clients
> were shut down, and no other stations appear to be in the area.

No idea. The problem with buffered mcast frames occurs because the h/w
xmit queue for those frames stops running and blocks the lower priority
queues, causing the watchdog timer to fire (and generate the device
timeout msg). I'm pretty sure this is a race between ath_tx_start and
ath_beacon_proc, but I've not had time to rework the code and test (this
problem does not exist in the Linux version, but it's structured very
differently). If no clients are associated (or none are associated w/
power save enabled) then no frames should be buffered and this problem
should not occur.

To debug you can enable reset msgs in the driver (athdebug reset) and
look to see what h/w q the frame(s) were on when the reset was done.
Note that to do that you must enable ATH_DEBUG.

Sam
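
The failure mode described above (a stalled high-priority h/w queue
blocking the lower-priority ones until the watchdog fires) can be
illustrated with a toy model. This is only a sketch under stated
assumptions, not the ath driver's code: the strict-priority scheduling,
the queue names "cab" and "data", the TxQueue class, the simulate()
function, and the 5-tick timeout are all hypothetical and chosen purely
for illustration.

```python
from collections import deque
from dataclasses import dataclass, field

WATCHDOG_TICKS = 5  # hypothetical timeout; not the driver's real value


@dataclass
class TxQueue:
    """Toy stand-in for one hardware transmit queue (illustrative only)."""
    name: str
    frames: deque = field(default_factory=deque)
    stalled: bool = False  # True models a queue that has stopped running


def simulate(queues, ticks, watchdog_ticks=WATCHDOG_TICKS):
    """Run a strict-priority scheduler over `queues` (highest priority first).

    Each tick, one frame is sent from the highest-priority non-empty queue.
    If that queue is stalled, nothing is sent and every lower-priority queue
    is blocked behind it. The watchdog is re-armed whenever a frame completes
    and counts down while frames are pending without progress.

    Returns (tick, pending_counts) when the watchdog fires, else None.
    """
    timer = watchdog_ticks if any(q.frames for q in queues) else 0
    for tick in range(1, ticks + 1):
        busy = next((q for q in queues if q.frames), None)
        if busy is None:
            timer = 0  # nothing pending, watchdog disarmed
            continue
        if not busy.stalled:
            busy.frames.popleft()  # frame completed: progress was made
            timer = watchdog_ticks if any(q.frames for q in queues) else 0
            continue
        timer -= 1  # head queue is stalled: no progress this tick
        if timer == 0:
            # Analogous to the "device timeout" message: report which
            # queue each pending frame was sitting on.
            return tick, {q.name: len(q.frames) for q in queues}
    return None


# Buffered multicast queue stalls; data frames queue up behind it.
queues = [
    TxQueue("cab", deque(["mcast1", "mcast2"]), stalled=True),
    TxQueue("data", deque(["d1", "d2", "d3"])),
]
print(simulate(queues, ticks=20))  # (5, {'cab': 2, 'data': 3})
```

The per-queue counts returned when the timer expires play the same role
as looking at which h/w q the frames were on at reset time: in this
model, frames left on the high-priority queue point to it as the one
that stopped running.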