From: Sam Leffler
Organization: FreeBSD Project
Date: Thu, 01 May 2008 13:25:44 -0700
To: Fabian Keil
Cc: freebsd-current@freebsd.org
Subject: Re: Connection problems with wme enabled
Message-ID: <481A2748.7080801@freebsd.org>
In-Reply-To: <20080501201135.4ff07fb4@fabiankeil.de>
List-Id: Discussions about the use of FreeBSD-current

Fabian Keil wrote:
> Sam Leffler wrote:
>
>> Fabian Keil wrote:
>>
>>> Sam Leffler wrote:
>>>
>>> dmesg doesn't show any
>>> relevant messages, even when booted in verbose mode.
>>>
>>> The ifconfig output looks normal (to me) as well:
>>>
>>> fk@TP51 ~ $sudo ifconfig -v wlan0
>>> wlan0: flags=8843 metric 0 mtu 1500
>>>     ether 00:0e:...
>>>     inet 192.168.0.49 netmask 0xffffff00 broadcast 192.168.0.255
>>>     media: IEEE 802.11 Wireless Ethernet OFDM/54Mbps mode 11g
>>>     status: associated
>>>     ssid ... channel 7 (2442 Mhz 11g) bssid 00:14:...
>>>     regdomain DEBUG country DE anywhere -ecm authmode OPEN -wps -tsn
>>>     privacy ON deftxkey 1
>>>     wepkey 1:104-bit powersavemode OFF powersavesleep 100 txpower 30
>>>     txpowmax 50.0 -dotd rtsthreshold 2346 fragthreshold 2346 bmiss 24
>>>     11b ucast NONE mgmt 1 Mb/s mcast 1 Mb/s maxretry 6
>>>     11g ucast NONE mgmt 1 Mb/s mcast 1 Mb/s maxretry 6
>>>     11na ucast NONE mgmt 0 MCS mcast 0 MCS maxretry 6
>>>     11ng ucast NONE mgmt 0 MCS mcast 0 MCS maxretry 6
>>>     scanvalid 60 -bgscan bgscanintvl 300 bgscanidle 250
>>>     roam:11b rssi 7dBm rate 1 Mb/s
>>>     roam:11g rssi 7dBm rate 5 Mb/s -pureg protmode CTS -ht
>>>     -htcompat -ampdu ampdulimit 8k ampdudensity - -amsdu -shortgi
>>>     htprotmode RTSCTS -puren wme -burst -ff -dturbo -dwds roaming AUTO
>>>     bintval 100
>>>     AC_BE cwmin 4 cwmax 10 aifs 3 txopLimit 0 -acm ack
>>>           cwmin 4 cwmax 10 aifs 3 txopLimit 0 -acm
>>>     AC_BK cwmin 4 cwmax 10 aifs 7 txopLimit 0 -acm ack
>>>           cwmin 4 cwmax 10 aifs 7 txopLimit 0 -acm
>>>     AC_VI cwmin 3 cwmax 4 aifs 2 txopLimit 94 -acm ack
>>>           cwmin 3 cwmax 4 aifs 2 txopLimit 94 -acm
>>>     AC_VO cwmin 2 cwmax 3 aifs 2 txopLimit 47 -acm ack
>>>           cwmin 2 cwmax 3 aifs 2 txopLimit 47 -acm
>>>     groups: wlan
>>>
>>> While it shows association, open connections stall and
>>> I can't create new ones until reviving the device with
>>> ifconfig wlan0 down up.
>>>
>>> Under load (100K download rate) and with wme enabled
>>> the problem occurs after less than 5 seconds, if there's
>>> less load, it'll work a bit longer.
>>>
>>> wlanstats while the device is unresponsive:
>>>
>>> fk@TP51 ~ $wlanstats
>>>      1 rx from wrong bssid
>>>   4756 rx discard 'cuz dup
>>>     33 rx discard 'cuz mcast echo
>>>      6 rx discard mgt frames
>>>    471 rx beacon frames
>>>      6 rx element unknown
>>>    390 rx frame chan mismatch
>>>      8 rx disassociation
>>>      8 beacon miss events handled
>>>     23 rx discard 'cuz port unauthorized
>>>     25 active scans started
>>> 123844 wep crypto done in s/w
>>>    934 rx management frames
>>>     24 tx failed 'cuz vap not in RUN state
>>>    165 total data frames received
>>>    160 unicast data frames received
>>>      5 multicast data frames received
>>>    355 total data frames transmit
>>>    355 unicast data frames sent
>>>    54M current transmit rate
>>>     42 current rssi
>>>     42 current signal (dBm)
>>>
>> "8 beacon miss events handled"--so the firmware said you lost signal.
>>
>>> While the number of "chan mismatch" seems high,
>>> I get the impression that it only increases while
>>> the device is getting down and up. It doesn't seem
>>> to increase while the device is working or hanging.
>>>
>>> wlanstats a bit later with wme disabled and wlan0 working:
>>>
>>> fk@TP51 ~ $wlanstats
>>>      1 rx from wrong bssid
>>>   4891 rx discard 'cuz dup
>>>     33 rx discard 'cuz mcast echo
>>>      6 rx discard mgt frames
>>>    519 rx beacon frames
>>>      6 rx element unknown
>>>    453 rx frame chan mismatch
>>>      8 rx disassociation
>>>      8 beacon miss events handled
>>>     23 rx discard 'cuz port unauthorized
>>>     27 active scans started
>>> 130514 wep crypto done in s/w
>>>   1048 rx management frames
>>>     25 tx failed 'cuz vap not in RUN state
>>>   3318 total data frames received
>>>   3318 unicast data frames received
>>>   2829 total data frames transmit
>>>   2829 unicast data frames sent
>>>    36M current transmit rate
>>>     42 current rssi
>>>     42 current signal (dBm)
>>>
>> wlanstats 1 gives you a rolling display every second; that's usually
>> more helpful in understanding what's happening.
>> Unfortunately there are
>> more stats than can fit on a rolling display so sometimes the one(s) you
>> want aren't shown. There is a column fmt mechanism a la ps to control
>> output but it's not well developed (someone please take and improve).
>> Also some stats are maintained by drivers and not yet counted in the
>> net80211 layer (again, folks are welcome to help).
>>
>
> While working:
>
> fk@TP51 ~ $wlanstats 1
>    input   2short rx_ucast    bvers    wrbss    rxdup    mecho    wrdir
>       14        0       14        0        1    29861       54        0
>       19        0       19        0        0        0        0        0
>        7        0        7        0        0        0        0        0
>       15        0       15        0        0        0        0        0
>       14        0       14        0        0        0        0        0
>        4        0        4        0        0        0        0        0
>        1        0        1        0        0        0        0        0
>        1        0        1        0        0        0        0        0
>        1        0        1        0        0        0        0        0
>        3        0        3        0        0        0        0        0
>        2        0        2        0        0        0        0        0
>        2        0        2        0        0        0        0        0
>        2        0        2        0        0        0        0        0
>        3        0        3        0        0        0        0        0
>        2        0        2        0        0        0        0        0
>        0        0        0        0        0        0        0        0
>        3        0        3        0        0        0        0        0
>        2        0        2        0        0        0        0        0
>        0        0        0        0        0        0        0        0
>        1        0        1        0        0        0        0        0
>        1        0        1        0        0        0        0        0
> ^C
>
> While "hanging" ...
>
> fk@TP51 ~ $wlanstats 1
>    input   2short rx_ucast    bvers    wrbss    rxdup    mecho    wrdir
>      882        0      831        0        1    29859       50        0
>        1        0        0        0        0        0        0        0
>        0        0        0        0        0        0        0        0
>        1        0        0        0        0        0        0        0
>        0        0        0        0        0        0        0        0
>        1        0        0        0        0        0        0        0
>        1        0        0        0        0        0        0        0
>        0        0        0        0        0        0        0        0
>        0        0        0        0        0        0        0        0
>        1        0        0        0        0        0        0        0
>        1        0        0        0        0        0        0        0
>        1        0        0        0        0        0        0        0
>        0        0        0        0        0        0        0        0
>        1        0        0        0        0        0        0        0
>        1        0        0        0        0        0        0        0
>        1        0        0        0        0        0        0        0
>        0        0        0        0        0        0        0        0
>        0        0        0        0        0        0        0        0
>        0        0        0        0        0        0        0        0
>        0        0        0        0        0        0        0        0
> ^C
>

Your code is out of date, I just imported some fixes yesterday :)

>>> It's interesting that with wme enabled the hangs
>>> usually occur with the transmit rate at 54, while
>>> it's usually a lot lower with wme disabled and the
>>> device working.
>>>
>
>> iwi does tx rate control in the firmware so unlikely to be related. The
>> more likely issue is the beacon miss handling. The driver should
>> recover and reconnect but it sounds like something didn't work. As a
>> workaround you can try upping the bmiss count to see if this is a
>> problem w/ the radio going deaf for a period of time--something I've
>> seen on older Intel parts.
>>
>
> Increasing bmiss to 250 (or decreasing it to 10)
> doesn't seem to affect the problem.
>

Well, if your beacon interval is 100 TU then the default setting of 24
means you didn't see a beacon frame in 2400 TU (~2.46 seconds), which is
a really long time even if the channel is way busy. The firmware handles
this notification so it could be a firmware issue; if I were
investigating I'd sniff packets to see. I've tested bmiss handling
before (yesterday even) and it worked for me w/ and w/o wme enabled so
not sure what to say. What I have noticed is the firmware sometimes
delivers a slew of beacon miss notifications immediately after
associating to an ap. I have some ideas why this might occur but Intel
wouldn't answer when asked. However if you're seeing bmiss after lots of
traffic has passed then it's unclear what's happening. I tested mostly
with a 2915 card fwiw.

>
>>> There are several access points in my neighbourhood,
>>> mine doesn't always have the strongest signal:
>>>
>>> fk@TP51 ~ $ifconfig wlan0 scan
>>> SSID     BSSID      CHAN  RATE  S:N   INT  CAPS
>>> ...      00:18:...    11  54M   21:0  100  EPS
>>> my ap    00:14:...     7  54M   21:0  100  EPS WME
>>> ...      00:15:...     6  54M   14:0  100  EPB WPA
>>> ...      00:04:...     6  54M   19:0  100  EP WPA WME
>>>
>>> I can't reproduce the problem with ath0.
>>>
>>> I'll be glad to provide further information, just tell me what you need.
>>>
>
>> See above. I ran tests yesterday w/ wme enabled in my environment but
>> the signal was stronger so not an equivalent test. What you need to do
>> is get a log that captures the event of losing connectivity. This must
>> include net80211 logging and probably needs to include some level of
>> driver debugging as the problem is in the driver.
>> Try:
>>
>> wlandebug state+scan+auth+assoc
>>
>
> fk@TP51 ~ $sudo wlandebug state+scan+auth+assoc
> wlandebug: sysctl-get(net.wlan.0.debug): No such file or directory
>
> fk@TP51 ~ $sysctl net.wlan
> net.wlan.addba_maxtries: 3
> net.wlan.addba_backoff: 10000
> net.wlan.addba_timeout: 250
> net.wlan.cac_timeout: 60
> net.wlan.nol_timeout: 1800
> net.wlan.recv_bar: 1
> net.wlan.0.%parent: iwi0
> net.wlan.0.driver_caps: 92307968
> net.wlan.0.bmiss_max: 200 (increased by me, without noticeable effect)
> net.wlan.0.inact_run: 300
> net.wlan.0.inact_probe: 30
> net.wlan.0.inact_auth: 180
> net.wlan.0.inact_init: 30
>
>> sysctl debug.iwi=5
>>
>
> I'm not sure how useful it is without net80211 logging,
> but I uploaded 160K of iwi0 messages at:
>
> http://www.fabiankeil.de/tmp/freebsd/iwi0-messages.txt
>
> During the "hangs" the device seems to be
> sending more often than it does receive.
>
> Fabian
>

Looks like I failed to include IEEE80211_DEBUG in the default kernel
configs; you'll need that to get wlan debug msgs. I'll try to look at
your log later.

    Sam
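
For reference, the beacon-miss arithmetic in the reply above can be
sketched in a few lines of Python. This is only an illustration. It
assumes the timeout is simply the bmiss count times the beacon interval
(as the reply describes) and uses the 802.11 definition of one TU as
1024 microseconds. The name bmiss_timeout_seconds is hypothetical and is
not part of net80211, ifconfig, or any FreeBSD tool.

```python
# One 802.11 time unit (TU) is defined as 1024 microseconds.
TU_SECONDS = 1024e-6


def bmiss_timeout_seconds(bmiss_count: int, beacon_interval_tu: int) -> float:
    """Seconds without a beacon before a beacon-miss event would fire.

    Assumption: the timeout is bmiss_count consecutive missed beacons,
    each one beacon interval (given in TU) apart.
    """
    return bmiss_count * beacon_interval_tu * TU_SECONDS


if __name__ == "__main__":
    # bintval 100 comes from the ifconfig output quoted in the thread;
    # 24 is the default bmiss, 10 and 250 are the values Fabian tried.
    for count in (10, 24, 250):
        tu = count * 100
        secs = bmiss_timeout_seconds(count, 100)
        print(f"bmiss {count:3d}: {tu:5d} TU = {secs:.2f} s")
```

Running it with the three bmiss values mentioned in the thread shows how
far apart the resulting timeouts are, which is why the lack of any
visible effect from changing bmiss points away from a simple
missed-beacon threshold problem.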